Implements Partner v3's statistical-rigor requirements at both analysis
units, signature level and accountant level:
- Script 15 (Hartigan dip test): formal unimodality test via `diptest`.
Result: Firm A cosine UNIMODAL (p=0.17, unimodality not rejected; consistent
with a pure non-hand-signed population); full-sample cosine MULTIMODAL
(p<0.001, consistent with a mix of two regimes); accountant-level
aggregates MULTIMODAL on both cosine and dHash.
- Script 16 (Burgstahler-Dichev / McCrary): discretised Z-score transition
detection. Firm A and full-sample cosine transitions at 0.985; dHash
at 2.0.
- Script 17 (Beta mixture EM + logit-GMM): 2- and 3-component Beta mixtures
fitted by EM with a method-of-moments (MoM) M-step, plus a parallel
Gaussian mixture on the logit transform as a White (1982) robustness
check. Beta-3 BIC < Beta-2 BIC at signature level indicates the
2-component model is a forced fit -- supporting the pivot
to accountant-level mixture.
- Script 18 (Accountant-level GMM): rebuilds the 2026-04-16 analysis
that was done inline and not saved. BIC-best K=3 with components
matching prior memory almost exactly: C1 (cos=0.983, dh=2.41, 20%,
Deloitte 139/141), C2 (0.954, 6.99, 51%, KPMG/PwC/EY), C3 (0.928,
11.17, 28%, small firms). 2-component natural thresholds:
cos=0.9450, dh=8.10.
- Script 19 (Pixel-identity validation): no human annotation needed.
Uses pixel_identical_to_closest (310 sigs) as gold positive and
Firm A as anchor positive. Confirms that 92.51% of Firm A signatures have
cosine>0.95 (matches the prior 2026-04-08 finding of 92.5%); the dual rule
cos>0.95 AND dhash_indep<=8 captures 89.95% of Firm A.
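
For reference, the discretised Z-score behind Script 16 can be sketched as
below. This is a minimal illustration of the standardised-difference
statistic as usually stated for Burgstahler-Dichev, not the code in
Script 16: the function name `bd_z_scores` and the plain list-of-counts
input are assumptions, and the script's actual binning and transition
rule are not reproduced here.

```python
import math


def bd_z_scores(counts):
    """Standardised difference for each interior histogram bin.

    counts: bin counts in bin order (hypothetical input format).
    Returns a list of the same length; edge bins, and bins whose
    variance estimate is zero, are None because Z is undefined there.
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("empty histogram")
    p = [c / n for c in counts]
    z = [None] * len(counts)
    for i in range(1, len(counts) - 1):
        # Expected count = mean of the two adjacent bins.
        expected = (counts[i - 1] + counts[i + 1]) / 2
        p_adj = p[i - 1] + p[i + 1]
        # Approximate variance of (observed - expected).
        var = n * p[i] * (1 - p[i]) + 0.25 * n * p_adj * (1 - p_adj)
        if var > 0:
            z[i] = (counts[i] - expected) / math.sqrt(var)
    return z
```

A large |Z| in a bin flags a local excess or deficit relative to its
neighbours; how those values are turned into a single transition point
(e.g. cosine 0.985) is left to the script itself.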
Python deps added: diptest, scikit-learn (installed into venv).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>