Interview evidence from multiple Firm A accountants confirms that MOST
use replication (stamping / firm-level e-signing) but a MINORITY may
still hand-sign. Firm A is therefore a "replication-dominated" population,
not a "pure" one. This framing is consistent with:
- 92.5% of Firm A signatures exceed cosine 0.95 (majority replication)
- The long left tail (~7.5%, the complement of the 92.5% above)
captures the minority hand-signers, not scan noise or preprocessing
artifacts
- Hartigan dip test: Firm A cosine is unimodal with a long left tail
(p=0.17, unimodality not rejected)
- Accountant-level GMM: of 180 Firm A accountants, 139 cluster in C1
(high-replication) and 32 in C2 (middle band = minority hand-signers)
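The majority/minority split above comes down to a share above a cosine
cutoff and its complement. A minimal sketch of that arithmetic, not the
code in Scripts 15-19: `share_above` and the toy values are hypothetical,
and the inputs are assumed to be one cosine similarity per signature.

```python
def share_above(similarities, cutoff=0.95):
    """Return (share above cutoff, share at or below cutoff) as fractions."""
    if not similarities:
        raise ValueError("need at least one similarity value")
    above = sum(1 for s in similarities if s > cutoff)
    head = above / len(similarities)
    return head, 1.0 - head


# Toy values only, chosen for illustration; not project data.
toy = [0.99, 0.98, 0.97, 0.96, 0.90]
head, tail = share_above(toy)
print(f"above cutoff: {head:.1%}, left tail: {tail:.1%}")
```

The comparison is strict (`>`), matching the "exceed cosine 0.95" wording.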
Updates docstrings and report text in Scripts 15, 16, 18, 19 to match.
Partner v3's "near-universal non-hand-signing" language corrected.
Script 19 regenerated with the updated text.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements Partner v3's statistical rigor requirements at both the
signature level and the accountant level of analysis:
- Script 15 (Hartigan dip test): formal unimodality test via `diptest`.
Result: Firm A cosine UNIMODAL (p=0.17, pure non-hand-signed population);
full-sample cosine MULTIMODAL (p<0.001, mix of two regimes);
accountant-level aggregates MULTIMODAL on both cos and dHash.
- Script 16 (Burgstahler-Dichev / McCrary): discretised Z-score transition
detection. Firm A and full-sample cosine transitions at 0.985; dHash
at 2.0.
- Script 17 (Beta mixture EM + logit-GMM): 2/3-component Beta via EM
with MoM M-step, plus parallel Gaussian mixture on logit transform
as White (1982) robustness check. Beta-3 BIC < Beta-2 BIC (lower is
better) at signature level indicates 2-component is a forced fit,
supporting the pivot to accountant-level mixture.
- Script 18 (Accountant-level GMM): rebuilds the 2026-04-16 analysis
that was done inline and not saved. BIC-best K=3 with components
matching prior memory almost exactly: C1 (cos=0.983, dh=2.41, 20%,
Deloitte 139/141), C2 (0.954, 6.99, 51%, KPMG/PwC/EY), C3 (0.928,
11.17, 28%, small firms). 2-component natural thresholds:
cos=0.9450, dh=8.10.
- Script 19 (Pixel-identity validation): no human annotation needed.
Uses pixel_identical_to_closest (310 sigs) as gold positive and
Firm A as anchor positive. Confirms Firm A cosine>0.95 = 92.51%
(matches prior 2026-04-08 finding of 92.5%); the dual rule
cos>0.95 AND dhash_indep<=8 captures 89.95% of Firm A.
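For reference, the standardised-difference statistic behind the
Burgstahler-Dichev check in Script 16 can be sketched as below. This is
an illustration assuming equal-width histogram bins, not the script's
actual code; `bd_z_scores` and the toy counts are hypothetical.

```python
import math


def bd_z_scores(counts):
    """Standardised difference for each interior bin of a histogram.

    The expected count in bin i is the mean of its two neighbours; the
    variance of the difference follows Burgstahler and Dichev (1997).
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("histogram is empty")
    p = [c / n for c in counts]
    z = []
    for i in range(1, len(counts) - 1):
        expected = 0.5 * (counts[i - 1] + counts[i + 1])
        pair = p[i - 1] + p[i + 1]
        var = n * p[i] * (1 - p[i]) + 0.25 * n * pair * (1 - pair)
        z.append((counts[i] - expected) / math.sqrt(var) if var > 0 else 0.0)
    return z


# Toy counts with an excess in the middle bin; not project data.
print(bd_z_scores([10, 12, 40, 14, 11]))
```

A large positive value marks a bin holding more mass than its neighbours
predict; a flat histogram yields zeros throughout.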
Python deps added: diptest, scikit-learn (installed into venv).
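The method-of-moments (MoM) M-step named for Script 17 can be sketched
as a weighted moment match for a single Beta component. This is an
assumption-laden illustration, not the script's code: `beta_mom_mstep`
is a hypothetical name, and the weights stand in for E-step
responsibilities.

```python
def beta_mom_mstep(xs, weights):
    """Weighted method-of-moments update for one Beta component.

    xs are observations in (0, 1); weights are the component's
    responsibilities. Returns the matched (alpha, beta).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("component has no weight")
    mean = sum(w * x for x, w in zip(xs, weights)) / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(xs, weights)) / total
    if var <= 0 or var >= mean * (1 - mean):
        raise ValueError("moments are not matched by any Beta distribution")
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


# Toy data with mean 0.5 and variance 0.05; not project data.
alpha, beta = beta_mom_mstep([0.2, 0.4, 0.6, 0.8], [1, 1, 1, 1])
print(alpha, beta)
```

The moment match avoids the numerical optimisation that a
maximum-likelihood M-step for Beta parameters would need, at some cost
in statistical efficiency.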
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>