Partner v4 (signature_paper_draft_v4) proposed 3 substantive improvements;
the partner also confirmed that the 2013-2019 restriction was an error
(sample stays 2013-2023). The remaining suggestions are adopted using our
own data.
## New scripts
- Script 22 (partner ranking): ranks all Big-4 auditor-years by mean
max-cosine. Firm A occupies 95.9% of the top 10% (base rate 27.8%), a
3.5x concentration ratio. Stable across 2013-2023 (88-100% per year).
- Script 23 (intra-report consistency): for each 2-signer report,
classify both signatures and check whether the labels agree. Firm A
agrees 89.9% vs 62-67% at the other Big-4 firms. 87.5% of Firm A reports
have BOTH signers non-hand-signed; only 4 reports (0.01%) are both
hand-signed.
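
For reviewers, a minimal sketch of the two statistics above. It is not
the code of Script 22 or 23: the record layouts, the function names
`top_share` and `agreement_rate`, and the labels "hand"/"non_hand" are
hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def top_share(records, firm, top_frac=0.10):
    """Share of `firm` among the top-ranked auditor-years.

    records: iterable of (firm, auditor, year, max_cosine), one per
    signature (assumed layout). Auditor-years are ranked by mean
    max-cosine, highest first.
    Returns (share_in_top, base_share, concentration_ratio).
    """
    groups = defaultdict(list)
    for f, auditor, year, cos in records:
        groups[(f, auditor, year)].append(cos)
    if not groups:
        raise ValueError("no records")

    ranked = sorted(groups.items(), key=lambda kv: mean(kv[1]), reverse=True)
    k = max(1, int(len(ranked) * top_frac))

    share_top = sum(1 for key, _ in ranked[:k] if key[0] == firm) / k
    base = sum(1 for key in groups if key[0] == firm) / len(ranked)
    ratio = share_top / base if base > 0 else float("nan")
    return share_top, base, ratio


def agreement_rate(reports, firm):
    """Fraction of a firm's 2-signer reports whose two labels match.

    reports: iterable of (firm, label_signer_1, label_signer_2)
    (assumed layout). Returns None if the firm has no reports.
    """
    pairs = [(a, b) for f, a, b in reports if f == firm]
    if not pairs:
        return None
    return sum(a == b for a, b in pairs) / len(pairs)
```

The concentration ratio here is simply the top-fraction share divided by
the base share, which is how 95.9% against a 27.8% base gives roughly
3.5x.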
## New methodology additions
- III-G: explicit within-auditor-year no-mixing identification
assumption (supported by Firm A interview evidence).
- III-H: 4th Firm A validation line: threshold-independent evidence
from partner ranking + intra-report consistency.
## New results section IV-H (threshold-independent validation)
- IV-H.1: Firm A year-by-year cosine<0.95 rate. 2013-2019 mean=8.26%,
2020-2023 mean=6.96%, 2023 lowest (3.75%). This stability contradicts
the partner's hypothesis that 2020+ electronic systems increase
heterogeneity; the data show the opposite (electronic systems are more
consistent than physical stamping).
- IV-H.2: partner ranking top-K tables (pooled + year-by-year).
- IV-H.3: intra-report consistency per-firm table.
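
A minimal sketch of the year-by-year rate in IV-H.1, under two
assumptions that this message does not confirm: the input is one
(year, max_cosine) pair per Firm A signature, and each period mean is an
unweighted mean of the yearly rates. The names `yearly_below_rate` and
`period_mean` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def yearly_below_rate(signatures, threshold=0.95):
    """Per-year share of signatures with max-cosine below `threshold`.

    signatures: iterable of (year, max_cosine) pairs (assumed layout).
    Returns a dict {year: rate}, with years in ascending order.
    """
    total = defaultdict(int)
    below = defaultdict(int)
    for year, cos in signatures:
        total[year] += 1
        if cos < threshold:
            below[year] += 1
    return {year: below[year] / total[year] for year in sorted(total)}


def period_mean(rates, first_year, last_year):
    """Unweighted mean of yearly rates over [first_year, last_year].

    Assumption: years are weighted equally, not by signature count.
    """
    selected = [r for y, r in rates.items() if first_year <= y <= last_year]
    if not selected:
        raise ValueError("no years in the requested period")
    return mean(selected)
```

Comparing `period_mean(rates, 2013, 2019)` with
`period_mean(rates, 2020, 2023)` would then give the kind of two-period
contrast reported above.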
## Renumbering
- Section H (was Classification Results) -> I
- Section I (was Ablation) -> J
- Tables XIII-XVI new (yearly stability, top-K pooled, top-10% per-year,
intra-report), XVII = classification (was XII), XVIII = ablation
(was XIII).
These threshold-independent analyses address the codex review concern
about circular validation by providing benchmark evidence that does not
depend on any threshold calibrated to Firm A itself.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>