Paper A v3.11: reframe Section III-G unit hierarchy + propagate implications

Rewrites Section III-G (Unit of Analysis and Summary Statistics) after self-review identified three logical issues in v3.10: 1. Ordering inversion: the three units are now ordered signature -> auditor-year -> accountant, with auditor-year as the principled middle unit under within-year assumptions and accountant as a deliberate cross-year pooling. 2. Oversold assumption: the old "within-auditor-year no-mixing identification assumption" is split into A1 (pair-detectability, weak statistical, cross-year scope matching the detector) and A2 (within-year label uniformity, interpretive convention). The arithmetic statistics reported in the paper do not require A2; A2 only underwrites interpretive readings (notably IV-H.1's partner- level "minority of hand-signers" framing). 3. Motivation-assumption mismatch: removed the "longitudinal behaviour of interest" framing and explicitly disclaimed across-year homogeneity. Accountant-level coordinates are now described as a pooled observed tendency rather than a time-invariant regime. Propagated implications across Introduction, Discussion, and Results: softened "tends to cluster into a dominant regime" and "directly quantifying the minority of hand-signers" to "pooled observed tendency" / "consistent with within-firm heterogeneity"; rewrote the Limitations fifth point (was "treats all signatures from a CPA as a single class"); added a seventh Limitation acknowledging the source-template edge case; added a per-signature best-match cross-year caveat to Section IV-H.2; softened IV-H.2's "direct consequence" to "consistent with"; reframed pixel-identity anchor as pair-level proof of image reuse (with source-template exception) rather than absolute signature-level positive. Process: self-review (9 findings) -> full-pass fixes -> codex gpt-5.5 xhigh round-10 verification (8 RESOLVED, 1 PARTIAL, 4 MINOR regression findings) -> regression fixes. No re-computation. All tables (IV-XVIII) and Appendix A numbers unchanged. Abstract at 248/250 words. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:52:45 +08:00
parent 615059a2c1
commit d2f8673a67
5 changed files with 41 additions and 22 deletions
@@ -69,7 +69,7 @@ The $N = 168{,}740$ count used in Table V and in the downstream same-CPA per-sig
 | Per-accountant dHash mean | 686 | 0.0277 | <0.001 | Multimodal |
 -->

-Firm A's per-signature cosine distribution is *unimodal* ($p = 0.17$), reflecting a single dominant generative mechanism (non-hand-signing) with a long left tail attributable to the minority of hand-signing Firm A partners identified in the accountant-level mixture (Section IV-E).
+Firm A's per-signature cosine distribution is *unimodal* ($p = 0.17$), reflecting a single dominant generative mechanism (non-hand-signing) with a long left tail attributable to within-firm heterogeneity---consistent with a minority of hand-signing Firm A partners---as identified in the accountant-level mixture (Section IV-E).
 The all-CPA cosine distribution, which mixes many firms with heterogeneous signing practices, is *multimodal* ($p < 0.001$).
 At the per-accountant aggregate level both cosine and dHash means are strongly multimodal, foreshadowing the mixture structure analyzed in Section IV-E.

@@ -184,7 +184,7 @@ We report three validation analyses corresponding to the anchors of Section III-

 ### 1) Pixel-Identity Positive Anchor with Inter-CPA Negative Anchor

-Of the 182,328 extracted signatures, 310 have a same-CPA nearest match that is byte-identical after crop and normalization (pixel-identical-to-closest = 1); these form the gold-positive anchor.
+Of the 182,328 extracted signatures, 310 have a same-CPA nearest match that is byte-identical after crop and normalization (pixel-identical-to-closest = 1); these form the byte-identity positive anchor---a pair-level proof of image reuse that serves as conservative ground truth for non-hand-signed signatures, subject to the source-template edge case discussed in Section V-G.
 As the gold-negative anchor we sample 50,000 random cross-CPA signature pairs (inter-CPA cosine: mean $= 0.762$, $P_{95} = 0.884$, $P_{99} = 0.913$, max $= 0.988$).
 Because the positive and negative anchor populations are constructed from different sampling units (byte-identical same-CPA pairs vs random inter-CPA pairs), their relative prevalence in the combined anchor set is arbitrary, and precision / $F_1$ / recall therefore have no meaningful population interpretation.
 We accordingly report FAR with Wilson 95% confidence intervals against the large inter-CPA negative anchor in Table X.
@@ -314,6 +314,7 @@ We test this prediction directly.
 For each auditor-year (CPA $\times$ fiscal year) with at least 5 signatures we compute the mean best-match cosine similarity across the year's signatures, yielding 4,629 auditor-years across 2013-2023.
 Firm A accounts for 1,287 of these (27.8% baseline share).
 Table XIV reports per-firm occupancy of the top $K\%$ of the ranked distribution.
+The per-signature best-match cosine underlying each auditor-year mean is taken over the full same-CPA pool (Section III-G) and may match against signatures from other fiscal years, so the auditor-year mean reflects the year's signatures' position within the CPA's full-sample similarity structure rather than purely within-year similarity; a within-year-restricted sensitivity replication is a natural robustness check and is left to future work.

 <!-- TABLE XIV: Top-K Similarity Rank Occupancy by Firm (pooled 2013-2023)
 | Top-K | k in bucket | Firm A | Firm B | Firm C | Firm D | Non-Big-4 | Firm A share |
@@ -342,7 +343,7 @@ Year-by-year (Table XV), the top-10% Firm A share ranges from 88.4% (2020) to 10
 | 2023 | 474 | 47 | 46 | 97.9% | 27.4% |
 -->

-This over-representation is a direct consequence of firm-wide non-hand-signing practice and is not derived from any threshold we subsequently calibrate.
+This over-representation is consistent with firm-wide non-hand-signing practice at Firm A and is not derived from any threshold we subsequently calibrate.
 It therefore constitutes genuine cross-firm evidence for Firm A's benchmark status.

 ### 3) Intra-Report Consistency