Paper A v3.14: remove A2 assumption + soften all partner-level claims

The within-auditor-year uniformity assumption (A2) introduced in v3.11
Section III-G was empirically tested via a new within-year uniformity
check (signature_analysis/27_within_year_uniformity.py; output in
reports/within_year_uniformity/). The check found that, even at the
calibration firm, within-year pairwise cosine distributions show
substantial heterogeneity inconsistent with strict single-mechanism
uniformity (Firm A 2023 CPAs typically have a median pairwise cosine
around 0.85, with 20-70% of pairs below the all-pairs KDE crossover
of 0.837). A2 as stated ("a CPA who replicates any signature image in
that year is treated as doing so for every report") is therefore
empirically falsified.
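
For readers without access to script 27, the two statistics quoted
above (median pairwise cosine and the share of pairs below the
crossover) can be sketched roughly as follows. This is not the code in
27_within_year_uniformity.py; the function names and the
list-of-vectors input layout are assumptions made for illustration,
and only the 0.837 crossover is taken from this message.

```python
from itertools import combinations
from math import sqrt
from statistics import median

KDE_CROSSOVER = 0.837  # all-pairs KDE crossover quoted in this message


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def within_year_summary(features, crossover=KDE_CROSSOVER):
    """Summarize all pairwise cosines among one CPA's signatures in one year.

    `features` is a list of feature vectors (hypothetical input layout).
    Returns (median pairwise cosine, share of pairs below `crossover`),
    or None when fewer than two signatures are available.
    """
    sims = [cosine(u, v) for u, v in combinations(features, 2)]
    if not sims:
        return None
    below = sum(s < crossover for s in sims) / len(sims)
    return median(sims), below


# Toy input: two identical vectors plus one orthogonal vector.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
med, share = within_year_summary(toy)
```

Under strict A2-style uniformity, a replicating CPA-year would be
expected to show a share near zero; the toy input above is
illustrative only and says nothing about the Firm A data.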

Three explanations are compatible with the data and cannot be
disambiguated without manual inspection: (i) true within-year
mechanism mixing, (ii) multi-template replication workflows at the
same firm within a year, (iii) feature-extraction noise on repeatedly
scanned stamped images. Since A2 is falsified and its implications
cannot be restored under any of the three explanations, we remove
A2 entirely rather than downgrading it to an "approximation" or
"interpretive convention."

Changes applied:

1. Methodology Section III-G: A2 block deleted. Section now has only
   A1 (pair-detectability, cross-year pair-existence). Replaced A2
   with an explicit statement that we make no within-year or
   across-year uniformity assumption, that per-signature labels are
   signature-level quantities throughout, and that we abstain from
   partner-level frequency inferences. Three candidate explanations
   for within-year signature heterogeneity are listed (single-template
   replication, multi-template replication in parallel, within-year
   mixing, or combinations) without attempting disaggregation.

2. Methodology III-H strand 2 (L154) softened: "7.5% form a long left
   tail consistent with a minority of hand-signers" rewritten as
   reflecting "within-firm heterogeneity in signing output (we do not
   disaggregate partner-level mechanism here; see Section III-G)."

3. Methodology III-H visual-inspection strand (L152) and the
   corresponding Discussion V-C first strand (L41) and Conclusion L21
   softened: "for the majority of partners" changed to "for many of
   the sampled partners" (Codex round-14 MAJOR: "majority of partners"
   is itself a partner-level frequency claim under the new scope-of-
   claims regime).

4. Methodology III-K.3 Firm A anchor (L247): dropped "(consistent
   with a minority of hand-signers)" parenthetical.

5. Results IV-D cosine distribution narrative (L72): softened to
   "within-firm heterogeneity in signing outputs (see Section IV-E
   and Section III-G for the scope of partner-level claims)."

6. Results IV-E cluster split framing (L128): "minority-hand-signers
   framing of Section III-H" renamed to "within-firm heterogeneity
   framing of Section III-H" (matches the new III-H text).

7. Results IV-H.1 partner-level reading (L286): removed entirely.
   The v3.13 text "Under the within-year label-uniformity convention
   A2, this left-tail share is read as a partner-level minority of
   hand-signing CPAs" is replaced by a signature-level statement
   that explicitly lists hand-signing partners, multi-template
   replication, or a combination as possibilities without attempting
   attribution.

8. Results IV-H.1 stability argument (L308): softened from "persistent
   minority of hand-signing Firm A partners" to "persistent within-
   firm heterogeneity component," preserving the substantive argument
   that stability across production technologies is inconsistent with
   a noise-only explanation.

9. Results IV-I Firm A Capture Profile (L407): rewrote the "Firm A's
   minority hand-signers have not been captured" phrasing as a
   signature-level framing about the 7.5% left tail not projecting
   into the lowest-cosine document-level category under the dual-
   descriptor rules.

10. Abstract (L5): softened "alongside within-firm heterogeneity
    consistent with a minority of hand-signers" to "alongside residual
    within-firm heterogeneity." Abstract at 244/250 words.

11. Discussion V-C third strand (L43): added "multi-template
    replication workflows" to the list of possibilities and added
    a local "we do not disaggregate these mechanisms; see Section
    III-G for the scope of claims" disclaimer (Codex round-14 MINOR 5).

12. Discussion Limitations: added an eighth limitation explicitly
    stating that partner-level frequency inferences are not made and
    why (no within-year uniformity assumption is adopted).

13. Methodology L124 opening: "We make one stipulation about
    within-auditor-year structure" reworded to refer to "same-CPA
    pair detectability," since A1 is a cross-year pair-existence
    property, not a within-year claim (Codex round-14 MINOR 3).

14. Two broken cross-references fixed (Codex round-14 MINOR 6):
    methodology L86 Section V-D -> V-G (Limitations is V-G; V-D is
    Style-Replication Gap); methodology L167 Section III-I ->
    Section IV-D (the empirical cosine distribution is in IV-D, not
    III-I).
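
A scan along the following lines could help confirm that the
partner-level phrasings softened in items 2-10 no longer appear in the
markdown sources. It is a minimal sketch, not part of this commit: the
pattern list is assembled from the phrases quoted above, and the
function name is hypothetical.

```python
import re

# Partner-level phrasings this commit softens (quoted from the list above).
STALE_PATTERNS = [
    r"minority of hand-sign",
    r"minority[- ]hand-signers",
    r"majority of partners",
    r"label-uniformity convention A2",
]


def find_stale_phrases(text, patterns=STALE_PATTERNS):
    """Return (line_number, matched_text) for each stale phrase in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pat in patterns:
            m = re.search(pat, line)
            if m:
                hits.append((lineno, m.group(0)))
    return hits


sample = (
    "for the majority of partners the images are identical\n"
    "for many of the sampled partners the images are identical\n"
)
hits = find_stale_phrases(sample)
```

Matches need manual review rather than automatic rejection: the
revised IV-H.1 text still names "a minority of hand-signing partners"
as one of several unattributed possibilities, which the first pattern
would flag.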

Script 27 and its output (reports/within_year_uniformity/*) remain
in the repository as internal due-diligence evidence but are not
cited from the paper. The paper's substantive claims at signature-
level and accountant (cross-year pooled) level are unchanged; only
the partner-level interpretive overlay is removed. All tables
(IV-XVIII), Appendix A (BD/McCrary sensitivity), and all reported
numbers are unchanged.

Codex round-14 (gpt-5.5 xhigh) verification: Major Revision, driven
by one BLOCKER (stale DOCX artifact, not part of this commit), one
MAJOR ("majority of partners" partner-frequency claim), and four
MINOR findings. All five markdown findings (the MAJOR and the four
MINORs) are addressed in this commit. DOCX regeneration is deferred
to pre-submission packaging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 22:06:22 +08:00
parent ef0e417257
commit d3b63fc0b7
5 changed files with 25 additions and 26 deletions
+6 -5
@@ -69,7 +69,7 @@ The $N = 168{,}740$ count used in Table V and in the downstream same-CPA per-sig
| Per-accountant dHash mean | 686 | 0.0277 | <0.001 | Multimodal |
-->
-Firm A's per-signature cosine distribution is *unimodal* ($p = 0.17$), reflecting a single dominant generative mechanism (non-hand-signing) with a long left tail attributable to within-firm heterogeneity---consistent with a minority of hand-signing Firm A partners---as identified in the accountant-level mixture (Section IV-E).
+Firm A's per-signature cosine distribution is *unimodal* ($p = 0.17$), reflecting a single dominant generative mechanism (non-hand-signing) with a long left tail attributable to within-firm heterogeneity in signing outputs (see Section IV-E for the accountant-level mixture evidence and Section III-G for the scope of partner-level claims).
The all-CPA cosine distribution, which mixes many firms with heterogeneous signing practices, is *multimodal* ($p < 0.001$).
At the per-accountant aggregate level both cosine and dHash means are strongly multimodal, foreshadowing the mixture structure analyzed in Section IV-E.
@@ -125,7 +125,7 @@ Table VII reports the three-component composition, and Fig. 4 visualizes the acc
Three empirical findings stand out.
First, of the 180 CPAs in the Firm A registry, 171 have $\geq 10$ signatures and therefore enter the accountant-level GMM (the remaining 9 have too few signatures for reliable aggregates and are excluded from this analysis only).
Component C1 captures 139 of these 171 Firm A CPAs (81%) in a tight high-cosine / low-dHash cluster; the remaining 32 Firm A CPAs fall into C2.
-This split is consistent with the minority-hand-signers framing of Section III-H and with the unimodal-long-tail observation of Section IV-D.
+This split is consistent with the within-firm heterogeneity framing of Section III-H and with the unimodal-long-tail observation of Section IV-D.
Second, the three-component partition is *not* a firm-identity partition: three of the four Big-4 firms dominate C2 together, and smaller domestic firms cluster into C3.
Third, applying the threshold framework of Section III-I to the accountant-level cosine-mean distribution yields the estimates summarized in the accountant-level rows of Table VIII (below): KDE antimode $= 0.973$, Beta-2 crossing $= 0.979$, and the logit-GMM-2 crossing $= 0.976$ converge within $\sim 0.006$ of each other, while the BD/McCrary density-smoothness diagnostic is largely null at the accountant level---no significant transition at two of three cosine bin widths and two of three dHash bin widths, with the one cosine transition at bin 0.005 sitting at cosine 0.980 on the upper edge of the convergence band (Appendix A).
For completeness we also report the marginal crossings of a *separately fit* two-component 2D GMM (reported as a cross-check on the 1D accountant-level crossings) at cosine $= 0.945$ and dHash $= 8.10$; these differ from the 1D crossings because they are derived from the joint (cosine, dHash) covariance structure rather than from each 1D marginal in isolation.
@@ -283,7 +283,8 @@ Subsection H.3 applies the calibrated classifier and is therefore a consistency
### 1) Year-by-Year Stability of the Firm A Left Tail
Table XIII reports the proportion of Firm A signatures with per-signature best-match cosine below 0.95, disaggregated by fiscal year.
-Under the replication-dominated interpretation (Section III-H) and the within-year label-uniformity convention A2 (Section III-G), this left-tail share is read as a partner-level minority of Firm A CPAs who continue to hand-sign rather than as a bare signature-level rate.
+Under the replication-dominated interpretation (Section III-H), this signature-level left-tail rate reflects within-firm heterogeneity in signing outputs at Firm A.
+Consistent with the scope-of-claims framing in Section III-G, we report the rate as a signature-level quantity without disaggregating the underlying mechanism (which may span a minority of hand-signing partners, multi-template replication workflows within the firm, or a combination); partner-level mechanism attribution is not attempted.
Under the alternative hypothesis that the left tail is an artifact of scan or compression noise, the share should shrink as scanning and PDF-compression technology improved over 2013-2023.
<!-- TABLE XIII: Firm A Per-Year Cosine Distribution
@@ -304,7 +305,7 @@ Under the alternative hypothesis that the left tail is an artifact of scan or co
The left tail is stable at 6-13% throughout the sample period and shows no pre/post-2020 level shift: the 2013-2019 mean left-tail share is 8.26% and the 2020-2023 mean is 6.96%.
The lowest observed share is in 2023 (3.75%), consistent with firm-level electronic signing systems producing more uniform output than earlier manual scanning-and-stamping, not less.
-This stability supports the replication-dominated framing: a persistent minority of hand-signing Firm A partners is consistent with a Beta left tail that is stable across production technologies, whereas a noise-only explanation would predict a shrinking share as technology improved.
+This stability supports the replication-dominated framing: a persistent within-firm heterogeneity component is consistent with a Beta left tail that is stable across production technologies, whereas a noise-only explanation would predict a shrinking share as technology improved.
### 2) Partner-Level Similarity Ranking
@@ -403,7 +404,7 @@ A cosine-only classifier would treat all 71,656 identically; the dual-descriptor
96.9% of Firm A's documents fall into the high- or moderate-confidence non-hand-signed categories, 0.6% into high-style-consistency, and 2.5% into uncertain.
This pattern is consistent with the replication-dominated framing: the large majority is captured by non-hand-signed rules, while the small residual is consistent with the 32/171 middle-band minority identified by the accountant-level mixture (Section IV-E).
-The absence of any meaningful "likely hand-signed" rate (4 of 30,226 Firm A documents, 0.013%; the 30,226 count here is documents with at least one Firm A signer under the 84,386-document classification cohort, which differs from the 30,222 single-firm two-signer subset in Table XVI by 4 reports) implies either that Firm A's minority hand-signers have not been captured in the lowest-cosine tail---for example, because they also exhibit high style consistency---or that their contribution is small enough to be absorbed into the uncertain category at this threshold set.
+The near-zero "likely hand-signed" rate (4 of 30,226 Firm A documents, 0.013%; the 30,226 count here is documents with at least one Firm A signer under the 84,386-document classification cohort, which differs from the 30,222 single-firm two-signer subset in Table XVI by 4 reports) indicates that the within-firm heterogeneity implied by the 7.5% signature-level left tail (Section IV-D) does not project into the lowest-cosine document-level category under the dual-descriptor rules; it is absorbed instead into the uncertain or high-style-consistency categories at this threshold set.
We note that because the non-hand-signed thresholds are themselves calibrated to Firm A's empirical percentiles (Section III-H), these rates are an internal consistency check rather than an external validation; the held-out Firm A validation of Section IV-G.2 is the corresponding external check.
### 2) Cross-Method Agreement