Paper A v3.11: reframe Section III-G unit hierarchy + propagate implications
Rewrites Section III-G (Unit of Analysis and Summary Statistics) after self-review identified three logical issues in v3.10:

1. Ordering inversion: the three units are now ordered signature -> auditor-year -> accountant, with auditor-year as the principled middle unit under within-year assumptions and accountant as a deliberate cross-year pooling.
2. Oversold assumption: the old "within-auditor-year no-mixing identification assumption" is split into A1 (pair-detectability, weak statistical, cross-year scope matching the detector) and A2 (within-year label uniformity, interpretive convention). The arithmetic statistics reported in the paper do not require A2; A2 only underwrites interpretive readings (notably IV-H.1's partner-level "minority of hand-signers" framing).
3. Motivation-assumption mismatch: removed the "longitudinal behaviour of interest" framing and explicitly disclaimed across-year homogeneity. Accountant-level coordinates are now described as a pooled observed tendency rather than a time-invariant regime.

Propagated implications across Introduction, Discussion, and Results: softened "tends to cluster into a dominant regime" and "directly quantifying the minority of hand-signers" to "pooled observed tendency" / "consistent with within-firm heterogeneity"; rewrote the Limitations fifth point (was "treats all signatures from a CPA as a single class"); added a seventh Limitation acknowledging the source-template edge case; added a per-signature best-match cross-year caveat to Section IV-H.2; softened IV-H.2's "direct consequence" to "consistent with"; reframed pixel-identity anchor as pair-level proof of image reuse (with source-template exception) rather than absolute signature-level positive.

Process: self-review (9 findings) -> full-pass fixes -> codex gpt-5.5 xhigh round-10 verification (8 RESOLVED, 1 PARTIAL, 4 MINOR regression findings) -> regression fixes.

No re-computation. All tables (IV-XVIII) and Appendix A numbers unchanged.
Abstract at 248/250 words.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -56,7 +56,7 @@ Adopting the replication-dominated framing---rather than a near-universal framin
A third distinctive feature is our unit-of-analysis treatment.
Our threshold-framework analysis reveals an informative asymmetry between the signature level and the accountant level: per-signature similarity forms a continuous quality spectrum for which no two-mechanism mixture provides a good fit, whereas per-accountant aggregates are clustered into three recognizable groups (BIC-best $K = 3$).
-The substantive reading is that *pixel-level output quality* is a continuous spectrum shaped by firm-specific reproduction technologies and scan conditions, while *accountant-level aggregate behaviour* is clustered but not sharply discrete---a given CPA tends to cluster into a dominant regime (high-replication, middle-band, or hand-signed-tendency), though the boundaries between regimes are smooth rather than discontinuous.
+The substantive reading is that *pixel-level output quality* is a continuous spectrum shaped by firm-specific reproduction technologies and scan conditions, while *accountant-level aggregate behaviour* is clustered but not sharply discrete: each CPA's cross-year-pooled coordinates sit closest to one of three recognizable groups (high-replication, middle-band, or hand-signed-tendency), reflecting a pooled observed tendency rather than a time-invariant regime, with smooth rather than discontinuous boundaries between groups.
At the accountant level, the KDE antimode and the two mixture-based estimators (Beta-2 crossing and its logit-Gaussian robustness counterpart) converge within $\sim 0.006$ on a cosine threshold of approximately $0.975$, while the Burgstahler-Dichev / McCrary density-smoothness diagnostic finds no significant transition---an outcome (robust across a bin-width sweep, Appendix A) consistent with smoothly mixed clusters.
The two-dimensional GMM marginal crossings (cosine $= 0.945$, dHash $= 8.10$) are reported as a complementary cross-check rather than as the primary accountant-level threshold.