Paper A v3.10: resolve Opus 4.7 round-9 paper-vs-Appendix-A contradiction
Opus round-9 review (paper/opus_final_review_v3_9.md) dissented from Gemini round-7 Accept and aligned with codex round-8 Minor, but for a DIFFERENT issue all prior reviewers missed: the paper's main text in four locations flatly claimed the BD/McCrary accountant-level null "persists across the Appendix-A bin-width sweep", yet Appendix A Table A.I itself documents a significant accountant-level cosine transition at bin 0.005 with |Z_below|=3.23, |Z_above|=5.18 (both past 1.96) located at cosine 0.980 --- on the upper edge of our two threshold estimators' convergence band [0.973, 0.979]. This is a paper-to-appendix contradiction that a careful reviewer would catch in 30 seconds. BLOCKER B1: BD/McCrary accountant-level claim softened across all four locations to match what Appendix A Table A.I actually reports: - Results IV-D.1 (lines 85-86): rewritten to say the null is not rejected at 2/3 cosine bin widths and 2/3 dHash bin widths, with the one cosine transition at bin 0.005 sitting on the upper edge of the convergence band and the one dHash transition at |Z|=1.96. - Results IV-E Table VIII row (line 145): "no transition / no transition" changed to "0.980 at bin 0.005 only; null at 0.002, 0.010" / "3.0 at bin 1.0 only ( |Z|=1.96); null at 0.2, 0.5". - Results IV-E line 130 (Third finding): "does not produce a significant transition (robust across bin-width sweep)" replaced with "largely null at the accountant level --- no significant transition at 2/3 cosine bin widths and 2/3 dHash bin widths, with the one cosine transition at bin 0.005 sitting at cosine 0.980 on the upper edge of the convergence band". - Results IV-E line 152 (Table VIII synthesis paragraph): matched reframing. - Discussion V-B (line 27): "does not produce a significant transition at the accountant level either" -> "largely null at the accountant level ... with the one cosine transition on the upper edge of the convergence band". - Conclusion (line 16): matched reframing with power caveat retained. MAJOR M1: Related Work L67 stale "well suited to detecting the boundary between two generative mechanisms" framing (residue from pre-demotion drafts) replaced with a local-density-discontinuity diagnostic framing that matches the rest of the paper and flags the signature-level bin-width sensitivity + accountant-level rarity as documented in Appendix A. MAJOR M2: Table XII orphaned in-text anchor --- Table XII is defined inside IV-G.3 but had no in-text "Table XII reports ..." pointer at its presentation location. Added a single sentence before the table comment. MINOR m1: Section IV-I.1 "4 of 30,000+ Firm A documents, 0.01%" replaced with the exact "4 of 30,226 Firm A documents, 0.013%". MINOR m2: Section IV-E "the two-dimensional two-component GMM" wording ambiguity (reader might confuse with the already-selected K*=3 GMM from BIC) replaced with explicit "a separately fit two-component 2D GMM (reported as a cross-check on the 1D accountant-level crossings)". MINOR m3: Section IV-D L59 "downstream all-pairs analyses (Tables XII, XVIII)" misnomer --- Table XII is per-signature classifier output not all-pairs; Table XVIII's all-pairs are over ~16M pairs not 168,740. Replaced with an accurate list: "same-CPA per-signature best-match analyses (Tables V and XII, and the Firm-A per-signature rows of Tables XIII and XVIII)". MINOR m4: Methodology III-H L156 "the validation role is played by ... the held-out Firm A fold" slightly overclaims what the held-out fold establishes (the fold-level rates differ by 1-5 pp with p<0.001). Parenthetical hedge added: "(which confirms the qualitative replication-dominated framing; fold-level rate differences are disclosed in Section IV-G.2)". Also add: - paper/opus_final_review_v3_9.md (Opus 4.7 max-effort review) - paper/gemini_review_v3_8.md (Gemini round-7 Accept verdict, was missing from prior commit) Abstract remains 243 words (under IEEE Access 250 limit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -24,7 +24,7 @@ Replication quality varies continuously with scan equipment, PDF compression, st
|
||||
|
||||
At the per-accountant aggregate level the picture partly reverses.
|
||||
The distribution of per-accountant mean cosine (and mean dHash) rejects unimodality, a BIC-selected three-component Gaussian mixture cleanly separates (C1) a high-replication cluster dominated by Firm A, (C2) a middle band shared by the other Big-4 firms, and (C3) a hand-signed-tendency cluster dominated by smaller domestic firms, and the three 1D threshold methods applied at the accountant level produce mutually consistent estimates (KDE antimode $= 0.973$, Beta-2 crossing $= 0.979$, logit-GMM-2 crossing $= 0.976$).
|
||||
The BD/McCrary test, however, does not produce a significant transition at the accountant level either, in contrast to the signature level.
|
||||
The BD/McCrary test is largely null at the accountant level---no significant transition at two of three cosine bin widths and two of three dHash bin widths, and the one cosine transition (at bin 0.005, location 0.980) sits on the upper edge of the convergence band described above rather than outside it (Appendix A).
|
||||
This pattern is consistent with a clustered *but smoothly mixed* accountant-level distribution rather than with a sharp density discontinuity: accountant-level means cluster into three recognizable groups, yet the test fails to reject the smoothness null at the sample size available ($N = 686$), and the GMM cluster boundaries appear gradual rather than sheer.
|
||||
We caveat this interpretation appropriately in Section V-G: the BD null alone cannot affirmatively establish smoothness---only fail to falsify it---and our substantive claim of smoothly-mixed clustering rests on the joint weight of the GMM fit, the dip test, and the BD null rather than on the BD null alone.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user