Paper A v3.10: resolve Opus 4.7 round-9 paper-vs-Appendix-A contradiction

Opus round-9 review (paper/opus_final_review_v3_9.md) dissented from Gemini round-7 Accept and aligned with codex round-8 Minor, but for a DIFFERENT issue all prior reviewers missed: the paper's main text in four locations flatly claimed the BD/McCrary accountant-level null "persists across the Appendix-A bin-width sweep", yet Appendix A Table A.I itself documents a significant accountant-level cosine transition at bin 0.005 with |Z_below|=3.23, |Z_above|=5.18 (both past 1.96) located at cosine 0.980 --- on the upper edge of our two threshold estimators' convergence band [0.973, 0.979]. This is a paper-to-appendix contradiction that a careful reviewer would catch in 30 seconds. BLOCKER B1: BD/McCrary accountant-level claim softened across all four locations to match what Appendix A Table A.I actually reports: - Results IV-D.1 (lines 85-86): rewritten to say the null is not rejected at 2/3 cosine bin widths and 2/3 dHash bin widths, with the one cosine transition at bin 0.005 sitting on the upper edge of the convergence band and the one dHash transition at |Z|=1.96. - Results IV-E Table VIII row (line 145): "no transition / no transition" changed to "0.980 at bin 0.005 only; null at 0.002, 0.010" / "3.0 at bin 1.0 only ( |Z|=1.96); null at 0.2, 0.5". - Results IV-E line 130 (Third finding): "does not produce a significant transition (robust across bin-width sweep)" replaced with "largely null at the accountant level --- no significant transition at 2/3 cosine bin widths and 2/3 dHash bin widths, with the one cosine transition at bin 0.005 sitting at cosine 0.980 on the upper edge of the convergence band". - Results IV-E line 152 (Table VIII synthesis paragraph): matched reframing. - Discussion V-B (line 27): "does not produce a significant transition at the accountant level either" -> "largely null at the accountant level ... with the one cosine transition on the upper edge of the convergence band". - Conclusion (line 16): matched reframing with power caveat retained. MAJOR M1: Related Work L67 stale "well suited to detecting the boundary between two generative mechanisms" framing (residue from pre-demotion drafts) replaced with a local-density-discontinuity diagnostic framing that matches the rest of the paper and flags the signature-level bin-width sensitivity + accountant-level rarity as documented in Appendix A. MAJOR M2: Table XII orphaned in-text anchor --- Table XII is defined inside IV-G.3 but had no in-text "Table XII reports ..." pointer at its presentation location. Added a single sentence before the table comment. MINOR m1: Section IV-I.1 "4 of 30,000+ Firm A documents, 0.01%" replaced with the exact "4 of 30,226 Firm A documents, 0.013%". MINOR m2: Section IV-E "the two-dimensional two-component GMM" wording ambiguity (reader might confuse with the already-selected K*=3 GMM from BIC) replaced with explicit "a separately fit two-component 2D GMM (reported as a cross-check on the 1D accountant-level crossings)". MINOR m3: Section IV-D L59 "downstream all-pairs analyses (Tables XII, XVIII)" misnomer --- Table XII is per-signature classifier output not all-pairs; Table XVIII's all-pairs are over ~16M pairs not 168,740. Replaced with an accurate list: "same-CPA per-signature best-match analyses (Tables V and XII, and the Firm-A per-signature rows of Tables XIII and XVIII)". MINOR m4: Methodology III-H L156 "the validation role is played by ... the held-out Firm A fold" slightly overclaims what the held-out fold establishes (the fold-level rates differ by 1-5 pp with p<0.001). Parenthetical hedge added: "(which confirms the qualitative replication-dominated framing; fold-level rate differences are disclosed in Section IV-G.2)". Also add: - paper/opus_final_review_v3_9.md (Opus 4.7 max-effort review) - paper/gemini_review_v3_8.md (Gemini round-7 Accept verdict, was missing from prior commit) Abstract remains 243 words (under IEEE Access 250 limit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:25:04 +08:00
parent 85cfefe49f
commit 615059a2c1
7 changed files with 327 additions and 12 deletions
@@ -0,0 +1,246 @@
+# Paper A v3.9 — Final Independent Peer Review (Opus 4.7)
+
+**Reviewer:** Claude Opus 4.7 (1M context), independent round 9
+**Date:** 2026-04-21
+**Commit reviewed:** 85cfefe
+**Target venue:** IEEE Access (Regular Paper)
+**Prior rounds reviewed:** codex v3.3 / v3.4 / v3.5 / v3.8 (Minor Revision each), Gemini v3.7 (Accept), Gemini v3.8 (Accept), codex v3.8 (Minor Revision)
+
+---
+
+## 1. Overall verdict
+
+**Minor Revision.** I dissent from the Gemini-3.1-Pro round-7 Accept verdict and align with codex round-8's Minor judgment, but for a *different* set of issues that both codex and Gemini missed. The v3.9 edits to Table XV and to the two explicit cross-reference breakages did land cleanly and close codex's round-8 findings. However, in the same revision cycle the paper accumulated an **internally contradicted BD/McCrary accountant-level claim**: multiple locations in the main text (Section IV-D.1, Section IV-E Table VIII note, Section V-B, Conclusion) assert flatly that BD/McCrary "does not produce a significant transition" at the accountant level and that the null "persists across the Appendix-A bin-width sweep," yet Appendix A Table A.I itself documents (i) an accountant-level cosine transition at bin-width 0.005 with $z_{\text{below}}=-3.23$, $z_{\text{above}}=+5.18$ (clearly |Z|>1.96) and (ii) an accountant-level dHash transition at bin-width 1.0 with $z_{\text{below}}=-2.00$, $z_{\text{above}}=+3.24$. Appendix A acknowledges the latter marginally; the main text denies both. The substantive argument of the paper (smoothly-mixed accountant aggregates) is *not* threatened because (a) the transition at bin 0.005 is outside the convergence band anyway and (b) the dHash transition is exactly at the |Z|=1.96 boundary, but the **paper-to-appendix internal contradiction is a reviewer-facing red flag that a competent accountant-statistics reviewer will catch instantly**. This must be fixed before submission. All other issues I found are clean cosmetic/clarity items. The paper is otherwise ready.
+
+---
+
+## 2. v3.8 → v3.9 delta verification
+
+I re-verified both round-8 fixes against their authoritative sources.
+
+**Fix 1: Table XV per-year Firm A baseline-share column.** Verified directly against `reports/partner_ranking/partner_ranking_report.md` (generated 2026-04-21 01:55:27, paper commit same day). All 11 yearly values match exactly: 2013 32.4%, 2014 27.8%, 2015 27.7%, 2016 26.2%, 2017 27.2%, 2018 26.5%, 2019 27.0%, 2020 27.7%, 2021 28.7%, 2022 28.3%, 2023 27.4%. The fix is complete and correct. Codex's numerical-impossibility argument (97/324 floor = 29.9% > prior 26.2%) no longer applies. (results_v3.md lines 331–341)
+
+**Fix 2: Cross-reference corrections.**
+* "Section IV-F" → "Section IV-J" for the ablation study: methodology_v3.md line 87 correctly reads `(Section IV-J)`, and results_v3.md line 412 defines `## J. Ablation Study: Feature Backbone Comparison`. Verified.
+* Table XVIII note "Tables IV/VI" → "Table XIII": results_v3.md lines 429–432 now refer to Table XIII for the best-match mean comparison. Verified.
+
+**No regressions detected in the v3.8→v3.9 edits themselves.** I re-validated the full section/sub-section reference map (III-A…III-M, IV-A…IV-J, IV-D.1/2, IV-G.1/2/3/4, IV-H.1/2/3, IV-I.1/2, V-A…V-G, VI) and every textual `Section X-Y(.Z)` reference resolves to an existing target. All 41 references [1]–[41] are cited in the body.
+
+---
+
+## 3. Numerical audit findings (spot-check against scripts)
+
+I verified 19 numerical claims against authoritative reports under `reports/`. All pass.
+
+| # | Paper claim | Source | Verified |
+|---|-------------|--------|----------|
+| 1 | Table IX whole-Firm-A cos>0.837 = 99.93% (60,408/60,448) | validation_recalibration.json whole_firm_a | ✓ |
+| 2 | Table IX cos>0.9407 = 95.15% (57,518/60,448) | same | ✓ (57518/60448=95.1529%) |
+| 3 | Table IX cos>0.95 = 92.51% (55,922/60,448) | same | ✓ |
+| 4 | Table IX cos>0.973 = 79.45% (48,028/60,448) | same | ✓ |
+| 5 | Table IX dual cos>0.95 AND dh≤8 = 89.95% (54,370/60,448) | same | ✓ |
+| 6 | Table XI calib cos>0.9407 = 94.99%, z=-3.19, p=0.0014 | validation_recalibration.json generalization_tests | ✓ |
+| 7 | Table XI held-out cos>0.9407 = 95.63% (14,662/15,332) | same | ✓ (rate 0.9563) |
+| 8 | Table V Firm A cos dip=0.0019, p=0.169 | dip_test_report.md | ✓ |
+| 9 | Table V Firm A dHash dip=0.1051, p<0.001 | same | ✓ |
+| 10 | Table V all-CPA 168,740 cos dip=0.0035 | same | ✓ |
+| 11 | Table VIII accountant KDE antimode cos=0.973 | accountant_three_methods_report.md | ✓ (0.9726) |
+| 12 | Table VIII accountant Beta-2 cos=0.979 | same | ✓ (0.9788) |
+| 13 | Table VIII accountant logit-GMM cos=0.976 | same | ✓ (0.9759) |
+| 14 | Table VIII accountant 2D-GMM marginal cos=0.945 | same | ✓ (0.9450) |
+| 15 | Table X FAR at 0.837=0.2062, CI [0.2027, 0.2098] | expanded_validation_report.md | ✓ |
+| 16 | Table X FAR at 0.973=0.0003 | same | ✓ |
+| 17 | Table XIV Firm A baseline 27.8% (1287/4629) | partner_ranking_report.md | ✓ |
+| 18 | 3.5× top-10% concentration ratio (95.9/27.8) | arithmetic | ✓ (3.45→3.5×) |
+| 19 | Table XVI Firm A intra-report 89.91% agreement | (26435+734+0+4)/30222 | ✓ (89.91%) |
+
+**Minor numerical imprecision (cosmetic, not blocker).** Results §IV-I.1 says "The absence of any meaningful 'likely hand-signed' rate (4 of 30,000+ Firm A documents, 0.01%) implies…" The true value is 4/30,226 = **0.013%**. Rounding 0.013% to "0.01%" is unusual; "0.013%" or "~0.01%" would be more accurate. (results_v3.md line 404)
+
+**Subtle inconsistency between two scripts (NOT paper's fault, flag-only).** `expanded_validation_report.md` records held-out `cos>0.9407` as k=14,664 (95.64%), while `validation_recalibration.json` records k=14,662 (95.63%). The paper cites the latter (authoritative), so the paper is internally self-consistent. The drift is in the underlying Script 22/24 pair and may be worth reconciling in the reproducibility package (the paper names only Script 24 in its captions, which is correct).
+
+---
+
+## 4. Cross-reference audit findings
+
+I enumerated every `Section X-Y(.Z)` and `Table [roman]` reference in the submission files and checked resolution.
+
+* All 32 distinct section references resolve. No dangling targets.
+* All 18 tables (I–XVIII plus A.I) defined are used at least once **except** Table XII, which is defined (results §IV-G.3) but the only textual mentions of "Table XII" are in the aggregation sentence at results line 59 ("downstream all-pairs analyses (Tables XII, XVIII)"), not at the point where Table XII is first presented.
+  * **Issue (MINOR):** results_v3.md §IV-G.3 (lines 245–268) introduces Table XII as "the Classifier Sensitivity … table" without any in-text `Table XII` numeral reference. A reader looking for the anchor will find it only in the earlier cross-reference at line 59, which is confusing. Add an explicit "Table XII reports …" or "… (Table XII) …" at line 252. This is exactly the sort of orphaned-table issue that IEEE Access copyediting catches.
+
+* **Issue (MINOR clarity — not broken, but misleading):** results_v3.md line 59 characterises Tables XII and XVIII as "downstream all-pairs analyses" that share the 168,740 count. Table XII is the per-signature classifier output (168,740) — not all-pairs — and Table XVIII's all-pairs intra-class stats are over 41.35M all-CPA pairs or 16M Firm-A-only pairs, not 168,740. The 15-signature exclusion described in line 59 does affect the 168,740 signature set (which is the unit in Tables V, XII, and Firm-A rows of XIII), but labelling them "all-pairs analyses" is a misnomer. Recommend: replace "(Tables XII, XVIII)" with "(Tables V, XII, and the Firm-A per-signature statistics of Tables XIII and XVIII)" or simply "(all same-CPA per-signature best-match analyses)".
+
+* Figures 1–4 are referenced; captions are elsewhere in the export pipeline and I did not audit PNG files. No textual figure-reference is broken.
+
+---
+
+## 5. Arithmetic audit findings
+
+I recomputed every `X%`, `k of N`, `k/n` and ratio I could find. Results:
+
+| Claim | Computed | Paper | Status |
+|-------|----------|-------|--------|
+| 182,328 / 86,071 docs avg | 2.118 | — | — |
+| 182,328 / 85,042 with-detections | 2.144 | "2.14 sigs/doc" | ✓ (docs-with-detections denominator) |
+| 85,042 / 86,071 | 98.80% | "98.8%" | ✓ |
+| 168,755 / 182,328 | 92.55% | "92.6%" | ✓ |
+| 85,042 − 84,386 | 656 | "656 documents" | ✓ |
+| 29,529 + 36,994 + 5,133 + 12,683 + 47 | 84,386 | ✓ | ✓ |
+| 29,529 / 84,386 | 35.00% | "35.0%" | ✓ |
+| 22,970 / 30,226 | 75.99% | "76.0%" | ✓ |
+| (22,970+6,311) / 30,226 | 96.87% | "96.9%" | ✓ |
+| 26,435 / 30,222 | 87.47% | "87.5%" | ✓ |
+| (26,435+734+0+4) / 30,222 | 89.91% | "89.91%" | ✓ |
+| 4 / 30,226 | 0.0132% | "0.01%" | **△ should be 0.013%** |
+| 141 + 361 + 184 | 686 | GMM total | ✓ |
+| 0.21 + 0.51 + 0.28 | 1.00 | GMM weights | ✓ |
+| 139 / 171 | 81.3% | "81%" | ✓ |
+| 32 / 171 | 18.7% | "19%" (§V-C) | ✓ |
+| 29,529 / 71,656 | 41.21% | "41.2%" | ✓ |
+| 36,994 / 71,656 | 51.63% | "51.7%" | ✓ |
+| 5,133 / 71,656 | 7.16% | "7.2%" | ✓ |
+| 95.9 / 27.8 | 3.45 | "3.5×" | ✓ |
+| 90.1 / 27.8 | 3.24 | "3.2×" | ✓ |
+| 139+32 = 171; 141-139 | 2 | non-Firm-A in C1 | ✓ |
+| cos>0.95: 92.51%, below: 7.49% | "92.5% / 7.5%" | ✓ | ✓ |
+| Abstract word count | 244 | ≤250 | ✓ |
+
+**One non-blocking integrity note.** Intro line 54: "92.5% of Firm A signatures exceed cosine 0.95 but 7.5% fall below". This is the *whole-sample* Firm A rate (55,922/60,448 = 92.51%). Methodology §III-H line 147 and §V-C line 42 reuse the same 92.5% / 7.5% split. **Consistent** across locations.
+
+---
+
+## 6. Narrative / consistency findings
+
+### 6.1 BD/McCrary accountant-level claim — **main-text vs Appendix A contradiction (MAJOR)**
+
+This is the principal finding of my round. Three locations in the main text state or imply that BD/McCrary produces *no* significant accountant-level transition and that this null persists across the bin-width sweep:
+
+1. **results_v3.md §IV-D.1, lines 85–86:** "At the accountant level the test does not produce a significant transition in either the cosine-mean or the dHash-mean distribution, and this null persists across the Appendix-A bin-width sweep."
+
+2. **results_v3.md §IV-E Table VIII row (line 145):** `| Accountant-level, BD/McCrary transition (diagnostic; null across Appendix A) | no transition | no transition |`
+
+3. **results_v3.md §IV-E line 130, line 152; discussion_v3.md §V-B line 27; conclusion_v3.md line 16:** variants of "BD/McCrary finds no significant transition at the accountant level".
+
+But `reports/bd_sensitivity/bd_sensitivity.md` (and Appendix A Table A.I lines 23–28) actually report:
+
+* Accountant cosine bin 0.005: transition at 0.9800 with $z_{\text{below}}=-3.23$, $z_{\text{above}}=+5.18$ — **both exceed |1.96|, 1 significant transition.**
+* Accountant cosine bin 0.002: no transition; bin 0.010: no transition.
+* Accountant dHash bin 1.0: transition at 3.0 with $z_{\text{below}}=-2.00$, $z_{\text{above}}=+3.24$ — **|Z|=2.00 just above critical, 1 marginal transition.**
+* Accountant dHash bin 0.2: no transition; bin 0.5: no transition.
+
+Appendix A itself (line 36) acknowledges the dHash marginal transition ("the one marginal transition it does produce … sits exactly at the critical value for α = 0.05") but is **silent about the bin-0.005 cosine transition at 0.980**, even though the $|Z|$ values ($-3.23$ / $+5.18$) are well past the 1.96 cutoff and the accountant-level cosine convergence band the paper anchors its primary threshold to is $[0.973, 0.979]$ — i.e., the BD/McCrary transition at 0.980 sits **directly at the upper edge of that convergence band**, not outside it.
+
+**Substantive implication.** The paper's "smoothly-mixed cluster" narrative is not falsified by this — two of three cosine bin widths and two of three dHash bin widths do produce no transition, and one can still argue the pattern is "largely absent." But the paper currently claims something stronger than the data supports, namely that the null is unqualified at the accountant level. A reviewer who reads Appendix A Table A.I against Section IV-D.1 will see the contradiction within 30 seconds.
+
+**Fix.** Either (a) soften the main-text language to "the BD/McCrary accountant-level test rejects the smoothness null in only one of three cosine bin widths and one of three dHash bin widths; the pattern is largely but not uniformly null" (matching Appendix A's own hedging), or (b) additionally note in Appendix A the bin-0.005 cosine transition and explain why it does not disturb the substantive reading (e.g., sits at the band edge, $Z$ inflates with bin width as documented, consistent with a mild histogram-resolution artifact). Option (b) is stronger. **Either way the four locations in §IV-D.1 / Table VIII / §IV-E / §V-B / conclusion must be brought into alignment with Appendix A.**
+
+### 6.2 Related Work line 67 — stale BD/McCrary framing (MINOR)
+
+related_work_v3.md line 67: "The BD/McCrary pairing is well suited to detecting the boundary between two generative mechanisms (non-hand-signed vs. hand-signed) under minimal distributional assumptions."
+
+The rest of the paper (Methodology §III-I.3, Results §IV-D.1, Appendix A) has **demoted** BD/McCrary from a threshold estimator to a density-smoothness diagnostic precisely because it does *not* cleanly detect that boundary (transitions sit inside the non-hand-signed mode, not between modes). Related Work's enthusiastic framing is residue from the v3.6-and-earlier framing and should be softened to something like "BD/McCrary provides a local-density-discontinuity diagnostic that is informative about distributional smoothness under minimal assumptions." This is a related-work-intent question only; the downstream text handles the nuance correctly.
+
+### 6.3 "0.01%" vs "0.013%" (MINOR)
+
+results_v3.md §IV-I.1 line 404: "4 of 30,000+ Firm A documents, 0.01%". True value 0.013%; reviewers who recompute will flag. Replace with "0.013%" or "roughly 0.01%".
+
+### 6.4 No substantive abstract-vs-body contradictions detected
+
+I cross-checked the abstract's quantitative claims (threshold convergence within ∼0.006 at cosine ≈0.975, FAR ≤ 0.001 at accountant-level thresholds, 310 byte-identical positives, ∼50,000-pair inter-CPA negative anchor, 182,328 signatures / 90,282 reports / 758 CPAs / 2013–2023) against the body and all match.
+
+### 6.5 No terminology drift detected
+
+`dHash` / `dHash_indep` / `independent minimum dHash` are defined in §III-G and used consistently; the operational classifier §III-L is explicit that it uses the independent-minimum variant; Tables IX/XI/XII/XVI all use that variant. Previous reviewers correctly flagged this; v3.9 is clean.
+
+---
+
+## 7. Novel issues no prior reviewer caught
+
+Beyond item **6.1 (BD/McCrary main-vs-appendix contradiction)**, which is the primary novel finding, I identified:
+
+### 7.1 Orphaned Table XII first reference
+
+Table XII is defined inside §IV-G.3 (results line 252) but the sub-section opens at line 245 without an in-text `Table XII` reference. The only textual `Table XII` string in the paper is in the line-59 aggregation sentence. A first-reader following the narrative has no numeric pointer to the table at the point of presentation. No prior reviewer flagged this. Fix: insert "Table XII presents the five-way output under each cut." before line 252 `<!-- TABLE XII: ... -->` comment, or similar.
+
+### 7.2 Section IV-E wording ambiguity around "the two-component GMM"
+
+results_v3.md line 131: "For completeness we also report the two-dimensional two-component GMM's marginal crossings at cosine = 0.945 and dHash = 8.10".
+
+This is ambiguous because §IV-E has *already* selected $K^*=3$ on BIC at line 103. The 2-component 2D fit here is an additional, separately-fit 2-comp 2D GMM reported for cross-check only. A reader can reasonably wonder whether this is the same fit at $K=3$ (it is not) or a parallel $K=2$ fit used only for the marginal crossings (it is). Fix: replace "the two-dimensional two-component GMM" with "a separately fit two-component 2D GMM (reported for cross-check of the 1D accountant-level crossings)".
+
+### 7.3 Subtle overclaim in `Methodology §III-H line 156`
+
+methodology_v3.md line 156: "We emphasize that the 92.5% figure is a within-sample consistency check rather than an independent validation of Firm A's status; the validation role is played by the visual inspection, the accountant-level mixture, the three complementary analyses above, and the held-out Firm A fold described in Section III-K."
+
+However, as results §IV-G.2 cautions, the 70/30 held-out fold's operational rules differ between folds by 1–5 pp with $p<0.001$. The held-out fold therefore confirms the *qualitative* replication-dominated framing but does **not** provide clean quantitative validation. Calling it part of "the validation role" is slightly stronger than the results section is willing to say. Fix: replace "held-out Firm A fold" with "held-out Firm A fold (which confirms the qualitative replication-dominated framing; fold-level rate differences are disclosed in Section IV-G.2)".
+
+### 7.4 Abstract's "visual inspection and accountant-level mixture evidence"
+
+abstract_v3.md line 5: "… visual inspection and accountant-level mixture evidence supporting majority non-hand-signing and a minority of hand-signers". This omits the partner-level ranking analysis (§IV-H.2), which is the only **threshold-free** piece of evidence and is the strongest of the four. Including it in the one-sentence evidence summary would sharpen the abstract. Non-blocking: the abstract is already at 244/250 words.
+
+### 7.5 `Section III-I.4` never referenced
+
+methodology_v3.md defines subsections III-I.1 (KDE), III-I.2 (Beta mixture EM), III-I.3 (BD/McCrary), III-I.4 (Convergent Validation), III-I.5 (Accountant-Level Application). Only III-I.3 and III-I.5 are referenced in text. III-I.4's substantive content (level-shift framing) is summarised in §IV-E and §V-B; the standalone subsection could be folded into III-I.5 or III-I.1, or a forward-reference could be added. Non-blocking, but IEEE Access copyediting may flag a subsection with no cross-reference.
+
+### 7.6 BD/McCrary-as-threshold-estimator trace in Conclusion
+
+conclusion_v3.md line 14: "Third, we introduced a convergent threshold framework combining two methodologically distinct estimators … together with a Burgstahler-Dichev / McCrary density-smoothness diagnostic."
+
+This is fine — diagnostic, not estimator — and matches methodology §III-I.3 framing. But it contrasts with introduction_v3.md line 43–44 which still reads "(5) threshold determination using two methodologically distinct estimators … complemented by a Burgstahler-Dichev / McCrary density-smoothness diagnostic …". Self-consistent. I verified there is no stale "three-method threshold" residue. v3.9 is clean on this.
+
+---
+
+## 8. Final recommendation — v3.10 action items
+
+### BLOCKER (must fix before submission)
+
+**B1. BD/McCrary accountant-level claim contradicts Appendix A.** (See §6.1.)
+* File: `paper_a_results_v3.md`, §IV-D.1, lines 85–86.
+* Change: "At the accountant level the test does not produce a significant transition in either the cosine-mean or the dHash-mean distribution, and this null persists across the Appendix-A bin-width sweep."
+* Replace with: "At the accountant level the BD/McCrary null is not rejected in two of three cosine bin widths (0.002 and 0.010) and two of three dHash bin widths (0.2 and 0.5); the one cosine transition (at bin width 0.005) sits at cosine 0.980 — at the upper edge of the convergence band of our two threshold estimators (Section IV-E) — and the one dHash transition (at bin width 1.0) has $|Z|$ at the 1.96 critical value. We read this pattern as *largely* null and report it as consistent with, rather than affirmative proof of, clustered-but-smoothly-mixed accountant-level aggregates (Appendix A)."
+* File: `paper_a_results_v3.md`, §IV-E Table VIII row (line 145). Change `null across Appendix A` to `largely null; 1/3 cos and 1/3 dHash bin widths exhibit a marginal transition (Appendix A)`.
+* File: `paper_a_discussion_v3.md` §V-B line 27 and `paper_a_conclusion_v3.md` line 16 — apply matching softening.
+
+### MAJOR (strongly recommended before submission)
+
+**M1. Related Work BD/McCrary framing stale.** (See §6.2.)
+* File: `paper_a_related_work_v3.md` line 67.
+* Soften "is well suited to detecting the boundary between two generative mechanisms" to "provides a local-density-discontinuity diagnostic that is informative about distributional smoothness".
+
+**M2. Orphaned Table XII first reference.** (See §7.1.)
+* File: `paper_a_results_v3.md` line 252, immediately before the `<!-- TABLE XII: … -->` comment.
+* Insert: "Table XII reports the five-way classifier output under both operational cuts."
+
+### MINOR (nice-to-have)
+
+**m1.** results_v3.md line 404: replace "0.01%" with "0.013%".
+
+**m2.** results_v3.md line 131: replace "the two-dimensional two-component GMM" with "a separately fit two-component 2D GMM (reported for cross-check of the 1D accountant-level crossings)".
+
+**m3.** results_v3.md line 59: replace "(Tables XII, XVIII)" with "(all same-CPA per-signature best-match analyses, including Tables V, XII, and XVIII)" to remove the "all-pairs" misnomer.
+
+**m4.** methodology_v3.md line 156: replace "the held-out Firm A fold described in Section III-K" with "the held-out Firm A fold (which confirms the qualitative replication-dominated framing; fold-level rate differences are disclosed in Section IV-G.2)".
+
+**m5.** abstract_v3.md (optional, non-blocking): consider inserting "the threshold-free partner-ranking analysis," before "and a minority of hand-signers" if word budget allows.
+
+**m6.** methodology_v3.md §III-I.4 never cross-referenced (§7.5). Either add one forward reference or fold into §III-I.1/5. Non-blocking.
+
+### Submission-readiness summary
+
+With **B1** addressed the paper is submission-ready. **M1** and **M2** are strongly recommended but would not by themselves be grounds for rejection. All **m1–m6** items are cosmetic.
+
+### IEEE Access compliance check
+
+* Abstract word count: 244 / 250 ✓
+* Impact statement correctly removed from submission via export_v3.py SECTIONS list ✓
+* Single-anonymized: "Firm A / B / C / D" pseudonyms used consistently, residual identifiability disclosed (methodology §III-M) ✓
+* Reference formatting: IEEE numbered, sequential by first appearance, 41 entries, all cited ✓
+* No author/institution information in v3 section files ✓
+* Figures 1–4 referenced; Table A.I defined in appendix with consistent IEEE prefix ✓
+* Appendix A correctly titled "Appendix A. BD/McCrary Bin-Width Sensitivity" and appears after Conclusion in the assembly order ✓
+
+**Reviewer's bottom line.** The paper is well-crafted, numerically rigorous, and has survived eight prior review rounds. v3.9 closed both codex round-8 items cleanly. The one residual issue I identified (**B1**) is a paper-vs-appendix contradiction that any careful round-10 reviewer will catch. It is fixable in 20 minutes by softening four sentences. After that fix the paper is ready for IEEE Access submission.
+
+---
+
+*End of review.*