# Independent Peer Review: Paper A (v3.8)

**Target Venue:** IEEE Access (Regular Paper)
**Date:** April 21, 2026
**Reviewer:** Gemini CLI (7th Round Independent Review)

---

## 1. Overall Verdict

**Verdict: Accept**

**Rationale:** The authors have systematically and thoroughly addressed the three critical methodological and narrative blind spots identified in the Round-6 review. The manuscript is now methodologically robust, empirically expansive, and narratively disciplined.

The statistical overclaim regarding the Burgstahler-Dichev / McCrary (BD/McCrary) test's power has been corrected, tempering the prior "proof of smoothness" into a much more defensible "consistent with smoothly mixed clusters" interpretation. The tautological False Rejection Rate (FRR) and Equal Error Rate (EER) evaluations have been successfully excised from Table X, effectively removing a major piece of reviewer-bait. Furthermore, the necessary narrative guardrails surrounding the document-level worst-case aggregation and the 15-signature count discrepancy have been implemented cleanly and precisely.

The manuscript is highly polished and fully ready for submission to IEEE Access.

---

## 2. Round-6 Follow-Up Audit

In Round 6, three specific issues were flagged for revision. Below is the audit of their resolution in v3.8.

### A. BD/McCrary Power-Artifact Reframe

**Status: RESOLVED**

The authors have successfully purged the "null proves smoothness" language and accurately reframed the accountant-level BD/McCrary null finding around its limited statistical power.

* **Results IV-D.1:** The text now explicitly states that "at $N = 686$ accountants the BD/McCrary test has limited statistical power, so a non-rejection of the smoothness null does not by itself establish smoothness."
* **Results IV-E:** The analysis correctly notes that the lack of a transition is "consistent with---though, at $N = 686$, not sufficient to affirmatively establish---clustered-but-smoothly-mixed accountant-level aggregates."
* **Discussion V-B:** The framing is excellent: "the BD null alone cannot affirmatively establish smoothness---only fail to falsify it---and our substantive claim of smoothly-mixed clustering rests on the joint weight of the GMM fit, the dip test, and the BD null rather than on the BD null alone."
* **Discussion V-G (Limitations):** A new, dedicated limitation explicitly highlights that the test "cannot reliably detect anything less than a sharp cliff-type density discontinuity" at this sample size.
* **Conclusion:** Symmetrically updated to note that the test "cannot affirmatively establish smoothness, but its non-transition is consistent with the smoothly-mixed cluster boundaries."
* **Appendix A:** Aptly concludes that failure to reject the null "constrains the data only to distributions whose between-cluster transitions are gradual *enough* to escape the test's sensitivity at that sample size."

The rewrite is exceptionally clean; it does not feel awkward or bolted-on. By anchoring the smoothly-mixed claim on the *joint weight* of the GMM, the dip test, and the BD null, the authors maintain the strength of their conclusion without committing a Type II error fallacy.

### B. Table X EER/FRR Removal

**Status: RESOLVED**

The tautological presentation of FRR against the byte-identical positive anchor has been entirely resolved.

* **Table X:** The EER row and FRR column have been deleted. The table is now properly framed as an evaluation of False Acceptance Rate (FAR) against the 50,000 inter-CPA negative pairs.
* **Table Note:** A clear, unambiguous table note has been added explaining *why* FRR is omitted ("the byte-identical subset has cosine $\approx 1$ by construction, so FRR against that subset is trivially $0$ at every threshold below $1$").
* **Methodology III-K & Results IV-G.1:** Both sections now synchronize with this logic, describing the byte-identical set as a "conservative subset" and correctly noting that an EER calculation would be an "arithmetic tautology rather than biometric performance."

This change significantly hardens the paper. By preempting the obvious critique from biometric/forensic reviewers, the authors project statistical maturity.

### C. Section IV-I Narrative Safeguard & 15-Signature Footnote

**Status: RESOLVED**

Both minor narrative omissions have been addressed exactly as requested.

* **Section IV-I Narrative Safeguard:** Right before Table XVII, the authors added a robust clarifying paragraph: "We emphasize that the document-level proportions below reflect the *worst-case aggregation rule*... Document-level rates therefore bound the share of reports in which *at least one* signature is non-hand-signed rather than the share in which *both* are." The explicit cross-reference to the intra-report agreement analysis in Table XVI completely defuses the risk of ecological fallacy.
* **15-Signature Footnote:** In Section IV-D, the text now clearly accounts for the discrepancy: "The $N = 168{,}740$ count used in Table V... is $15$ signatures smaller than the $168{,}755$ CPA-matched count reported in Table III: these $15$ signatures belong to CPAs with exactly one signature in the entire corpus, for whom no same-CPA pairwise best-match statistic can be computed..." This effectively closes the arithmetic loop.

---

## 3. New Findings in v3.8

The rewrites in v3.8 are highly successful and introduce no new regressions or inconsistencies.
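To make the recurring $N = 686$ power caveat concrete, the following is a minimal, purely illustrative Monte Carlo sketch. All specifics (the step ratio, bin window, rep count, and the simple two-bin z-test) are assumptions chosen for demonstration; they are not the paper's data or its BD/McCrary implementation. The sketch shows how a density-discontinuity check that easily detects a modest density step at large $N$ has little power at $N = 686$:

```python
# Illustrative power check for a simple density-discontinuity test.
# All parameters are invented for demonstration -- NOT the paper's data.
import numpy as np

def simulate_power(n, step_ratio=1.3, window=0.05, reps=5000, seed=0):
    """Monte Carlo power of a two-bin z-test for a density step at a cutoff.

    The true density on [0, 1] steps down by `step_ratio` at 0.5. The test
    compares counts in the narrow bins just left/right of the cutoff; under
    a smooth null the two counts should be roughly equal.
    """
    rng = np.random.default_rng(seed)
    p_left = step_ratio / (1.0 + step_ratio)      # total mass left of cutoff
    p_a = 2.0 * p_left * window                   # P(draw lands just left)
    p_b = 2.0 * (1.0 - p_left) * window           # P(draw lands just right)
    # Independent binomials approximate the multinomial bin counts.
    a = rng.binomial(n, p_a, size=reps)
    b = rng.binomial(n, p_b, size=reps)
    z = np.abs(a - b) / np.sqrt(np.maximum(a + b, 1))
    return float(np.mean(z > 1.96))               # rejection rate at alpha=0.05

print(f"power at N=686:   {simulate_power(686):.2f}")    # low: null rarely rejected
print(f"power at N=20000: {simulate_power(20000):.2f}")  # near 1 for the same step
```

Under these assumed settings the same 1.3x density step that is detected essentially every time at $N = 20{,}000$ is rejected only a small fraction of the time at $N = 686$, which is exactly why a non-rejection at that sample size "cannot affirmatively establish smoothness."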
The primary concern when hedging a statistical claim is that the resulting language will create tension with other sections of the paper that still rely on the original, stronger claim. The authors avoided this trap. By repeatedly stating that the conclusion of "smoothly-mixed clusters" rests on the *convergence* of the Gaussian Mixture Model (GMM) fit, the Hartigan dip test, and the BD/McCrary null, rather than on the BD/McCrary null alone, the authors keep the paper's thesis intact and fully supported.

The only minor artifact of the rewrite is a slight repetitiveness in the "$N = 686$ limited power" caveat, which appears in IV-D.1, IV-E, V-B, V-G, the Conclusion, and Appendix A. In the context of academic publishing, however, where reviewers frequently read sections non-linearly, this repetition is a feature, not a bug: it ensures the caveat is encountered regardless of how a reader approaches the text. The BD/McCrary claim is now appropriately calibrated, contributing diagnostic value without being overburdened.

---

## 4. Final Submission Readiness

**v3.8 is fully submission-ready.** The manuscript requires no further revisions; a v3.9 is not warranted.

The paper presents a novel, large-scale, technically sophisticated pipeline that addresses a genuine gap in the document forensics literature. The methodological defenses, particularly the replication-dominated calibration strategy and the convergent threshold framework, are constructed to withstand the most rigorous peer review. The authors should proceed to submit to IEEE Access immediately.
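As a closing illustration of the Table X point audited in Section 2.B: the arithmetic tautology behind the removed FRR column takes only a few lines to reproduce. The embedding here is hypothetical (a random 128-dimensional vector, not the paper's pipeline); the point is that a byte-identical pair embeds to the identical vector, so its cosine similarity is 1 by construction and FRR over that subset is 0 at every accept-threshold below 1:

```python
# Hypothetical illustration of the FRR tautology: byte-identical signature
# images embed to the identical feature vector, so every "genuine" pair in
# that subset has cosine similarity of exactly 1 by construction.
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(42)
embedding = rng.normal(size=128)   # one hypothetical signature embedding
# Byte-identical pairs: both sides are the same vector.
pairs = [(embedding, embedding.copy()) for _ in range(1000)]

for threshold in (0.5, 0.85, 0.99):   # any accept-threshold below 1
    # A "false rejection" would be a genuine pair scoring below the threshold.
    frr = sum(cosine(u, v) < threshold for u, v in pairs) / len(pairs)
    print(threshold, frr)             # FRR is trivially 0 in every case
```

Because the similarity is pinned at 1 regardless of the embedding model, sweeping the threshold yields no FRR/FAR trade-off curve over this subset, so an EER computed against it measures nothing about biometric performance, which is precisely the tautology the authors removed.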