Codex independent peer review (paper/codex_review_gpt55_v3_18_1.md) audited
empirical claims against scripts/JSON reports rather than rubber-stamping
prior Accept verdicts. Verdict: Minor Revision. This commit addresses every
flagged item.
- Soften mechanism-identification language (Results IV-D.1, Discussion B):
per-signature cosine "fails to reject unimodality" rather than "reflects a
single dominant generative mechanism"; framing tied to joint evidence.
- Replace overabsolute "single stored image" with multi-template phrasing
in Introduction and Methodology III-A.
- Reframe Methodology III-H so practitioner knowledge is non-load-bearing;
evidentiary basis is the paper's own image evidence.
- Fix stale section cross-references after the v3.18 retitling: IV-F.* ->
IV-G.* in 11 locations across methodology and results.
- Fix 0.941 / 0.945 / 0.9407 wording in Methodology III-K to use the
calibration-fold P5 = 0.9407 and the rounded sensitivity cut 0.945.
- Soften "sharp discontinuity" in Results IV-G.3 to "23-28 percentage-point
gap consistent with firm-wide non-hand-signing practice".
- Soften Conclusion's "directly generalizable" with explicit conditions on
analogous anchors and artifact-generation physics.
- Add Appendix B: table-to-script provenance map (15 manuscript tables
mapped to generating scripts and JSON report artifacts).
- New script signature_analysis/28_byte_identity_decomposition.py produces
reproducible artifacts for two previously-unverified claims:
(a) 145 / 50 / 180 / 35 Firm A byte-identity decomposition (verified);
(b) cross-firm dual-descriptor convergence -- corrected from the previous
manuscript text "non-Firm-A 11.3% vs Firm A 58.7% (5x)" to the
database-verified "non-Firm-A 42.12% vs Firm A 88.32% (~2.1x)".
- Clarify scripts 19 / 21 docstrings: legacy EER / FRR / Precision / F1
helpers are retained for diagnostic use only and are NOT cited as
biometric performance in the paper. Remove "interview evidence" wording.
- Rebuild Paper_A_IEEE_Access_Draft_v3.docx.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6.7 KiB
Appendix A. BD/McCrary Bin-Width Sensitivity (Signature Level)
The main text (Section III-I, Section IV-D.2) treats the Burgstahler-Dichev / McCrary discontinuity procedure [38], [39] as a density-smoothness diagnostic rather than as a threshold estimator.
This appendix documents the empirical basis for that framing by sweeping the bin width across four (variant, bin-width) panels: Firm A and full-sample, each in the cosine and \text{dHash}_\text{indep} direction.
Two patterns are visible in Table A.I.
First, the procedure consistently identifies a "transition" under every bin width, but the location of that transition drifts monotonically with bin width (Firm A cosine: 0.987 → 0.985 → 0.980 → 0.975 as bin width grows from 0.003 to 0.015; full-sample dHash: 2 → 10 → 9 as the bin width grows from 1 to 3).
The Z statistics also inflate superlinearly with the bin width (Firm A cosine |Z| rises from \sim 9 at bin 0.003 to \sim 106 at bin 0.015) because wider bins aggregate more mass per bin and therefore shrink the per-bin standard error on a very large sample.
Both features are characteristic of a histogram-resolution artifact rather than of a genuine density discontinuity.
Second, the candidate transitions all locate inside the non-hand-signed mode (cosine \geq 0.975, dHash \leq 10) rather than between modes, which is the location pattern we would expect of a clean two-mechanism boundary.
Taken together, Table A.I shows that the signature-level BD/McCrary transitions are not a threshold in the usual sense---they are histogram-resolution-dependent local density anomalies located inside the non-hand-signed mode rather than between modes. This observation supports the main-text decision to use BD/McCrary as a density-smoothness diagnostic rather than as a threshold estimator and reinforces the joint reading of Section IV-D that per-signature similarity does not form a clean two-mechanism mixture.
Raw per-bin Z sequences and $p$-values for every (variant, bin-width) panel are available in the supplementary materials.
Appendix B. Table-to-Script Provenance
For reproducibility, the following table maps each numerical table in Section IV to the analysis script that produces its underlying values and to the JSON / Markdown report file emitted by that script. Scripts referenced are under signature_analysis/ and reports under the project's reports/ tree.
The table-to-script mapping above is intended as a navigation aid for replicators. All scripts run deterministically under the fixed random seeds documented in the supplementary materials; report files are committed alongside the scripts so that each numerical claim in Section IV traces to a specific JSON field rather than to an undocumented intermediate computation.