b6913d2f93
Structural: - Promote operational classifier definition from §III-L.0 to new §III-H.1, so the reader meets the five-way HC/MC/HSC/UN/LH rule before the §III-I/J/K diagnostic chain instead of ~130 lines after. §III-L renamed to "Anchor-Based Threshold Calibration"; §III-L.0 retains only calibration methodology, three units of analysis, any-pair semantics, and the FAR terminological note. §III-L.7 deleted (redundant with §III-J). - Reorganise §V-H Limitations into Primary / Secondary / Documented features / Engineering groupings (was a flat 14-item list). - Reframe §III-M from "ten-tool unsupervised-validation collection" to "each diagnostic addresses one specific unsupervised failure mode"; rename "What v4.0 does/does not claim" → "Limits / Scope of the present analysis"; retitle Table XXVII. Framing alignment (cross-section): - Strip all v3.x / v4.0 / v3.20 / v4-new / inherited lineage labels from rendered text (Abstract, Intro, §II, §III, §IV, §V, §VI, Appendix, Impact). - Replace "Paper A" rule references with "deployed" rule references. - Soften "validation" to "characterise" / "check" / "screening label" / "consistency check" / "support"; "verdict" → "screening label". - Remove codex-verified spike claims (non-Big-4 jittered dHash, Big-4 pooled cosine after firm-mean centring). Only formally scripted evidence (Scripts 39b–39e) retained; non-Big-4 evidence framed as corroborating raw-axis cosine, not as calibration evidence. - Strip script-provenance parentheticals from Introduction; defer Script 39c internal references and similar to Methodology / Appendix. Numerical / table fixes: - §III-C document-count arithmetic: 12 corrupted → 13 corrupted/unreadable, verified against sqlite DB and total-pdf/ folder counts (90,282 - 4,198 no-sig - 13 corrupted = 86,071 → 85,042 with detections → 182,328 sigs → 168,755 CPA-matched). Table I shows VLM-positive (86,084) and processed-for-extraction (86,071) as separate rows. - Wilson 95% CIs added for joint-rule ICCR rows in Table XXI / methodology table ([0.00011, 0.00018] and [0.00008, 0.00014]). - Unit error fixed: 0.3856 pp / 0.4431 pp → 0.3856 (38.6 pp) / 0.4431 (44.3 pp). Smaller revisions: - Pipeline framing: "detecting" → "screening" in Abstract / Intro / Conclusion for consistency with the unsupervised-screening positioning. - "hard ground-truth subset" → "conservative hard-positive subset" throughout. - §III-F SSIM / pixel-comparison rebuttal compressed from ~15 lines to 4; design-level argument deferred to supplementary materials. - "stakeholders can adopt / can derive thresholds" → "alternative operating points can be characterised by inverting" (less prescriptive). - "the same mechanism extending in milder form to Firms B/C/D" → "similar, milder production-related reuse patterns at Firms B/C/D" (mechanism claim softened). - Appendix A "non-hand-signed mode" / "two-mechanism mixture" lineage language aligned with v4 framing. Appendix B: - Rebuilt as a redirect-only stub. The HTML-commented obsolete table mapping (Table IX–XVIII labels with FAR / capture-rate / validation language) is removed; replaced with a short paragraph pointing to supplementary materials for full table-to-script provenance. Cross-references: - All §III-L references for the rule definition retargeted to §III-H.1; references for calibration still point to §III-L. - §III-H references for byte-level Firm A evidence / non-Big-4 reverse anchor retargeted to §III-H.2. Artefacts: - Combined manuscript regenerated: paper_a_v4_combined.md, 1314 lines (was 1346 pre-review). - Two review handoff documents added: paper/review_handoff_abstract_intro_20260515.md paper/review_handoff_body_20260515.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
40 lines
3.9 KiB
Markdown
40 lines
3.9 KiB
Markdown
# Appendix A. BD/McCrary Bin-Width Sensitivity (Signature Level)
|
|
|
|
The main text (Section III-I, Section IV-D.2) treats the Burgstahler-Dichev / McCrary discontinuity procedure [38], [39] as a *density-smoothness diagnostic* rather than as a threshold estimator.
|
|
This appendix documents the empirical basis for that framing by sweeping the bin width across four (variant, bin-width) panels: Firm A and full-sample, each in the cosine and $\text{dHash}_\text{indep}$ direction.
|
|
|
|
<!-- TABLE A.I: BD/McCrary Bin-Width Sensitivity (two-sided alpha = 0.05, |Z| > 1.96)
|
|
| Variant | n | Bin width | Best transition | z_below | z_above |
|
|
|---------|---|-----------|-----------------|---------|---------|
|
|
| Firm A cosine (sig-level) | 60,448 | 0.003 | 0.9870 | -2.81 | +9.42 |
|
|
| Firm A cosine (sig-level) | 60,448 | 0.005 | 0.9850 | -9.57 | +19.07 |
|
|
| Firm A cosine (sig-level) | 60,448 | 0.010 | 0.9800 | -54.64 | +69.96 |
|
|
| Firm A cosine (sig-level) | 60,448 | 0.015 | 0.9750 | -85.86 | +106.17 |
|
|
| Firm A dHash_indep (sig-level) | 60,448 | 1 | 2.0 | -4.69 | +10.01 |
|
|
| Firm A dHash_indep (sig-level) | 60,448 | 2 | no transition | — | — |
|
|
| Firm A dHash_indep (sig-level) | 60,448 | 3 | no transition | — | — |
|
|
| Full-sample cosine (sig-level) | 168,740 | 0.003 | 0.9870 | -3.21 | +8.17 |
|
|
| Full-sample cosine (sig-level) | 168,740 | 0.005 | 0.9850 | -8.80 | +14.32 |
|
|
| Full-sample cosine (sig-level) | 168,740 | 0.010 | 0.9800 | -29.69 | +44.91 |
|
|
| Full-sample cosine (sig-level) | 168,740 | 0.015 | 0.9450 | -11.35 | +14.85 |
|
|
| Full-sample dHash_indep (sig-l.) | 168,740 | 1 | 2.0 | -6.22 | +4.89 |
|
|
| Full-sample dHash_indep (sig-l.) | 168,740 | 2 | 10.0 | -7.35 | +3.83 |
|
|
| Full-sample dHash_indep (sig-l.) | 168,740 | 3 | 9.0 | -11.05 | +45.39 |
|
|
-->
|
|
|
|
Two patterns are visible in Table A.I.
|
|
First, the procedure consistently identifies a "transition" under every bin width, but the *location* of that transition drifts monotonically with bin width (Firm A cosine: 0.987 → 0.985 → 0.980 → 0.975 as bin width grows from 0.003 to 0.015; full-sample dHash: 2 → 10 → 9 as the bin width grows from 1 to 3).
|
|
The $Z$ statistics also inflate superlinearly with the bin width (Firm A cosine $|Z|$ rises from $\sim 9$ at bin 0.003 to $\sim 106$ at bin 0.015) because wider bins aggregate more mass per bin and therefore shrink the per-bin standard error on a very large sample.
|
|
Both features are characteristic of a histogram-resolution artifact rather than of a genuine density discontinuity.
|
|
|
|
Second, the candidate transitions all locate *inside* the high-similarity region (cosine $\geq 0.975$, dHash $\leq 10$) rather than at a between-mode boundary, which is the location pattern we would expect of a clean within-population antimode.
|
|
|
|
Taken together, Table A.I shows that the signature-level BD/McCrary transitions are not a threshold in the usual sense---they are histogram-resolution-dependent local density anomalies located *inside* the non-hand-signed mode rather than between modes.
|
|
This observation supports the main-text decision to use BD/McCrary as a density-smoothness diagnostic rather than as a threshold estimator and reinforces the joint reading of Section IV-D that the descriptor distributions do not contain a within-population bimodal antimode that could anchor an operational threshold.
|
|
|
|
Raw per-bin $Z$ sequences and $p$-values for every (variant, bin-width) panel are available in the supplementary materials.
|
|
|
|
# Appendix B. Reproducibility Materials
|
|
|
|
The full table-to-script provenance mapping, script source code, and report artefacts for every numerical table and figure in this paper are provided in the supplementary materials. Scripts run deterministically under fixed random seeds documented there; reviewer reproduction should re-emit artefacts from the listed scripts rather than rely on any local path layout.
|