pdf_signature_extraction/paper/paper_a_appendix_v3.md

# Appendix A. BD/McCrary Bin-Width Sensitivity (Signature Level)

The main text (Section III-I, Section IV-D.2) treats the Burgstahler-Dichev / McCrary discontinuity procedure [38], [39] as a *density-smoothness diagnostic* rather than as a threshold estimator.
This appendix documents the empirical basis for that framing by sweeping the bin width across four (variant, bin-width) panels: Firm A and full-sample, each in the cosine and $\text{dHash}_\text{indep}$ direction.

<!-- TABLE A.I: BD/McCrary Bin-Width Sensitivity (two-sided alpha = 0.05, |Z| > 1.96)
| Variant | n | Bin width | Best transition | z_below | z_above |
|---------|---|-----------|-----------------|---------|---------|
| Firm A cosine (sig-level)        | 60,448  | 0.003  | 0.9870 | -2.81  | +9.42   |
| Firm A cosine (sig-level)        | 60,448  | 0.005  | 0.9850 | -9.57  | +19.07  |
| Firm A cosine (sig-level)        | 60,448  | 0.010  | 0.9800 | -54.64 | +69.96  |
| Firm A cosine (sig-level)        | 60,448  | 0.015  | 0.9750 | -85.86 | +106.17 |
| Firm A dHash_indep (sig-level)   | 60,448  | 1      | 2.0    | -4.69  | +10.01  |
| Firm A dHash_indep (sig-level)   | 60,448  | 2      | no transition | — | — |
| Firm A dHash_indep (sig-level)   | 60,448  | 3      | no transition | — | — |
| Full-sample cosine (sig-level)   | 168,740 | 0.003  | 0.9870 | -3.21  | +8.17   |
| Full-sample cosine (sig-level)   | 168,740 | 0.005  | 0.9850 | -8.80  | +14.32  |
| Full-sample cosine (sig-level)   | 168,740 | 0.010  | 0.9800 | -29.69 | +44.91  |
| Full-sample cosine (sig-level)   | 168,740 | 0.015  | 0.9450 | -11.35 | +14.85  |
| Full-sample dHash_indep (sig-l.) | 168,740 | 1      | 2.0    | -6.22  | +4.89   |
| Full-sample dHash_indep (sig-l.) | 168,740 | 2      | 10.0   | -7.35  | +3.83   |
| Full-sample dHash_indep (sig-l.) | 168,740 | 3      | 9.0    | -11.05 | +45.39  |
-->

Two patterns are visible in Table A.I.
First, the procedure consistently identifies a "transition" under every bin width, but the *location* of that transition drifts monotonically with bin width (Firm A cosine: 0.987 → 0.985 → 0.980 → 0.975 as bin width grows from 0.003 to 0.015; full-sample dHash: 2 → 10 → 9 as the bin width grows from 1 to 3).
The $Z$ statistics also inflate superlinearly with the bin width (Firm A cosine $|Z|$ rises from $\sim 9$ at bin 0.003 to $\sim 106$ at bin 0.015) because wider bins aggregate more mass per bin and therefore shrink the per-bin standard error on a very large sample.
Both features are characteristic of a histogram-resolution artifact rather than of a genuine density discontinuity.

Second, the candidate transitions all locate *inside* the high-similarity region (cosine $\geq 0.975$, dHash $\leq 10$) rather than at a between-mode boundary, which is the location pattern we would expect of a clean within-population antimode.

Taken together, Table A.I shows that the signature-level BD/McCrary transitions are not a threshold in the usual sense---they are histogram-resolution-dependent local density anomalies located *inside* the non-hand-signed mode rather than between modes.
This observation supports the main-text decision to use BD/McCrary as a density-smoothness diagnostic rather than as a threshold estimator and reinforces the joint reading of Section IV-D that the descriptor distributions do not contain a within-population bimodal antimode that could anchor an operational threshold.

Raw per-bin $Z$ sequences and $p$-values for every (variant, bin-width) panel are available in the supplementary materials.

# Appendix B. Reproducibility Materials

The full table-to-script provenance mapping, script source code, and report artefacts for every numerical table and figure in this paper are provided in the supplementary materials. Scripts run deterministically under fixed random seeds documented there; reviewer reproduction should re-emit artefacts from the listed scripts rather than rely on any local path layout.