Paper A v3.7: demote BD/McCrary to density-smoothness diagnostic; add Appendix A

Implements codex gpt-5.4 recommendation (paper/codex_bd_mccrary_opinion.md, "option (c) hybrid"): demote BD/McCrary in the main text from a co-equal threshold estimator to a density-smoothness diagnostic, and add a bin-width sensitivity appendix as an audit trail.

Why: the bin-width sweep (Script 25) confirms that at the signature level the BD transition drifts monotonically with bin width (Firm A cosine: 0.987 -> 0.985 -> 0.980 -> 0.975 as bin width widens 0.003 -> 0.015; full-sample dHash transitions drift from 2 to 10 to 9 across bin widths 1 / 2 / 3) and Z statistics inflate superlinearly with bin width, both characteristic of a histogram-resolution artifact. At the accountant level the BD null is robust across the sweep. The paper's earlier "three methodologically distinct estimators" framing therefore could not be defended to an IEEE Access reviewer once the sweep was run.

Added
- signature_analysis/25_bd_mccrary_sensitivity.py: bin-width sweep across 6 variants (Firm A / full-sample / accountant-level, each cosine + dHash_indep) and 3-4 bin widths per variant. Reports Z_below, Z_above, p-values, and number of significant transitions per cell. Writes reports/bd_sensitivity/bd_sensitivity.{json,md}.
- paper/paper_a_appendix_v3.md: new "Appendix A. BD/McCrary Bin-Width Sensitivity" with Table A.I (all 20 sensitivity cells) and interpretation linking the empirical pattern to the main-text framing decision.
- export_v3.py: appendix inserted into SECTIONS between conclusion and references.
- paper/codex_bd_mccrary_opinion.md: codex gpt-5.4 recommendation captured verbatim for audit trail.

Main-text reframing
- Abstract: "three methodologically distinct estimators" -> "two estimators plus a Burgstahler-Dichev/McCrary density-smoothness diagnostic". Trimmed to 243 words.
- Introduction: related-work summary, pipeline step 5, accountant-level convergence sentence, contribution 4, and section-outline line all updated. Contribution 4 renamed to "Convergent threshold framework with a smoothness diagnostic".
- Methodology III-I: section renamed to "Convergent Threshold Determination with a Density-Smoothness Diagnostic". "Method 2: BD/McCrary Discontinuity" converted to "Density-Smoothness Diagnostic" in a new subsection; Method 3 (Beta mixture) renumbered to Method 2. Subsections 4 and 5 updated to refer to "two threshold estimators" with BD as diagnostic.
- Methodology III-A pipeline overview: "three methodologically distinct statistical methods" -> "two methodologically distinct threshold estimators complemented by a density-smoothness diagnostic".
- Methodology III-L: "three-method analysis" -> "accountant-level threshold analysis (KDE antimode, Beta-2 crossing, logit-Gaussian robustness crossing)".
- Results IV-D.1 heading: "BD/McCrary Discontinuity" -> "BD/McCrary Density-Smoothness Diagnostic". Prose now notes the Appendix-A bin-width instability explicitly.
- Results IV-E: Table VIII restructured to label BD rows "(diagnostic only; bin-unstable)" and "(diagnostic; null across Appendix A)". Summary sentence rewritten to frame BD null as evidence for clustered-but-smoothly-mixed rather than as a convergence failure. Table cosine P5 row corrected from 0.941 to 0.9407 to match III-K.
- Results IV-G.3 and IV-I.2: "three-method convergence/thresholds" -> "accountant-level convergent thresholds" (clarifies the 3 converging estimates are KDE antimode, Beta-2, logit-Gaussian, not KDE/BD/Beta).
- Discussion V-B: "three-method framework" -> "convergent threshold framework".
- Conclusion: "three methodologically distinct methods" -> "two threshold estimators and a density-smoothness diagnostic"; contribution 3 restated; future-work sentence updated.
- Impact Statement (archived): "three methodologically distinct threshold-selection methods" -> "two methodologically distinct threshold estimators plus a density-smoothness diagnostic" so the archived text is internally consistent if reused.

Discussion V-B / V-G already framed BD as a diagnostic in v3.5 (unchanged in this commit). The reframing therefore brings Abstract / Introduction / Methodology / Results / Conclusion into alignment with the Discussion framing that codex had already endorsed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:32:50 +08:00
parent 6946baa096
commit 552b6b80d4
11 changed files with 458 additions and 63 deletions
@@ -0,0 +1,9 @@
+# Codex gpt-5.4 Opinion on BD/McCrary Option (a) vs (b)
+
+(read-only sandbox; full text captured verbatim from codex output log)
+
+**Recommendation: option (b) DEMOTE**, with a preferred hybrid (c) if time permits.
+
+---
+
+For an IEEE Access reviewer, option (b) is the more defensible choice. At this point the paper's own evidence no longer supports BD/McCrary as a co-equal threshold estimator: at the accountant level, which is now the methodologically decisive level for inference, it produces no significant transition at all, while at the signature level the reported cosine transition is 0.985 with very large adjacent Z values, meaning the procedure is finding a sharp local density irregularity inside the non-hand-signed mode rather than a boundary between the two mechanisms the paper is supposed to separate. That is the central problem. If BD remains framed in the Abstract, Introduction, and Section III-I as one of three threshold estimators, the natural reviewer response is not "good triangulation" but "why do the estimators fail to converge around the accountant-level band of roughly 0.976 +/- 0.003?" and the manuscript has no persuasive answer beyond "BD is different." The missing bin-width robustness makes that vulnerability worse, not better: with a fixed 0.005 cosine bin width on a very large sample, the present signature-level transition could reflect a real local feature, a histogram-resolution artifact, or both, and running the sweep now creates asymmetric downside risk because instability would directly weaken Method 2 while stability still would not solve the deeper interpretability problem that the transition sits within, not between, modes. By contrast, option (b) aligns the front half of the paper with what the Discussion already correctly says in Sections V-B and V-G: BD/McCrary is informative here as a density-smoothness diagnostic, not as an independent accountant-level threshold setter. That reframing actually sharpens the paper's substantive claim. 
The coherent story is that accountant-level aggregates are structured enough for KDE and mixture methods to yield convergent thresholds, yet smooth enough that a discontinuity-based method does not identify a sharp density break; this supports "clustered but smoothly mixed" behavior better than the current "three estimators" rhetoric does. A third option the author has not explicitly considered is a hybrid: demote BD in the main text exactly as in option (b), but run a short bin-width sweep and place the results in an appendix or supplement as an audit trail. That would let the authors say, in one sentence, either that the signature-level transition is not robust to binning or that it is bin-stable but still diagnostically located at 0.985 and therefore not used as the accountant-level threshold. In my view that hybrid is the strongest version if time permits; but if the choice is strictly between (a) and (b), I would recommend (b) without hesitation.
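
The short bin-width sweep proposed in the opinion above could be sketched roughly as follows. This is a minimal illustration, not the contents of signature_analysis/25_bd_mccrary_sensitivity.py: the function names `bd_z_scores` and `sweep`, the 1.96 cutoff, and the per-bin flagging rule are all assumptions made for the example. It uses the standard Burgstahler-Dichev standardized difference, which compares each interior histogram bin with the mean of its two neighbours.

```python
import numpy as np

def bd_z_scores(counts):
    """Burgstahler-Dichev standardized difference per histogram bin.

    Edge bins have only one neighbour and are returned as NaN.
    """
    n = np.asarray(counts, dtype=float)
    total = n.sum()
    p = n / total
    z = np.full(n.shape, np.nan)
    # Expected count of an interior bin: mean of its two neighbours.
    expected = (n[:-2] + n[2:]) / 2.0
    p_nb = p[:-2] + p[2:]
    # Approximate variance of (observed - expected) under a smooth density.
    var = total * p[1:-1] * (1 - p[1:-1]) + 0.25 * total * p_nb * (1 - p_nb)
    with np.errstate(divide="ignore", invalid="ignore"):
        z[1:-1] = (n[1:-1] - expected) / np.sqrt(var)
    return z

def sweep(values, bin_widths, lo, hi, z_crit=1.96):
    """Re-bin `values` at each width and summarise the BD statistics.

    Hypothetical summary: the flagging rule (|Z| > z_crit per bin) is an
    assumption, not the paper's definition of a "significant transition".
    """
    rows = []
    for w in bin_widths:
        n_bins = int(round((hi - lo) / w))
        edges = lo + w * np.arange(n_bins + 1)
        counts, _ = np.histogram(values, bins=edges)
        z = bd_z_scores(counts)
        zi = z[np.isfinite(z)]
        rows.append({
            "bin_width": w,
            "n_bins": n_bins,
            "max_abs_z": float(np.max(np.abs(zi))) if zi.size else float("nan"),
            "n_flagged": int(np.sum(np.abs(zi) > z_crit)),
        })
    return rows
```

Comparing `max_abs_z` and `n_flagged` across rows gives a quick read on whether a flagged location is stable under re-binning or moves with the histogram resolution.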