pdf_signature_extraction/paper
gbanyan 3c7fcc010f Paper A v4.1: BCD-baseline reframe + screening positioning + trim
- Re-anchor inter-CPA coincidence-rate (ICCR) calibration on a normative
  non-Firm-A baseline (Firms B/C/D); Firm A held out as an out-of-sample
  target. Locked canonical numbers (codex-audited; Scripts 46/52/53):
  per-comparison HC 0.00014->0.000018, per-signature HC 0.0116, per-document
  HC+MC 0.34->0.1905; KDE crossover 0.837 retained corpus-wide.
- Reposition as an operator-tunable, semi-automated screening/triage framework
  (title -> "Automated Screening..."): HC = high-specificity operating point;
  MC band demoted to low-specificity advisory; Firm A = demonstration that the
  screening surfaces a templated end, audit-quality implications deferred.
- Apply codex prose-review fixes: triage-neutral five-way labels, soften
  mechanism/specificity wording, supersede MC claim-strength, update stale
  Appendix script references (40b/43/45 -> 46/52/53).
- Trim pass: compress Sec. V discussion + Sec. III echoes (27.7k -> 26.8k
  words); no substantive content removed.
- Add analysis scripts 45-53 (firm-year trends; BCD-only ICCR recompute;
  canonical-sampler locked numbers; Firm-A out-of-sample; BCD regression +
  cross-firm hit matrix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 19:35:10 +08:00