Commit Graph

4 Commits

Author SHA1 Message Date
gbanyan 2a13f0d985 Paper A v13 rev9.1: HC-meaning + same-pair table + interview/framing rebalance, plus typesetting polish
Respond to a second hostile GPT-5.5 reviewer pass on rev9. Four substantive
changes plus accumulated typesetting polish.

Reviewer points addressed:
- HC != reuse (Fatal 1): new Sec III-F "What HC Means and Does Not Mean" states
  plainly that HC denotes an extreme within-accountant repetition pattern that is
  rare between unrelated accountants, not a reuse label; reuse is one
  interpretation, carried at Firm A by byte-identity + context, never implied by
  HC alone; no reuse claim is made for Firms B/C/D.
- Any-pair construction (Fatal 2): new Table VI gives the per-signature HC flag
  rate by firm under the deployed any-pair rule vs the strict same-pair rule
  (cosine and dHash from the same partner). Same-pair lowers all rates but widens
  the firm gap: Firm A 57.3% vs baseline 5-9%, with the Firm A-to-baseline ratio
  rising from 2.4-3.4x to 6.4-10.8x, so the HC region is not an artefact of
  combining extrema from different pairs.
  Reproducible via samepair_hc.py (Hamming on stored dHash vectors).
- Interviews (Fatal 3): Sec III-A now states the interviews are used only to
  contextualize, are corroborative not confirmatory and not independently
  reproducible; their one load-bearing use (Firm A as known-positive benchmark)
  lowers rather than raises the claim. Empirical claims rest on calibration +
  byte-identity, which stand without them.
- Framing (Fatal 4, rebalance not relabel): contribution 3 elevated to the
  methodological core (label-free construction and characterization of an
  operating point), explicitly demonstrated and stress-tested on audit
  signatures "rather than a finished, fully general framework." The audit finding
  is kept as a headline result, not demoted to a mere case study, and no
  general-framework claim is made.
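
The any-pair vs same-pair distinction above can be illustrated with a minimal
sketch. This is not the contents of samepair_hc.py; the function names, the
input layout (one cosine and one dHash distance per candidate partner), and the
use of cosine >= 0.95 with dHash <= 5 as the HC cuts are assumptions made for
illustration only.

```python
import numpy as np

COS_CUT = 0.95   # assumed HC cosine cut (illustrative)
DHASH_CUT = 5    # assumed HC dHash Hamming cut (illustrative)


def hamming(a, b):
    """Hamming distance between two equal-length bit vectors (0/1 arrays)."""
    a = np.asarray(a)
    b = np.asarray(b)
    return int(np.count_nonzero(a != b))


def hc_flags(cos_to_partners, dhash_to_partners):
    """Return (any_pair, same_pair) HC flags for one signature.

    cos_to_partners[i] and dhash_to_partners[i] describe the same partner i.
    any_pair:  best cosine and best dHash may come from different partners.
    same_pair: a single partner must satisfy both cuts at once.
    """
    cos = np.asarray(cos_to_partners, dtype=float)
    dh = np.asarray(dhash_to_partners)
    any_pair = bool(cos.max() >= COS_CUT and dh.min() <= DHASH_CUT)
    same_pair = bool(np.any((cos >= COS_CUT) & (dh <= DHASH_CUT)))
    return any_pair, same_pair


# Partner 0 has high cosine but a distant dHash; partner 1 is the reverse.
# The extrema come from different pairs, so only the any-pair rule fires.
print(hc_flags([0.97, 0.80], [12, 3]))
```

By construction the same-pair flag implies the any-pair flag, which is why
same-pair rates can only be equal to or lower than any-pair rates.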

Typesetting polish (verified by rendering pages to images):
- Unify scientific notation in Table II ([4x10^-6, 2.3x10^-5]).
- Tighten Table II row labels to cut excessive wrapping (3 lines -> 2).
- Fix duplicated figure captions (empty image alt-text so pandoc no longer
  auto-captions on top of the hand-written caption); unify caption punctuation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Qn59FdF9JMyfFg3sjcUNNG
2026-06-23 15:37:13 +08:00
gbanyan cb38d413ad Paper A v13 rev9: sensitivity surface + honesty fixes (GPT-5.5 hostile review)
Pre-emptively address the three residual points from a hostile GPT-5.5
reviewer pass that rev8 had not fully closed (the rest of that review
matched the already-applied fusion revision):

- Sensitivity surface (Major 5): new Figure 6 maps the deployed rule over
  the full (cosine cut x dHash cut) plane - clean-group flag rate and the
  Firm A-minus-B/C/D contrast. Shows no cliff at (0.95, dHash<=5), contrast
  >45pp across a broad region (58pp at 0.97/dHash<=3), and that extending to
  the MC bound (dHash<=15) halves the contrast - so the thresholds are not
  cherry-picked and the weaker MC band is shown, not hidden. Reproducible
  via make_fig6_sensitivity.py (DB columns only).

- Soften "reuse-dominated" (Major 1): the assertion that Firm A "is" a
  reuse-dominated population now reads "behaves in the screen as," explicitly
  resting on interviews + byte-identity rather than per-signature ground
  truth; two other uses made conditional/generic.

- Shared-pipeline contamination of ICCR (Major 2): Sec III-E now names the
  shared within-firm imaging pipeline (scanners, PDF assembly, red-stamp
  removal) as a channel that can lift the inter-CPA rate above true chance,
  distinct from "shared template," supported by the Sec V-B pipeline audit;
  bias direction (higher floor) keeps the Firm-A contrast conservative.
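
A sensitivity surface of the kind described for Figure 6 can be sketched as a
grid sweep over both cuts. This is a hedged illustration, not
make_fig6_sensitivity.py: the function name and inputs are hypothetical, and it
assumes each signature already carries its best cosine and its smallest dHash
distance, with both firm groups non-empty.

```python
import numpy as np


def contrast_surface(cos, dhash, is_firm_a, cos_cuts, dhash_cuts):
    """Firm A minus other-firm flag rate (percentage points) on a cut grid.

    cos, dhash : per-signature best cosine and smallest dHash distance
    is_firm_a  : boolean mask, True for Firm A signatures
    Returns an array of shape (len(cos_cuts), len(dhash_cuts)).
    """
    cos = np.asarray(cos, dtype=float)
    dhash = np.asarray(dhash)
    is_firm_a = np.asarray(is_firm_a, dtype=bool)
    surface = np.empty((len(cos_cuts), len(dhash_cuts)))
    for i, c in enumerate(cos_cuts):
        for j, d in enumerate(dhash_cuts):
            flagged = (cos >= c) & (dhash <= d)
            rate_a = flagged[is_firm_a].mean()
            rate_rest = flagged[~is_firm_a].mean()
            surface[i, j] = 100.0 * (rate_a - rate_rest)
    return surface


# Toy data: two Firm A signatures, two from other firms.
toy = contrast_surface(
    cos=[0.99, 0.96, 0.90, 0.98],
    dhash=[2, 4, 1, 10],
    is_firm_a=[True, True, False, False],
    cos_cuts=[0.95, 0.97],
    dhash_cuts=[3, 5],
)
print(toy)
```

Plotting the returned array as a heat map shows whether the contrast changes
smoothly around the deployed cut or drops off sharply next to it.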

rev9 docx rebuilt (6 figures embedded).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Qn59FdF9JMyfFg3sjcUNNG
2026-06-23 14:59:55 +08:00
gbanyan da455791de Paper A v13 rev8: fusion-review revision (29 items) + verified data analysis
Address all 29 items from the fused reviewer report (Gemini 3.1 Pro +
ChatGPT 5.5 + Opus 4.8): 3 fatal, 4 severe, arbitration A/B, 5 fusion-new,
15 minor. All new numbers computed from signature_analysis.db; nothing
fabricated.

Claim honesty (F1/F3/F4/F7/G3):
- Retract all "139x the floor" comparisons; ICCR -> between-accountant
  specificity proxy throughout; state within-accountant FPR is not
  estimable and ICCR is not even a bound (anti-conservative direction).
- Firm A reframed as quasi-positive known-positive benchmark (not blinded).
- byte-identity recast as prevalence signal, not a recall/sanity check.
- tunable -> single-direction conservativeness dial (no P-R frontier).

New data analysis (verified, bit-reproducible via committed scripts):
- F2/G1 (Sec V-B): 880-PDF imaging-pipeline audit (Table V) - plain scans
  82% (2013) -> 1% (2021); producer strings name scanner hardware
  (Fuji Xerox D125 etc.); substrate transforms at 2020/21 = named confound.
- F5 (Sec IV-C): four robustness checks - pool-size stratification,
  accountant-clustered bootstrap (gap 53.7pp [49.5,57.5]), firm+year FE
  logistic (B/C/D OR 0.06-0.12), leave-one-year-out (gap 53.1-54.9pp).
- byte-identity era split: 30 scan-era (18 Firm A, pipeline-robust) vs
  232 digital-era (detectability-inflated, hedged).
- G5: archive-wide 888 expected chance HC flags [677,1098].
- M4: Figure 3 replaced with real 2D density (n=150,441).
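
The accountant-clustered bootstrap listed under F5 can be sketched as follows.
This is an assumption-laden illustration rather than the committed script: the
function name, the percentile interval, and the choice to resample accountants
separately within Firm A and within the other firms are all hypothetical, and
each accountant is assumed to belong to exactly one firm group.

```python
import numpy as np


def clustered_bootstrap_gap(flag, is_firm_a, accountant, n_boot=2000, seed=0):
    """Firm A minus other-firm flag-rate gap (pp) with a clustered 95% interval.

    Whole accountants are resampled with replacement, so signatures from the
    same accountant always enter or leave a replicate together.
    """
    flag = np.asarray(flag, dtype=float)
    is_firm_a = np.asarray(is_firm_a, dtype=bool)
    _, inv = np.unique(np.asarray(accountant), return_inverse=True)
    k = int(inv.max()) + 1

    # Per-accountant sufficient statistics: flagged count and signature count.
    hits = np.bincount(inv, weights=flag, minlength=k)
    size = np.bincount(inv, minlength=k).astype(float)
    acct_is_a = np.bincount(inv, weights=is_firm_a.astype(float), minlength=k) > 0
    a_idx = np.flatnonzero(acct_is_a)
    r_idx = np.flatnonzero(~acct_is_a)

    def gap(a, r):
        return 100.0 * (hits[a].sum() / size[a].sum()
                        - hits[r].sum() / size[r].sum())

    rng = np.random.default_rng(seed)
    gaps = np.empty(n_boot)
    for b in range(n_boot):
        a = rng.choice(a_idx, size=a_idx.size, replace=True)
        r = rng.choice(r_idx, size=r_idx.size, replace=True)
        gaps[b] = gap(a, r)

    lo, hi = np.percentile(gaps, [2.5, 97.5])
    return gap(a_idx, r_idx), float(lo), float(hi)


point, lo, hi = clustered_bootstrap_gap(
    flag=[1, 1, 0, 1, 0, 0, 0, 1],
    is_firm_a=[True, True, True, True, False, False, False, False],
    accountant=["a1", "a1", "a2", "a2", "b1", "b1", "b2", "b2"],
    n_boot=500,
)
print(point, lo, hi)
```

Resampling at the accountant level rather than the signature level keeps the
interval honest when signatures from one accountant are strongly correlated.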

Structure/minor: abstract restructured (M1); operational definition (M2);
interview disclaimer (M3); Threats to Validity subsection (M8); review
protocol framed as design not evidence (M9); N reconciliations (M10/M11);
Table II-c 2020-23 five-way (M12); Section refs, American spelling,
notation table (M5/M13/M15); reference URLs verified (M14).

Open (author-only): placeholders (M13), II-b/IV table merge (M15).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Qn59FdF9JMyfFg3sjcUNNG
2026-06-23 14:36:51 +08:00
gbanyan 66c9194fcf Paper A v13: filled submission draft (rev7) + reproducible build bundle
Fill all 18 placeholders in the condensed v13 submission draft with
data verified against the analysis DB and LOCKED canonical scripts;
close 12/13 co-author review items (only #8b protocol first-run open).

Key changes (need co-author sign-off; see handoff doc):
- Firm A out-of-sample HC 0.01% -> 0.42% (the 0.0001 figure came from a
  same-pair bug in Script 49 that propagated v4.2->v13; never reuse 0.0001)
- §III-D empty cell: ~=0 -> 7,681, honestly reframed (not degenerate crops)
- low cosine cut 0.837 -> 0.8547 primary (BCD 2013-2019 closed-world,
  held-out discipline; 0.8489 confirmed = BCD all-period); HC/MC/HSC
  unchanged, UN/LH move <=0.4pp

Adds Figures 1-5 (real-data plots + schematics), full references,
Appendix A/B, UN/HSC ICCR, n-reconciliation, #13 MOPS-metadata
survival verification, "參" set-level feasibility probe (negative).
Two codex (gpt-5.5) adversarial rounds applied; no fabrication found.

Bundle: paper/v13_build/ (markdown source, harvest/figure scripts,
figures) for reproducibility. Handoff note for co-author included.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 03:24:50 +08:00