Commit Graph

3 Commits

Author SHA1 Message Date
gbanyan 6946baa096 Paper A v3.6: codex round-5 quick-wins cleanup (Minor Revision)
Codex gpt-5.4 round-5 (codex_review_gpt54_v3_5.md) verdict was Minor
Revision - all v3.4 round-4 PARTIAL/UNFIXED items now confirmed
RESOLVED, including line-by-line recomputation of Table XI z/p
matching the manuscript values. This commit cleans the remaining
quick-win items:

Table IX numerical sync to Script 24 authoritative values
- Seven count corrections: cos>0.837 (60,405->60,408), cos>0.945
  (57,131/94.52% -> 56,836/94.02%, was 295 sigs / 0.50 pp off),
  cos>0.973 (48,910/80.91% -> 48,028/79.45%, was 882 sigs / 1.46 pp
  off), cos>0.95 (55,916->55,922), dh<=8 (57,521->57,527),
  dh<=15 (60,345->60,348), dual (54,373->54,370).
- Threshold label cos>0.941 -> cos>0.9407 (use exact calib-fold P5
  rather than rounded value).
- "dHash_indep <= 5 (calib-fold median-adjacent)" relabeled to
  "(whole-sample upper-tail of mode)" to match what III-L explains.
- Added "(operational dual)" / "(style-consistency boundary)" labels
  for unambiguous mapping into III-L category definitions.
- Removed circularity-language footnote inside the table comment.
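
The count and percentage-point gaps quoted in the Table IX bullet can be
rechecked mechanically. The following is a minimal sketch, not part of
Script 24: the dictionary layout and the helper names count_deltas and
pp_deltas are hypothetical, and only the numbers are copied from this
commit message.

```python
# Hypothetical re-check of the Table IX corrections listed above.
# Keys are the rule labels from the commit message; values are (old, new).
COUNT_CORRECTIONS = {
    "cos>0.837": (60405, 60408),
    "cos>0.945": (57131, 56836),
    "cos>0.973": (48910, 48028),
    "cos>0.95": (55916, 55922),
    "dh<=8": (57521, 57527),
    "dh<=15": (60345, 60348),
    "dual": (54373, 54370),
}

# Percentages are quoted in the commit message for two rows only.
PERCENT_CORRECTIONS = {
    "cos>0.945": (94.52, 94.02),
    "cos>0.973": (80.91, 79.45),
}


def count_deltas(corrections):
    """Return new - old for every corrected count."""
    return {rule: new - old for rule, (old, new) in corrections.items()}


def pp_deltas(corrections):
    """Return new - old in percentage points, rounded to 2 decimals."""
    return {rule: round(new - old, 2) for rule, (old, new) in corrections.items()}


if __name__ == "__main__":
    for rule, delta in count_deltas(COUNT_CORRECTIONS).items():
        print(f"{rule}: {delta:+d} signatures")
    for rule, delta in pp_deltas(PERCENT_CORRECTIONS).items():
        print(f"{rule}: {delta:+.2f} pp")
```

The two large corrections come out as 295 and 882 signatures (0.50 and
1.46 pp), matching the "was ... off" figures stated above.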

Circularity overclaim removed paper-wide
- Methodology III-K (Section 3 anchor): "we break the resulting
  circularity" -> "we make the within-Firm-A sampling variance
  visible".
- Results IV-G.2 subsection title: "(breaks calibration-validation
  circularity)" -> "(within-Firm-A sampling variance disclosure)".
- Combined with the v3.5 Abstract / Conclusion edits, no surviving
  use of circular* anywhere in the paper.

export_v3.py title page now single-anonymized
- Removed "[Authors removed for double-blind review]" placeholder
  (IEEE Access uses single-anonymized review).
- Replaced with explicit "[AUTHOR NAMES - fill in before submission]"
  + affiliation placeholder so the requirement is unmissable.
- Subtitle now reads "single-anonymized review".

III-G stale "cosine-conditional dHash" sentence removed
- After the v3.5 III-L rewrite to dh_indep, the sentence at
  Methodology L131 referencing "cosine-conditional dHash used as a
  diagnostic elsewhere" no longer described any current paper usage.
- Replaced with a positive statement that dh_indep is the dHash
  statistic used throughout the operational classifier and all
  reported capture-rate analyses.

Abstract trimmed 247 -> 242 words for IEEE 250-word safety margin
- "an end-to-end pipeline" -> "a pipeline"; "Unlike signature
  forgery" -> "Unlike forgery"; "we report" passive recast; small
  conjunction trims.

Outstanding items deferred (require user decision / larger scope):
- BD/McCrary: either substantiate (Z/p table + bin-width robustness)
  or demote to supplementary diagnostic.
- Visual-inspection protocol disclosure (sample size, rater count,
  blinding, adjudication rule).
- Reproducibility appendix (VLM prompt, HSV thresholds, seeds, EM
  init / stopping / boundary handling).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 12:41:11 +08:00
gbanyan 12f716ddf1 Paper A v3.5: resolve codex round-4 residual issues
Fully addresses the partial-resolution / unfixed items from codex
gpt-5.4 round-4 review (codex_review_gpt54_v3_4.md):

Critical
- Table XI z/p columns now reproduce from displayed counts. Earlier
  table had 1-4-unit transcription errors in k values and a fabricated
  cos > 0.9407 calibration row; both fixed by rerunning Script 24
  with cos = 0.9407 added to COS_RULES and copying exact values from
  the JSON output.
- Section III-L classifier now defined entirely in terms of the
  independent-minimum dHash statistic that the deployed code (Scripts
  21, 23, 24) actually uses; the legacy "cosine-conditional dHash"
  language is removed. Tables IX, XI, XII, XVI are now arithmetically
  consistent with the III-L classifier definition.
- "0.95 not calibrated to Firm A" inconsistency reconciled: Section
  III-H now correctly says 0.95 is the whole-sample Firm A P95 of the
  per-signature cosine distribution, matching III-L and IV-F.
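
A reader who wants to confirm that a z/p pair follows from displayed counts
could use something like the sketch below. It ASSUMES a pooled
two-proportion z-test with a two-sided normal p-value; this commit message
does not state which test Table XI uses, so the test choice, the function
name two_proportion_z, and the example counts are all illustrative, not
taken from Script 24.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z-test (assumed test, see note above).

    k1/n1 and k2/n2 are the capture counts and fold sizes being compared.
    Returns (z, two-sided p-value under the standard normal).
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("fold sizes must be positive")
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        raise ValueError("standard error is zero (all or no captures)")
    z = (p1 - p2) / se
    # Two-sided p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    z, p = two_proportion_z(60, 100, 40, 100)
    print(f"z = {z:.4f}, p = {p:.6f}")
```

Feeding the displayed k and n of each table row through one function like
this is what makes small transcription errors in k visible.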

Major
- Abstract trimmed to 246 words (from 367) to meet IEEE Access 250-word
  limit. Removed "we break the circularity" overclaim; replaced with
  "report capture rates on both folds with Wilson 95% intervals to
  make fold-level variance visible".
- Conclusion mirrors the Abstract reframe: 70/30 split documents
  within-firm sampling variance, not external generalization.
- Introduction no longer promises precision / F1 / EER metrics that
  Methods/Results don't deliver; replaced with anchor-based capture /
  FAR + Wilson CI language.
- Section III-G within-auditor-year empirical-check wording corrected:
  intra-report consistency (IV-H.3) is a different test (two co-signers
  on the same report, firm-level homogeneity) and is not a within-CPA
  year-level mixing check; the assumption is maintained as a bounded
  identification convention.
- Section III-H "two analyses fully threshold-free" corrected to "only
  the partner-level ranking is threshold-free"; longitudinal-stability
  uses 0.95 cutoff, intra-report uses the operational classifier.
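
The Wilson 95% intervals mentioned in the Abstract item can be computed from
a capture count and a fold size as sketched below. This is the standard
Wilson score interval, written as a standalone helper; the name
wilson_interval and the example counts are hypothetical and are not taken
from the paper's scripts.

```python
import math

# Two-sided 95% normal quantile.
Z_95 = 1.959963984540054


def wilson_interval(k, n, z=Z_95):
    """Wilson score interval for a binomial proportion k/n.

    Returns (lower, upper) bounds on the capture rate.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    p_hat = k / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Made-up fold counts, for illustration only.
    lo, hi = wilson_interval(50, 100)
    print(f"capture rate 50/100: [{lo:.4f}, {hi:.4f}]")
```

Unlike the plain normal-approximation interval, the Wilson interval stays
inside [0, 1] when the capture rate is close to 0 or 1, which is the regime
most of the anchor-based rates in this paper sit in.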

Minor
- Impact Statement removed from export_v3.py SECTIONS list (IEEE Access
  Regular Papers do not have a standalone Impact Statement). The file
  itself is retained as an archived non-paper note for cover-letter /
  grant-report reuse, with a clear archive header.
- All 7 previously unused references ([27] dHash, [31][32] partner-
  signature mandates, [33] Taiwan partner rotation, [34] YOLO original,
  [35] VLM survey, [36] Mann-Whitney) are now cited in-text:
    [27] in Methodology III-E (dHash definition)
    [31][32][33] in Introduction (audit-quality regulation context)
    [34][35] in Methodology III-C/III-D
    [36] in Results IV-C (Mann-Whitney result)

Updated Script 24 to include cos = 0.9407 in COS_RULES so Table XI's
calibration-fold P5 row is computed from the same data file as the
other rows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 12:23:03 +08:00
gbanyan 9b11f03548 Paper A v3: full rewrite for IEEE Access with three-method convergence
Major changes from v2:

Terminology:
- "digitally replicated" -> "non-hand-signed" throughout (per partner v3
  feedback and to avoid implicit accusation)
- "Firm A near-universal non-hand-signing" -> "replication-dominated"
  (per interview nuance: most but not all Firm A partners use replication)

Target journal: IEEE TAI -> IEEE Access (per NCKU CSIE list)

New methodological sections (III.G-III.L + IV.D-IV.G):
- Three convergent threshold methods (KDE antimode + Hartigan dip test /
  Burgstahler-Dichev McCrary / EM-fitted Beta mixture + logit-GMM
  robustness check)
- Explicit unit-of-analysis discussion (signature vs accountant)
- Accountant-level 2D Gaussian mixture (BIC-best K=3 found empirically)
- Pixel-identity validation anchor (no manual annotation needed)
- Low-similarity negative anchor + Firm A replication-dominated anchor

New empirical findings integrated:
- Firm A signature cosine UNIMODAL (dip p=0.17) - long left tail = minority
  hand-signers
- Full-sample cosine MULTIMODAL but not cleanly bimodal (BIC prefers 3-comp
  mixture) - signature-level is continuous quality spectrum
- Accountant-level mixture trimodal (C1 Deloitte-heavy 139/141,
  C2 other Big-4, C3 smaller firms). 2-comp crossings cos=0.945, dh=8.10
- Pixel-identity anchor (310 pairs) gives perfect recall at all cosine
  thresholds
- Firm A anchor rates: cos>0.95=92.5%, dual-rule cos>0.95 AND dh<=8=89.95%
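
The dual rule quoted above (cos>0.95 AND dh<=8) can be written as a small
predicate plus a capture-rate helper. This is a sketch under assumptions:
the function names and the (cosine, dHash distance) pair layout are
hypothetical, and the sketch does not say how each signature's cosine and
dHash statistics are derived in the actual scripts.

```python
def dual_rule(cos_sim, dhash_dist, cos_cut=0.95, dh_cut=8):
    """True when a signature passes both cutoffs (cos > cut AND dh <= cut).

    Default cutoffs are the ones quoted in the commit message.
    """
    return cos_sim > cos_cut and dhash_dist <= dh_cut


def capture_rate(pairs, cos_cut=0.95, dh_cut=8):
    """Fraction of (cos_sim, dhash_dist) pairs captured by the dual rule."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no signatures supplied")
    hits = sum(dual_rule(c, d, cos_cut, dh_cut) for c, d in pairs)
    return hits / len(pairs)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    sample = [(0.96, 8), (0.95, 3), (0.99, 9), (0.97, 2)]
    print(f"dual-rule capture rate: {capture_rate(sample):.2f}")
```

Note the asymmetric boundaries: the cosine cutoff is strict (a value of
exactly 0.95 fails) while the dHash cutoff is inclusive (a distance of
exactly 8 passes), following the ">" and "<=" in the rule as written.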

New discussion section V.B: "Continuous-quality spectrum vs discrete-
behavior regimes" - the core interpretive contribution of v3.

References added: Hartigan & Hartigan 1985, Burgstahler & Dichev 1997,
McCrary 2008, Dempster-Laird-Rubin 1977, White 1982 (refs 37-41).

export_v3.py builds Paper_A_IEEE_Access_Draft_v3.docx (462 KB, +40% vs v2
from expanded methodology + results sections).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 00:14:47 +08:00