Phase 6 round-2 reviewer revisions: §III-H.1 promotion + framing alignment

Structural:
- Promote operational classifier definition from §III-L.0 to new §III-H.1, so
  the reader meets the five-way HC/MC/HSC/UN/LH rule before the §III-I/J/K
  diagnostic chain rather than ~130 lines after it. §III-L renamed to
  "Anchor-Based Threshold Calibration"; §III-L.0 retains only calibration
  methodology, three units of analysis, any-pair semantics, and the FAR
  terminological note. §III-L.7 deleted (redundant with §III-J).
- Reorganise §V-H Limitations into Primary / Secondary / Documented features /
  Engineering groupings (was a flat 14-item list).
- Reframe §III-M from "ten-tool unsupervised-validation collection" to
  "each diagnostic addresses one specific unsupervised failure mode";
  rename "What v4.0 does/does not claim" → "Limits / Scope of the present
  analysis"; retitle Table XXVII.

Framing alignment (cross-section):
- Strip all v3.x / v4.0 / v3.20 / v4-new / inherited lineage labels from
  rendered text (Abstract, Intro, §II, §III, §IV, §V, §VI, Appendix, Impact).
- Replace "Paper A" rule references with "deployed" rule references.
- Soften "validation" to "characterise" / "check" / "screening label" /
  "consistency check" / "support"; "verdict" → "screening label".
- Remove codex-verified spike claims (non-Big-4 jittered dHash, Big-4 pooled
  cosine after firm-mean centring). Only formally scripted evidence
  (Scripts 39b–39e) retained; non-Big-4 evidence framed as corroborating
  raw-axis cosine, not as calibration evidence.
- Strip script-provenance parentheticals from Introduction; defer Script 39c
  internal references and similar to Methodology / Appendix.

Numerical / table fixes:
- §III-C document-count arithmetic: 12 corrupted → 13 corrupted/unreadable,
  verified against the SQLite DB and total-pdf/ folder counts. Funnel:
  90,282 - 4,198 no-sig = 86,084 VLM-positive; 86,084 - 13 corrupted = 86,071
  processed for extraction → 85,042 with detections → 182,328 sigs →
  168,755 CPA-matched. Table I shows VLM-positive (86,084) and
  processed-for-extraction (86,071) as separate rows.
- Wilson 95% CIs added for joint-rule ICCR rows in Table XXI / methodology
  table ([0.00011, 0.00018] and [0.00008, 0.00014]).
- Unit error fixed: 0.3856 pp / 0.4431 pp → 0.3856 (38.6 pp) / 0.4431 (44.3 pp).
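
The three numerical fixes above can be sanity-checked with a short sketch. It assumes that the VLM-positive row equals total PDFs minus no-signature PDFs (consistent with the quoted 86,084) and uses the standard Wilson score interval. The function names are hypothetical, and the counts behind the reported ICCR intervals are not stated in this message, so none are reproduced here.

```python
import math


def document_funnel(total, no_sig, corrupted):
    """Return (vlm_positive, processed) for the two separate Table I rows.

    Assumption: VLM-positive = total - no-signature documents.
    """
    vlm_positive = total - no_sig
    processed = vlm_positive - corrupted
    return vlm_positive, processed


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for k events in n trials (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


def as_percentage_points(fraction):
    """Express a difference of proportions (e.g. 0.3856) in percentage points."""
    return round(fraction * 100, 1)


# Counts quoted in the commit message.
vlm_positive, processed = document_funnel(90_282, 4_198, 13)
```

A call such as `wilson_interval(k, n)` would take the joint-rule coincidence count and the number of inter-CPA comparisons as `k` and `n`; both would have to come from the scripted outputs.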

Smaller revisions:
- Pipeline framing: "detecting" → "screening" in Abstract / Intro / Conclusion
  for consistency with the unsupervised-screening positioning.
- "hard ground-truth subset" → "conservative hard-positive subset" throughout.
- §III-F SSIM / pixel-comparison rebuttal compressed from ~15 lines to 4;
  design-level argument deferred to supplementary materials.
- "stakeholders can adopt / can derive thresholds" → "alternative operating
  points can be characterised by inverting" (less prescriptive).
- "the same mechanism extending in milder form to Firms B/C/D" → "similar,
  milder production-related reuse patterns at Firms B/C/D" (mechanism claim
  softened).
- Appendix A "non-hand-signed mode" / "two-mechanism mixture" lineage language
  aligned with v4 framing.

Appendix B:
- Rebuilt as a redirect-only stub. The HTML-commented obsolete table mapping
  (Table IX–XVIII labels with FAR / capture-rate / validation language) is
  removed; replaced with a short paragraph pointing to supplementary
  materials for full table-to-script provenance.

Cross-references:
- All §III-L references for the rule definition retargeted to §III-H.1;
  references for calibration still point to §III-L.
- §III-H references for byte-level Firm A evidence / non-Big-4 reverse anchor
  retargeted to §III-H.2.

Artefacts:
- Combined manuscript regenerated: paper_a_v4_combined.md, 1314 lines
  (was 1346 pre-review).
- Two review handoff documents added:
  paper/review_handoff_abstract_intro_20260515.md
  paper/review_handoff_body_20260515.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 18:07:31 +08:00
parent 12637cd413
commit b6913d2f93
13 changed files with 2267 additions and 227 deletions
+3 -3
@@ -4,7 +4,7 @@ ARCHIVED. Not part of the IEEE Access submission.
IEEE Access Regular Papers do not include a separate Impact Statement
section. The text below is retained for possible reuse in a cover
letter, grant report, or non-IEEE venue. It is excluded from the
-assembled paper by export_v3.py.
+assembled paper by the manuscript export script.
If reused, note that the wording "distinguishes genuinely hand-signed
signatures from reproduced ones" overstates what a five-way confidence
@@ -17,5 +17,5 @@ external use.
Auditor signatures on financial reports are a key safeguard of corporate accountability.
When the signature on an audit report is produced by reproducing a stored image instead of by the partner's own hand---whether through an administrative stamping workflow or a firm-level electronic signing system---this safeguard is weakened, yet detecting the practice through manual inspection is infeasible at the scale of modern financial markets.
We developed a pipeline that automatically extracts and analyzes signatures from over 90,000 audit reports spanning a decade of filings by publicly listed companies in Taiwan.
-Combining deep-learning visual features with perceptual hashing and two methodologically distinct threshold estimators (plus a density-smoothness diagnostic), the system stratifies signatures into a five-way confidence-graded classification and quantifies how the practice varies across firms and over time.
-After further validation, the technology could support financial regulators in screening signature authenticity at national scale.
+Combining deep-learning visual features with perceptual hashing, distributional diagnostics, and anchor-based inter-CPA coincidence-rate calibration, the system stratifies signatures into a five-way confidence-graded classification and quantifies how the practice varies across firms and over time.
+With a future labelled evaluation set, the technology could support financial regulators in screening candidate non-hand-signed signatures at national scale.