Codex (gpt-5.4) second-round review recommended 'minor revision'. This commit addresses all issues flagged in that review.

## Structural fixes

- dHash calibration inconsistency (codex #1, most important): Clarified in Section III-L that the <=5 and <=15 dHash cutoffs come from the whole-sample Firm A cosine-conditional dHash distribution (median=5, P95=15), not from the calibration-fold independent-minimum dHash distribution (median=2, P95=9), which we report elsewhere as descriptive anchors. Added an explicit note about the two dHash conventions and their relationship.
- Section IV-H framing (codex #2): Renamed "Firm A Benchmark Validation: Threshold-Independent Evidence" to "Additional Firm A Benchmark Validation" and clarified in the section intro that H.1 uses a fixed 0.95 cutoff, H.2 is fully threshold-free, and H.3 uses the calibrated classifier. H.3's concluding sentence now reads "the substantive evidence lies in the cross-firm gap" rather than claiming the test is threshold-free.
- Table XVI 93,979 typo fixed (codex #3): Corrected to 84,354 total (83,970 same-firm + 384 mixed-firm).
- Held-out Firm A denominator 124+54=178 vs 180 (codex #4): Added an explicit note that 2 CPAs were excluded due to disambiguation ties in the CPA registry.
- Table VIII duplication (codex #5): Removed the duplicate accountant-level-only Table VIII; the comprehensive cross-level Table VIII subsumes it. Text now says "accountant-level rows of Table VIII (below)".
- Anonymization broken in Tables XIV-XVI (codex #6): Replaced "Deloitte"/"KPMG"/"PwC"/"EY" with "Firm A"/"Firm B"/"Firm C"/"Firm D" across Tables XIV, XV, and XVI. Table and caption language updated accordingly.
- Table X unit mismatch (codex #7): Dropped the precision, recall, and F1 columns. The table now reports FAR (against the inter-CPA negative anchor) with Wilson 95% CIs and FRR (against the byte-identical positive anchor). III-K and IV-G.1 text updated to justify the change.
## Sentence-level fixes

- "three independent statistical methods" in Methodology III-A -> "three methodologically distinct statistical methods".
- "three independent methods" in Conclusion -> "three methodologically distinct methods".
- Abstract "~0.006 converging" now explicitly acknowledges that BD/McCrary produces no significant accountant-level discontinuity; the Conclusion was updated likewise.
- Discussion limitation sentence "BD/McCrary should be interpreted at the accountant level for threshold-setting purposes" rewritten to reflect the v3.3 result that BD/McCrary is a diagnostic, not a threshold estimator, at the accountant level.
- III-H "two analyses" -> "three analyses" (H.1 longitudinal stability, H.2 partner ranking, H.3 intra-report consistency).
- Related Work White 1982 overclaim rewritten: "consistent estimators of the pseudo-true parameter that minimizes KL divergence" replaces "guarantees asymptotic recovery".
- III-J "behavior is close to discrete" -> "practice is clustered".
- IV-D.2 pivot sentence "discreteness of individual behavior yields bimodality" -> "aggregation over signatures reveals clustered (though not sharply discrete) patterns".

Target journal remains IEEE Access. Output: Paper_A_IEEE_Access_Draft_v3.docx (395 KB). Codex v3.2 review saved to paper/codex_review_gpt54_v3_2.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Abstract
Regulations in many jurisdictions require Certified Public Accountants (CPAs) to attest to each audit report they certify, typically by affixing a signature or seal.
However, the digitization of financial reporting makes it straightforward to reuse a stored signature image across multiple reports---whether by administrative stamping or firm-level electronic signing systems---potentially undermining the intent of individualized attestation.
Unlike signature forgery, where an impostor imitates another person's handwriting, non-hand-signed reproduction involves the legitimate signer's own stored signature image being reproduced on each report, a practice that is visually invisible to report users and infeasible to audit at scale through manual inspection.
We present an end-to-end AI pipeline that automatically detects non-hand-signed auditor signatures in financial audit reports.
The pipeline integrates a Vision-Language Model for signature page identification, YOLOv11 for signature region detection, and ResNet-50 for deep feature extraction, followed by a dual-descriptor verification combining cosine similarity of deep embeddings with difference hashing (dHash).
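To make the dual-descriptor verification concrete, here is a minimal pure-Python sketch. The function names, the AND-fusion decision rule, and the thresholds (cosine 0.975, dHash distance 5, both quoted elsewhere in this document) are illustrative assumptions, not the paper's exact implementation; in practice the embeddings come from ResNet-50 and the grayscale grid from resizing the detected signature crop to 9x8.

```python
def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (e.g. deep features)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def dhash_bits(gray):
    """dHash over a grayscale grid already resized to 9 columns x 8 rows:
    each of the 64 bits records whether a pixel is brighter than its
    right-hand neighbour."""
    return [1 if row[j] > row[j + 1] else 0 for row in gray for j in range(8)]

def hamming(h1, h2):
    """Number of differing bits between two dHash bit strings."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))

def is_reused(emb_a, emb_b, gray_a, gray_b, cos_thresh=0.975, dhash_thresh=5):
    """Illustrative dual-descriptor decision: flag a pair as a likely reused
    (non-hand-signed) signature only when BOTH descriptors agree.
    Thresholds and the AND rule are assumptions for this sketch."""
    cos_ok = cosine_similarity(emb_a, emb_b) >= cos_thresh
    dhash_ok = hamming(dhash_bits(gray_a), dhash_bits(gray_b)) <= dhash_thresh
    return cos_ok and dhash_ok
```

The point of pairing the two descriptors is complementarity: cosine similarity of deep embeddings tolerates rescaling and compression, while dHash is sensitive to pixel-level layout, so an identical stored image passes both checks whereas two genuinely hand-signed instances rarely do.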
For threshold determination we apply three methodologically distinct methods---Kernel Density antimode with a Hartigan unimodality test, Burgstahler-Dichev/McCrary discontinuity, and EM-fitted Beta mixtures with a logit-Gaussian robustness check---at both the signature level and the accountant level.
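Of the three estimators, the KDE antimode is the simplest to illustrate: smooth the score distribution with a Gaussian kernel and take the density minimum between the two clusters as the threshold candidate. The sketch below assumes a fixed bandwidth and grid (the paper's KDE settings are not reproduced here), and omits the Hartigan dip test and the two mixture-based estimators.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a Gaussian kernel density estimate as a callable density."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2)
                          for d in data)
    return density

def kde_antimode(data, lo, hi, bandwidth=0.01, steps=400):
    """Grid-search the density minimum (antimode) on [lo, hi]; used here as
    an illustrative threshold candidate between two score clusters.
    Bandwidth and grid resolution are assumptions for this sketch."""
    f = gaussian_kde(data, bandwidth)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=f)
```

For example, scores clustered near 0.90 (hand-signed pairs) and 0.99 (reused images) yield an antimode near their midpoint, which is the role the cosine ≈ 0.975 accountant-level threshold plays in the paper.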
Applied to 90,282 audit reports filed in Taiwan over 2013–2023 (182,328 signatures from 758 CPAs), the methods reveal an informative asymmetry: signature-level similarity forms a continuous quality spectrum that no two-component mixture cleanly separates, while accountant-level aggregates cluster into three recognizable groups (BIC-best K = 3), with the KDE antimode and the two mixture-based estimators converging within ~0.006 of each other at cosine ≈ 0.975; the Burgstahler-Dichev/McCrary test produces no significant discontinuity at the accountant level, consistent with clustered-but-smooth rather than sharply discrete accountant-level heterogeneity.
A major Big-4 firm serves as a replication-dominated (not pure) calibration anchor, with interview and visual evidence supporting majority non-hand-signing and a minority of hand-signers; we break the circularity of using the same firm for both calibration and validation with a 70/30 CPA-level held-out fold.
Validation against 310 byte-identical positive signatures and a ~50,000-pair inter-CPA negative anchor yields FAR ≤ 0.001 with Wilson 95% confidence intervals at all accountant-level thresholds.
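The Wilson score interval quoted above has a closed form. The following is a standard textbook implementation, not code from the paper; applied to zero false accepts over ~50,000 negative pairs, its upper bound stays below 0.001, which is what makes the FAR claim meaningful even when the observed error count is zero.

```python
import math

def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion, e.g. the false-accept
    rate over n negative-anchor pairs. z ~= 1.96 gives a 95% interval."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return max(0.0, center - half), min(1.0, center + half)
```

Unlike the normal-approximation ("Wald") interval, the Wilson interval does not collapse to a zero-width band when the observed proportion is 0, so it is the appropriate choice for rare-error validation of this kind.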
To our knowledge, this represents the largest-scale forensic analysis of auditor signature authenticity reported in the literature.