Paper A v3: full rewrite for IEEE Access with three-method convergence

Major changes from v2:

Terminology:
- "digitally replicated" -> "non-hand-signed" throughout (per partner v3 feedback and to avoid implicit accusation)
- "Firm A near-universal non-hand-signing" -> "replication-dominated" (per interview nuance: most but not all Firm A partners use replication)

Target journal: IEEE TAI -> IEEE Access (per NCKU CSIE list)

New methodological sections (III.G-III.L + IV.D-IV.G):
- Three convergent threshold methods (KDE antimode + Hartigan dip test / Burgstahler-Dichev McCrary / EM-fitted Beta mixture + logit-GMM robustness check)
- Explicit unit-of-analysis discussion (signature vs accountant)
- Accountant-level 2D Gaussian mixture (BIC-best K=3 found empirically)
- Pixel-identity validation anchor (no manual annotation needed)
- Low-similarity negative anchor + Firm A replication-dominated anchor

New empirical findings integrated:
- Firm A signature cosine UNIMODAL (dip p=0.17) - long left tail = minority hand-signers
- Full-sample cosine MULTIMODAL but not cleanly bimodal (BIC prefers 3-comp mixture) - signature-level is continuous quality spectrum
- Accountant-level mixture trimodal (C1 Deloitte-heavy 139/141, C2 other Big-4, C3 smaller firms). 2-comp crossings cos=0.945, dh=8.10
- Pixel-identity anchor (310 pairs) gives perfect recall at all cosine thresholds
- Firm A anchor rates: cos>0.95=92.5%, dual-rule cos>0.95 AND dh<=8=89.95%

New discussion section V.B: "Continuous-quality spectrum vs discrete-behavior regimes" - the core interpretive contribution of v3.

References added: Hartigan & Hartigan 1985, Burgstahler & Dichev 1997, McCrary 2008, Dempster-Laird-Rubin 1977, White 1982 (refs 37-41).

export_v3.py builds Paper_A_IEEE_Access_Draft_v3.docx (462 KB, +40% vs v2 from expanded methodology + results sections).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 00:14:47 +08:00
parent 68689c9f9b
commit 9b11f03548
11 changed files with 1148 additions and 0 deletions
@@ -0,0 +1,104 @@
+# II. Related Work
+
+## A. Offline Signature Verification
+
+Offline signature verification---determining whether a static signature image is genuine or forged---has been studied extensively using deep learning.
+Bromley et al. [3] introduced the Siamese neural network architecture for signature verification, establishing the pairwise comparison paradigm that remains dominant.
+Hafemann et al. [14] demonstrated that deep CNN features learned from signature images provide strong discriminative representations for writer-independent verification, establishing the foundational baseline for subsequent work.
+Dey et al. [4] proposed SigNet, a convolutional Siamese network for writer-independent offline verification, extending this paradigm to generalize across signers without per-writer retraining.
+Hadjadj et al. [5] addressed the practical constraint of limited reference samples, achieving competitive verification accuracy using only a single known genuine signature per writer.
+More recently, Li et al. [6] introduced TransOSV, the first Vision Transformer-based approach to offline signature verification, reporting state-of-the-art results at the time of publication.
+Tehsin et al. [7] evaluated distance metrics for triplet Siamese networks, finding that Manhattan distance outperformed cosine and Euclidean alternatives.
+Zois et al. [15] proposed similarity distance learning on SPD manifolds for writer-independent verification, achieving robust cross-dataset transfer.
+Hafemann et al. [16] further addressed the practical challenge of adapting to new users through meta-learning, reducing the enrollment burden for signature verification systems.
+
+A common thread in this literature is the assumption that the primary threat is *identity fraud*: a forger attempting to produce a convincing imitation of another person's signature.
+Our work addresses a fundamentally different problem---detecting whether the *legitimate signer's* stored signature image has been reproduced across many documents---which requires analyzing the upper tail of the intra-signer similarity distribution rather than modeling inter-signer discriminability.
+
+Brimoh and Olisah [8] proposed a consensus-threshold approach that derives classification boundaries from known genuine reference pairs, the methodology most closely related to our calibration strategy.
+However, their method operates on standard verification benchmarks with laboratory-collected signatures, whereas we calibrate thresholds on a replication-dominated subpopulation, identified through domain expertise, within real-world regulatory documents.
+
+## B. Document Forensics and Copy Detection
+
+Image forensics encompasses a broad range of techniques for detecting manipulated visual content [17], with recent surveys highlighting the growing role of deep learning in forgery detection [18].
+Copy-move forgery detection (CMFD) identifies duplicated regions within or across images, typically targeting manipulated photographs [11].
+Abramova and Böhme [10] adapted block-based CMFD to scanned text documents, noting that standard methods perform poorly in this domain because legitimate character repetitions produce high similarity scores that confound duplicate detection.
+
+Woodruff et al. [9] developed the work most closely related to ours: a fully automated pipeline for extracting and analyzing signatures from corporate filings in the context of anti-money-laundering investigations.
+Their system uses connected component analysis for signature detection, GANs for noise removal, and Siamese networks for author clustering.
+While their pipeline shares our goal of large-scale automated signature analysis on real regulatory documents, their objective---grouping signatures by authorship---differs fundamentally from ours, which is detecting image-level reproduction within a single author's signatures across documents.
+
+In the domain of image copy detection, Pizzi et al. [13] proposed SSCD, a self-supervised descriptor using ResNet-50 with contrastive learning for large-scale copy detection on natural images.
+Their work demonstrates that pre-trained CNN features with cosine similarity provide a strong baseline for identifying near-duplicate images, a finding that supports our feature-extraction approach.
+
+## C. Perceptual Hashing
+
+Perceptual hashing algorithms generate compact fingerprints that are robust to minor image transformations while remaining sensitive to substantive content changes [19].
+Unlike cryptographic hashes, which change entirely with any pixel modification, perceptual hashes produce similar outputs for visually similar inputs, making them suitable for near-duplicate detection in scanned documents where minor variations arise from the scanning process.
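
The contrast with cryptographic hashing can be made concrete with a minimal sketch of a difference hash (dHash) compared by Hamming distance. This is an illustration under stated assumptions, not the pipeline used in this paper: the 64-bit hash size, the block-mean downscaling, and the helper names `block_mean_resize`, `dhash_bits`, and `hamming_distance` are all hypothetical choices made for the example.

```python
def block_mean_resize(gray, out_rows, out_cols):
    """Downscale a 2D grayscale grid by averaging rectangular blocks."""
    in_rows, in_cols = len(gray), len(gray[0])
    resized = []
    for r in range(out_rows):
        r0, r1 = r * in_rows // out_rows, (r + 1) * in_rows // out_rows
        row = []
        for c in range(out_cols):
            c0, c1 = c * in_cols // out_cols, (c + 1) * in_cols // out_cols
            # max(...) guarantees a non-empty block when upscaling
            block = [gray[i][j]
                     for i in range(r0, max(r1, r0 + 1))
                     for j in range(c0, max(c1, c0 + 1))]
            row.append(sum(block) / len(block))
        resized.append(row)
    return resized


def dhash_bits(gray, hash_size=8):
    """Difference hash: one bit per horizontally adjacent pair of cells."""
    small = block_mean_resize(gray, hash_size, hash_size + 1)
    return [int(row[c] > row[c + 1]) for row in small for c in range(hash_size)]


def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Synthetic 32x36 images (assumed data, for illustration only).
gradient = [[col for col in range(36)] for _ in range(32)]
brighter = [[value + 10 for value in row] for row in gradient]   # global offset
inverted = [[255 - value for value in row] for row in gradient]  # reversed content

same_structure = hamming_distance(dhash_bits(gradient), dhash_bits(brighter))
opposite_structure = hamming_distance(dhash_bits(gradient), dhash_bits(inverted))
```

A uniform brightness offset, of the kind a scanner might introduce, leaves every adjacent-cell comparison unchanged, so `same_structure` is zero. Reversing the gradient flips every comparison, so `opposite_structure` equals the full hash length.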
+
+Jakhar and Borah [12] demonstrated that combining perceptual hashing with deep learning features significantly outperforms either approach alone for near-duplicate image detection, achieving an AUROC of 0.99 on standard benchmarks.
+Their two-stage architecture---pHash for fast structural comparison followed by deep features for semantic verification---provides methodological precedent for our dual-descriptor approach, though applied to natural images rather than document signatures.
+
+Our work differs from prior perceptual-hashing studies in its application context and in the specific challenge it addresses: distinguishing legitimate high visual consistency (a careful signer producing similar-looking signatures) from image-level reproduction in scanned financial documents.
+
+## D. Deep Feature Extraction for Signature Analysis
+
+Several studies have explored pre-trained CNN features for signature comparison without metric learning or Siamese architectures.
+Engin et al. [20] used ResNet-50 features with cosine similarity for offline signature verification on real-world scanned documents, incorporating CycleGAN-based stamp removal as preprocessing---a pipeline design closely paralleling our approach.
+Tsourounis et al. [21] demonstrated successful transfer from handwritten text recognition to signature verification, showing that CNN features trained on related but distinct handwriting tasks generalize effectively to signature comparison.
+Chamakh and Bounouh [22] confirmed that a simple ResNet backbone with cosine similarity achieves competitive verification accuracy across multilingual signature datasets without fine-tuning, supporting the viability of our off-the-shelf feature-extraction approach.
+
+Babenko et al. [23] established that CNN-extracted neural codes with cosine similarity provide an effective framework for image retrieval and matching, a finding that underpins our feature-comparison approach.
+These findings collectively suggest that pre-trained CNN features, when L2-normalized and compared via cosine similarity, provide a robust and computationally efficient representation for signature comparison---particularly suitable for large-scale applications where the computational overhead of Siamese training or metric learning is impractical.
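
As a minimal sketch of the comparison step described above, the following shows L2 normalization followed by cosine similarity. It uses plain Python lists so that it runs without dependencies; the function names and the toy vectors are assumptions made for illustration, and real CNN descriptors would be much longer and typically handled with an array library.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot L2-normalize an all-zero feature vector")
    return [x / norm for x in vec]


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity, computed as the dot product of the unit vectors."""
    unit_a, unit_b = l2_normalize(vec_a), l2_normalize(vec_b)
    return sum(a * b for a, b in zip(unit_a, unit_b))


# Toy 2-D "features" (assumed values, for illustration only).
same_direction = cosine_similarity([3.0, 4.0], [6.0, 8.0])   # differ only in scale
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
diagonal = cosine_similarity([1.0, 0.0], [1.0, 1.0])
```

Because normalization removes vector magnitude, two descriptors that differ only by a scale factor score as identical. Normalizing once and storing the unit vectors reduces each later comparison to a single dot product, which is what makes the approach cheap at scale.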
+
+## E. Statistical Methods for Threshold Determination
+
+Our threshold-determination framework combines three families of methods developed in statistics and accounting-econometrics.
+
+*Non-parametric density estimation.*
+Kernel density estimation [28] provides a smooth estimate of a similarity distribution without parametric assumptions.
+Where the distribution is bimodal, the local density minimum (antimode) between the two modes is a natural decision boundary; it approximates the Bayes-optimal boundary when the two components have equal priors and comparable spread.
+Whether the distribution departs from unimodality can be tested independently via the Hartigan & Hartigan dip test [37], whose null hypothesis is unimodality; we use rejection of that null as a formal diagnostic of multimodality.
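
The antimode idea can be sketched in a few lines. The example below is illustrative only: it assumes a Gaussian kernel with Silverman's rule-of-thumb bandwidth, a fixed evaluation grid, and synthetic similarity scores, none of which are taken from this paper's experiments. The dip test itself is not implemented here.

```python
import math
import random
import statistics


def silverman_bandwidth(data):
    """Silverman's rule-of-thumb bandwidth for a Gaussian kernel."""
    spread = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 0.9 * min(spread, (q3 - q1) / 1.34) * len(data) ** -0.2


def kde_on_grid(data, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    scale = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return [scale * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)
            for x in grid]


def kde_antimode(data, grid_points=200):
    """Location of the density minimum between the two highest KDE peaks."""
    low, high = min(data), max(data)
    grid = [low + (high - low) * i / (grid_points - 1) for i in range(grid_points)]
    density = kde_on_grid(data, grid, silverman_bandwidth(data))
    peaks = [i for i in range(1, grid_points - 1)
             if density[i - 1] < density[i] >= density[i + 1]]
    if len(peaks) < 2:
        return None  # the estimate has a single interior peak: no antimode
    left, right = sorted(sorted(peaks, key=lambda i: density[i])[-2:])
    valley = min(range(left, right + 1), key=lambda i: density[i])
    return grid[valley]


# Synthetic two-regime similarity scores (assumed data, for illustration only).
random.seed(0)
scores = ([random.gauss(0.70, 0.04) for _ in range(300)]
          + [min(1.0, random.gauss(0.95, 0.01)) for _ in range(300)])
threshold = kde_antimode(scores)
```

The returned location depends on the bandwidth: heavier smoothing can merge the two modes and remove the antimode entirely, which is one reason to pair the estimate with an independent test of unimodality.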
+
+*Discontinuity tests on empirical distributions.*
+Burgstahler and Dichev [38], working in the accounting-disclosure literature, proposed a test for smoothness violations in empirical frequency distributions.
+Under the null that the distribution is generated by a single smooth process, the expected count in any histogram bin equals the average of its two neighbours, and the standardized deviation from this expectation is approximately $N(0,1)$.
+The test was placed on rigorous asymptotic footing by McCrary [39], whose density-discontinuity test provides full asymptotic distribution theory, bandwidth-selection rules, and power analysis.
+The BD/McCrary pairing is well suited to detecting the boundary between two generative mechanisms (non-hand-signed vs. hand-signed) under minimal distributional assumptions.
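
The bin-level statistic described above can be sketched as follows. This is an illustration under assumptions: the variance expression is the form commonly attributed to Burgstahler and Dichev and should be checked against [38] before use, and the function name and the example histograms are hypothetical. McCrary's local-linear density estimator is not implemented here.

```python
import math


def bd_standardized_differences(counts):
    """Standardized difference between each interior bin and its neighbours' mean.

    Returns a list of (bin_index, z) pairs for bins 1 .. len(counts) - 2.
    """
    total = sum(counts)
    results = []
    for i in range(1, len(counts) - 1):
        expected = (counts[i - 1] + counts[i + 1]) / 2.0
        p_mid = counts[i] / total
        p_adj = (counts[i - 1] + counts[i + 1]) / total
        # Assumed variance form: N p_i (1 - p_i)
        #   + (1/4) N (p_{i-1} + p_{i+1}) (1 - p_{i-1} - p_{i+1})
        variance = (total * p_mid * (1.0 - p_mid)
                    + 0.25 * total * p_adj * (1.0 - p_adj))
        z = (counts[i] - expected) / math.sqrt(variance) if variance > 0 else 0.0
        results.append((i, z))
    return results


# Hypothetical histograms: a smooth linear trend, and one with a dip at bin 2.
smooth = bd_standardized_differences([10, 20, 30, 40, 50])
dipped = bd_standardized_differences([40, 42, 10, 44, 46])
most_negative_bin, most_negative_z = min(dipped, key=lambda pair: pair[1])
```

On the linear histogram every interior bin equals the mean of its neighbours, so all statistics are zero. On the second histogram the deficit at bin 2 gives a large negative value, which under the approximate $N(0,1)$ reference would be flagged as a smoothness violation.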
+
+*Finite mixture models.*
+When the empirical distribution is viewed as a weighted sum of two (or more) latent component distributions, the Expectation-Maximization algorithm [40] provides an iterative procedure for maximum-likelihood estimation of the component parameters, with each iteration guaranteed not to decrease the likelihood.
+For observations bounded on $[0,1]$---such as cosine similarity and normalized Hamming-based dHash similarity---the Beta distribution is the natural parametric choice, with applications spanning bioinformatics and Bayesian estimation.
+White's quasi-MLE consistency result [41] shows that, under mild regularity conditions, the maximum-likelihood estimator converges to the Beta-family parameters that minimize the Kullback-Leibler divergence from the true distribution, so the fit recovers the best Beta-family approximation even when the true distribution is not exactly Beta.
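
A two-component Beta mixture fit can be sketched as below. This is a simplified, EM-style illustration and not the estimator used in this paper: the M-step uses weighted moment matching rather than exact weighted maximum likelihood (which would need a numerical optimizer), the median-split initialization is an arbitrary choice, and the function names and synthetic sample are assumptions.

```python
import math
import random


def beta_log_pdf(x, a, b):
    """Log density of Beta(a, b) at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)


def moment_match(xs, weights):
    """Beta(a, b) whose mean and variance match the weighted sample."""
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, xs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, xs)) / total
    common = mean * (1 - mean) / var - 1  # assumes var > 0
    return mean * common, (1 - mean) * common


def fit_beta_mixture(xs, n_iter=50, eps=1e-6):
    """EM-style fit; returns (weight_hi, (a_lo, b_lo), (a_hi, b_hi))."""
    xs = [min(max(x, eps), 1 - eps) for x in xs]  # keep log() finite at 0 and 1
    cut = sorted(xs)[len(xs) // 2]
    resp = [1.0 if x > cut else 0.0 for x in xs]  # initial split at the median
    for _ in range(n_iter):
        # M-step (approximate): mixing weight and moment-matched components
        weight_hi = sum(resp) / len(xs)
        params_hi = moment_match(xs, resp)
        params_lo = moment_match(xs, [1 - r for r in resp])
        # E-step: posterior probability that each point is from the high component
        resp = []
        for x in xs:
            log_hi = math.log(weight_hi) + beta_log_pdf(x, *params_hi)
            log_lo = math.log(1 - weight_hi) + beta_log_pdf(x, *params_lo)
            top = max(log_hi, log_lo)
            e_hi, e_lo = math.exp(log_hi - top), math.exp(log_lo - top)
            resp.append(e_hi / (e_hi + e_lo))
    return weight_hi, params_lo, params_hi


def posterior_crossing(weight_hi, params_lo, params_hi, steps=1000):
    """First grid point between the component means where the high component wins."""
    mean_lo = params_lo[0] / sum(params_lo)
    mean_hi = params_hi[0] / sum(params_hi)
    for i in range(1, steps):
        x = mean_lo + (mean_hi - mean_lo) * i / steps
        log_hi = math.log(weight_hi) + beta_log_pdf(x, *params_hi)
        log_lo = math.log(1 - weight_hi) + beta_log_pdf(x, *params_lo)
        if log_hi >= log_lo:
            return x
    return None


# Synthetic two-regime sample (assumed data, for illustration only).
random.seed(0)
sample = ([random.betavariate(6, 6) for _ in range(400)]
          + [random.betavariate(60, 3) for _ in range(400)])
weight_hi, params_lo, params_hi = fit_beta_mixture(sample)
crossing = posterior_crossing(weight_hi, params_lo, params_hi)
```

The crossing point, where the two weighted component densities are equal, is the natural threshold implied by the fitted mixture. Unlike the KDE antimode, it depends on the parametric form, so agreement between the two is informative rather than automatic.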
+
+The present study combines all three families, using each to produce an independent threshold estimate and treating cross-method convergence---or principled divergence---as evidence of where in the analysis hierarchy the mixture structure is statistically supported.
+<!--
+REFERENCES for Related Work (see paper_a_references_v3.md for full list):
+[3]  Bromley et al. 1993 — Siamese TDNN (NeurIPS)
+[4]  Dey et al. 2017 — SigNet
+[5]  Hadjadj et al. 2020 — Single sample SV
+[6]  Li et al. 2024 — TransOSV
+[7]  Tehsin et al. 2024 — Triplet Siamese
+[8]  Brimoh & Olisah 2024 — Consensus threshold
+[9]  Woodruff et al. 2021 — AML signature pipeline
+[10] Abramova & Böhme 2016 — CMFD in scanned docs
+[11] Copy-move forgery detection survey — MTAP 2024
+[12] Jakhar & Borah 2025 — pHash + DL
+[13] Pizzi et al. 2022 — SSCD
+[14] Hafemann et al. 2017 — CNN features for SV
+[15] Zois et al. 2024 — SPD manifold SV
+[16] Hafemann et al. 2019 — Meta-learning for SV
+[17] Farid 2009 — Image forgery detection survey
+[18] Mehrjardi et al. 2023 — DL-based image forgery detection survey
+[19] Luo et al. 2025 — Perceptual hashing survey
+[20] Engin et al. 2020 — ResNet + cosine on real docs
+[21] Tsourounis et al. 2022 — Transfer from text to signatures
+[22] Chamakh & Bounouh 2025 — ResNet18 unified SV
+[23] Babenko et al. 2014 — Neural codes for image retrieval
+[28] Silverman 1986 — Density estimation
+[37] Hartigan & Hartigan 1985 — dip test of unimodality
+[38] Burgstahler & Dichev 1997 — earnings management discontinuity
+[39] McCrary 2008 — density discontinuity test
+[40] Dempster, Laird & Rubin 1977 — EM algorithm
+[41] White 1982 — quasi-MLE consistency
+-->