Abstract trim: 253 -> 245 words (within IEEE Access 250-word target)

Six minor edits to reduce word count: - 'a YOLOv11 detector localizes signatures' -> 'YOLOv11 localizes signatures' - 'filed in Taiwan over 2013-2023' -> 'Taiwan audit reports (2013-2023)' - 'statistical analysis is scoped to the Big-4 sub-corpus (437 CPAs, 150,442 signatures)' -> 'analysis is scoped to the Big-4 sub-corpus (437 CPAs; 150,442 signatures)' - 'Wilson 95% upper bound 1.45%' -> 'Wilson upper bound 1.45%' - 'cross-scope check (n = 686) preserves the K=3 + box-rule Spearman convergence with drift 0.007' -> 'check (n = 686) preserves the K=3 + box-rule Spearman convergence (drift 0.007)' All numerical anchors preserved. Phase 4 prose v2 now within IEEE Access 250-word abstract limit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> EOF
2026-05-12 23:57:01 +08:00
parent 10c82fd446
commit 918d55154a
1 changed files with 1 additions and 1 deletions
@@ -8,7 +8,7 @@
 > *IEEE Access target: <= 250 words, single paragraph.*
-Regulations require Certified Public Accountants (CPAs) to attest each audit report with a signature, but digitization makes reusing a stored signature image across reports — through administrative stamping or firm-level electronic signing — technically trivial and visually invisible, undermining individualized attestation. We build an end-to-end pipeline detecting such *non-hand-signed* signatures at scale: a Vision-Language Model identifies signature pages, a YOLOv11 detector localizes signatures, ResNet-50 supplies deep features, and a dual-descriptor layer combines cosine similarity with an independent-minimum perceptual hash (dHash) to separate *style consistency* from *image reproduction*. The operational output is an inherited five-way per-signature classifier with worst-case document-level aggregation. Applied to 90,282 audit reports filed in Taiwan over 2013–2023, the pipeline yields 182,328 signatures from 758 CPAs. Our primary statistical analysis is scoped to the Big-4 sub-corpus (437 CPAs, 150,442 signatures), where the accountant-level $(\overline{\text{cos}}, \overline{\text{dHash}})$ distribution exhibits dip-test multimodality ($p < 5 \times 10^{-4}$, both axes) — the smallest scope tested at which this holds. A 3-component mixture recovers a stable cross-firm hand-leaning component (weight 0.143; leave-one-firm-out drift $\leq 0.005$ cos / 0.96 dh / 0.023 weight) alongside a Firm-A-concentrated templated component. Three feature-derived scores — mixture posterior, reverse-anchor cosine percentile against a non-Big-4 reference, and the inherited box-rule hand-leaning rate; not statistically independent — agree on the per-CPA ranking at Spearman $\rho \geq 0.879$. Against 262 byte-identical Big-4 signatures, all three achieve 0% positive-anchor miss rate (Wilson 95% upper bound 1.45%). A light full-dataset cross-scope check ($n = 686$) preserves the K=3 + box-rule Spearman convergence with drift 0.007.
+Regulations require Certified Public Accountants (CPAs) to attest each audit report with a signature, but digitization makes reusing a stored signature image across reports — through administrative stamping or firm-level electronic signing — technically trivial and visually invisible, undermining individualized attestation. We build an end-to-end pipeline detecting such *non-hand-signed* signatures at scale: a Vision-Language Model identifies signature pages, YOLOv11 localizes signatures, ResNet-50 supplies deep features, and a dual-descriptor layer combines cosine similarity with an independent-minimum perceptual hash (dHash) to separate *style consistency* from *image reproduction*. The operational output is an inherited five-way per-signature classifier with worst-case document-level aggregation. Applied to 90,282 Taiwan audit reports (2013–2023), the pipeline yields 182,328 signatures from 758 CPAs. Our primary analysis is scoped to the Big-4 sub-corpus (437 CPAs; 150,442 signatures), where the accountant-level $(\overline{\text{cos}}, \overline{\text{dHash}})$ distribution exhibits dip-test multimodality ($p < 5 \times 10^{-4}$, both axes) — the smallest scope tested at which this holds. A 3-component mixture recovers a stable cross-firm hand-leaning component (weight 0.143; leave-one-firm-out drift $\leq 0.005$ cos / 0.96 dh / 0.023 weight) alongside a Firm-A-concentrated templated component. Three feature-derived scores — mixture posterior, reverse-anchor cosine percentile against a non-Big-4 reference, and the inherited box-rule hand-leaning rate; not statistically independent — agree on the per-CPA ranking at Spearman $\rho \geq 0.879$. Against 262 byte-identical Big-4 signatures, all three achieve 0% positive-anchor miss rate (Wilson upper bound 1.45%). A light full-dataset check ($n = 686$) preserves the K=3 + box-rule Spearman convergence (drift 0.007).
 ---