939a348da4
Paper draft includes all sections (Abstract through Conclusion), 36 references, and supporting scripts.

Key methodology: Cosine similarity + dHash dual-method verification with thresholds calibrated against known-replication firm (Firm A).

Includes:
- 8 section markdown files (paper_a_*.md)
- Ablation study script (ResNet-50 vs VGG-16 vs EfficientNet-B0)
- Recalibrated classification script (84,386 PDFs, 5-tier system)
- Figure generation and Word export scripts
- Citation renumbering script ([1]-[36])
- Signature analysis pipeline (12 steps)
- YOLO extraction scripts

Three rounds of AI review completed (GPT-5.4, Claude Opus 4.6, Gemini 3 Pro).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
22 lines
2.4 KiB
Markdown
# VI. Conclusion and Future Work

## Conclusion

We have presented an end-to-end AI pipeline for detecting digitally replicated signatures in financial audit reports at scale.
Applied to 90,282 audit reports from Taiwanese publicly listed companies spanning 2013--2023, our system extracted and analyzed 182,328 CPA signatures using a combination of VLM-based page identification, YOLO-based signature detection, deep feature extraction, and dual-method similarity verification.

Our key findings are threefold.
First, we argued that signature replication detection is a distinct problem from signature forgery detection, requiring different analytical tools focused on intra-signer similarity distributions.
Second, we showed that combining cosine similarity of deep features with difference hashing (dHash) is essential for meaningful classification.
Among 71,656 documents with high feature-level similarity, the structural verification layer revealed that only 41% exhibit converging replication evidence, while 7% show no structural corroboration despite near-identical features.
A single-metric approach would therefore conflate style consistency with digital duplication.
Third, we introduced a calibration methodology using a known-replication reference group whose distributional characteristics (dHash median = 5, 95th percentile = 15) directly informed the classification thresholds, achieving 96.9% capture of the calibration group.

An ablation study comparing three feature extraction backbones (ResNet-50, VGG-16, EfficientNet-B0) confirmed that ResNet-50 offers the best balance of discriminative power, classification stability, and computational efficiency for this task.
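
The dual-method check and the calibration step summarized in the second and third findings can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the function names, the tier labels, the cosine cut-off of 0.95, the nearest-rank percentile rule, and the toy reference distances are all assumptions made for illustration. The dHash comparison is assumed to be a Hamming distance between 64-bit hashes computed on grayscale grids already resized to 8 rows by 9 columns. The toy list is synthetic and was chosen only so that it reproduces the reported dHash median of 5 and 95th percentile of 15.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dhash(gray):
    """Difference hash of a grayscale grid (assumed pre-resized to 8 x 9).

    Each bit records whether a pixel is brighter than its right-hand neighbour,
    giving 8 bits per row and 64 bits in total.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list; q is an integer in 1..100."""
    ordered = sorted(values)
    rank = max(1, -(-q * len(ordered) // 100))  # integer ceiling of q*n/100
    return ordered[rank - 1]


def calibrate(reference_distances):
    """Return (median, 95th percentile) of the reference group's dHash distances."""
    return percentile(reference_distances, 50), percentile(reference_distances, 95)


def capture_rate(distances, cutoff):
    """Fraction of distances that fall at or below the cutoff."""
    return sum(d <= cutoff for d in distances) / len(distances)


def classify(cos_sim, dhash_dist, cos_min, strict, loose):
    """Hypothetical tiers: feature similarity first, then structural support."""
    if cos_sim < cos_min:
        return "below feature threshold"
    if dhash_dist <= strict:
        return "converging evidence"
    if dhash_dist <= loose:
        return "partial structural support"
    return "no structural corroboration"


# Synthetic reference distances, for illustration only (not data from the study).
reference = [0, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 30]
strict, loose = calibrate(reference)
print(strict, loose, capture_rate(reference, loose))
print(classify(0.99, 3, cos_min=0.95, strict=strict, loose=loose))
print(classify(0.99, 20, cos_min=0.95, strict=strict, loose=loose))
```

Using the reference-group median as the strict cut-off and the 95th percentile as the loose one is only one plausible reading of how the distribution informed the thresholds; the five-tier rules actually used may combine the two metrics differently.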

## Future Work

Several directions merit further investigation.
Domain-adapted feature extractors, trained or fine-tuned on signature-specific datasets, may improve discriminative performance beyond the transferred ImageNet features used in this study.
Temporal analysis of signature similarity trends---tracking how individual CPAs' similarity profiles evolve over years---could reveal transitions between genuine signing and digital replication practices.
The pipeline's applicability to other jurisdictions and document types (e.g., corporate filings in other countries, legal documents, medical records) warrants exploration.
Finally, integration with regulatory monitoring systems, together with small-scale ground-truth validation through expert review, would strengthen the case for practical deployment of this approach.
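
As a rough illustration of the temporal analysis direction mentioned above, the following Python sketch aggregates similarity scores per signer and year and reports the first year in which a signer's yearly mean crosses a cut-off from below. Everything in it is hypothetical: the record layout `(signer, year, similarity)`, the function names, the 0.95 cut-off, and the toy records are assumptions, and a real analysis would likely need a more robust change-point criterion than a single threshold crossing.

```python
from collections import defaultdict
from statistics import mean


def yearly_profile(records):
    """Mean similarity per signer and year from (signer, year, similarity) tuples."""
    buckets = defaultdict(list)
    for signer, year, similarity in records:
        buckets[(signer, year)].append(similarity)
    profile = defaultdict(dict)
    for (signer, year), values in buckets.items():
        profile[signer][year] = mean(values)
    return profile


def first_transition(yearly_means, threshold):
    """First year whose mean reaches the threshold after a year below it, else None."""
    previous = None
    for year in sorted(yearly_means):
        current = yearly_means[year]
        if previous is not None and previous < threshold <= current:
            return year
        previous = current
    return None


# Synthetic records, for illustration only (not data from the study).
records = [
    ("A", 2013, 0.80), ("A", 2013, 0.82), ("A", 2014, 0.81),
    ("A", 2015, 0.99), ("A", 2016, 0.98),
    ("B", 2013, 0.85), ("B", 2014, 0.86),
]
profile = yearly_profile(records)
for signer in sorted(profile):
    print(signer, first_transition(profile[signer], threshold=0.95))
```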