gbanyan 939a348da4 Add Paper A (IEEE TAI) complete draft with Firm A-calibrated dual-method classification

Paper draft includes all sections (Abstract through Conclusion), 36 references,
and supporting scripts. Key methodology: Cosine similarity + dHash dual-method
verification with thresholds calibrated against known-replication firm (Firm A).

Includes:
- 8 section markdown files (paper_a_*.md)
- Ablation study script (ResNet-50 vs VGG-16 vs EfficientNet-B0)
- Recalibrated classification script (84,386 PDFs, 5-tier system)
- Figure generation and Word export scripts
- Citation renumbering script ([1]-[36])
- Signature analysis pipeline (12 steps)
- YOLO extraction scripts

Three rounds of AI review completed (GPT-5.4, Claude Opus 4.6, Gemini 3 Pro).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-06 23:05:33 +08:00

Paper A: IEEE TAI Outline (Draft)

Target: IEEE Transactions on Artificial Intelligence (Regular Paper, ≤10 pages)
Review: Double-blind
Status: Outline; expand each section once the outline has been discussed and confirmed

Title (candidates)

1. "Automated Detection of Digitally Replicated Signatures in Large-Scale Financial Audit Reports"
2. "Are They Really Signing? A Deep Learning Pipeline for Detecting Signature Replication in 90K Audit Reports"
3. "Large-Scale Forensic Analysis of CPA Signature Authenticity Using Deep Features and Perceptual Hashing"

Prefer option 1 or 3, which read as more formally academic. Option 2 is catchier, but TAI may lean conservative.

Abstract (150-250 words)

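Key elements:

Problem: Audit reports require a handwritten signature, but in practice signatures may be digitally replicated (a stored signature image is reused)
Gap: No large-scale automated detection method currently exists
Method: VLM pre-screening → YOLO detection → ResNet-50 feature extraction → Cosine + pHash verification
Scale: 90,282 PDFs, 182,328 signatures, 758 CPAs, 2013-2023
Key finding: Using a firm known to replicate signatures as the calibration group, we establish a distribution-free threshold
Contribution: first large-scale study, end-to-end pipeline, empirical threshold validation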

Impact Statement (100-150 words)

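Direction (understandable to non-specialists):

A certified public accountant's signature on an audit report is an important safeguard of the credibility of financial reporting. If signatures are digitally copied and pasted rather than signed by hand each time, audit quality and investor protection are affected. This study develops an automated AI pipeline that analyzes more than 90,000 audit reports of Taiwanese listed companies spanning 10 years, extracting and comparing 180,000 signatures. By cross-validating deep learning features against perceptual hashes, we can distinguish handwritten signatures that are merely consistent in style from digitally replicated ones. We find that the signatures of some accounting firms show a degree of consistency that is statistically impossible to produce by hand. The method can be applied directly in the automated audit systems of financial regulators.

Note: this passage only sets the direction of the content (it was drafted in Chinese); write the final English version for submission.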

I. Introduction (~1.5 pages)

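Paragraph structure:

P1: Problem context

Legal significance of audit report signatures (Taiwanese regulations require a handwritten signature)
Vulnerability introduced by digitization: signatures in PDF reports are easy to copy and paste
Regulators cannot manually inspect every report

P2: Why this matters (motivation)

Audit quality → investor protection → trust in capital markets
Signature authenticity is a proxy indicator of audit independence
[REF: literature on audit quality]

P3: What exists (gap)

Existing signature verification research focuses on forgery detection
Our question is different: not "was this signed by the person named?" but "was each report signed by hand?"
Replication detection ≠ Forgery detection
No large-scale study on real financial reports exists

P4: What we do (contribution)

End-to-end pipeline: VLM → YOLO → ResNet → Cosine + pHash
Scale: 90K+ documents, 180K+ signatures, 10 years
Distribution-free threshold with known-replication calibration group
First study applying AI to audit signature authenticity at this scale

P5: Paper organization

One sentence per section

Contribution list (stated explicitly):

Pipeline: a complete end-to-end automated system for detecting signature authenticity
Scale: the largest analysis of audit report signatures to date (90K PDFs, 180K signatures)
Methodology: two-layer verification combining deep features (Cosine) with perceptual hashing (pHash), which resolves the distinction between consistent style and digital replication
Calibration: a firm known to replicate signatures serves as ground truth for calibration, yielding a distribution-free threshold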

II. Related Work

A. Offline Signature Verification

Siamese networks: Bromley et al. 1993, Dey et al. 2017 (SigNet)
CNN-based: Hadjadj et al. 2020 (single known sample)
Triplet Siamese: Mathematics 2024
Consensus threshold: arXiv:2401.03085
Positioning: these all address forgery detection (is the signature genuine?), whereas we address replication detection (was a stored signature image reused?)

B. Document Forensics & Copy-Move Detection

Copy-move forgery detection survey (MTAP 2024)
Image forensics in scanned documents
Positioning: these typically target image tampering, not the reuse of signatures

C. VLM & Object Detection in Document Analysis

Vision-Language Models for document understanding
YOLO variants in document element detection
Positioning: we use VLM + YOLO as the pipeline front end; not a core contribution, but it needs explaining

D. Perceptual Hashing for Image Comparison

pHash in near-duplicate detection
Complementarity with deep features

III. Methodology (~3 pages)

Condensed from methodology_draft_v1.md; focus on the core method and omit implementation details

A. Pipeline Overview

Figure 1: Full pipeline diagram (condensed version)
One-sentence description of each stage

B. Data Collection

90,282 PDFs from TWSE MOPS, 2013-2023
Table I: Dataset summary (condensed version)
CPA registry matching

C. Signature Detection

VLM pre-screening (Qwen2.5-VL): hit-and-stop strategy, 86,072 docs
YOLOv11n: 500 annotated → mAP50=0.99 → 182,328 signatures
Red stamp removal post-processing
Omit: full VLM prompt, annotation protocol details, validation details → move to a footnote or mention briefly
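
A minimal sketch of the hit-and-stop pre-screening idea, assuming it means scanning a document's pages in order and stopping at the first page the screener flags as containing a signature. The outline does not define the strategy in detail, so the scan order is an assumption, and `has_signature` is a hypothetical stand-in for the Qwen2.5-VL query.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Page = TypeVar("Page")


def first_signature_page(
    pages: Iterable[Page],
    has_signature: Callable[[Page], bool],
) -> Optional[int]:
    """Return the 0-based index of the first flagged page, or None if no page is flagged."""
    for index, page in enumerate(pages):
        if has_signature(page):
            return index  # stop here: later pages are never sent to the screener
    return None


# Toy usage: strings stand in for rendered page images (hypothetical data).
calls: List[str] = []


def fake_screener(page: str) -> bool:
    calls.append(page)  # record which pages were actually inspected
    return "signature" in page


hit = first_signature_page(["cover", "opinion + signature", "notes"], fake_screener)
```

The point of stopping early is cost: in the toy run the third page is never inspected.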

D. Feature Extraction

ResNet-50 (ImageNet1K_V2), no fine-tuning, 2048-dim, L2 normalized
Why no fine-tuning: similarity task, not classification; generalizability
CPA matching: 92.6% success rate
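
Because the features are L2-normalized, the cosine similarity between two signatures reduces to a dot product. A minimal sketch, using 3-dimensional toy vectors in place of the 2048-dimensional ResNet-50 embeddings (the vectors and function names are illustrative, not taken from the project scripts):

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(unit_a: Sequence[float], unit_b: Sequence[float]) -> float:
    """Cosine similarity of two unit-length vectors, i.e. their dot product."""
    return sum(x * y for x, y in zip(unit_a, unit_b))


# Toy embeddings (hypothetical values).
feat_a = l2_normalize([3.0, 4.0, 0.0])
feat_b = l2_normalize([4.0, 3.0, 0.0])
sim = cosine_similarity(feat_a, feat_b)
```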

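E. Dual-Method Verification (core)

Cosine similarity: captures style-level similarity (high-level)
pHash distance: captures perceptual-level similarity (structural)
Why this combination:
- High Cosine + low pHash distance = strong evidence (digital replication)
- High Cosine + high pHash distance = consistent style but not replication (handwritten)
- Complementarity resolves the ambiguity of a single metric
Why SSIM is excluded: sensitive to scanning noise; SSIM for known replication is only 0.70 (cover in a footnote)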
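
The combination logic above can be sketched as a small decision rule. The cutoffs are assumptions borrowed from figures that appear elsewhere in this outline (Cosine > 0.95 and pHash distance ≤ 5); they are not confirmed as the final thresholds, and the returned labels are hypothetical rather than the paper's verdict names.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of differing bits between two 64-bit perceptual hashes."""
    return bin((hash_a ^ hash_b) & 0xFFFFFFFFFFFFFFFF).count("1")


COSINE_HIGH = 0.95  # assumed cutoff for "high" cosine similarity
PHASH_CLOSE = 5     # assumed cutoff for "low" pHash distance


def combine_evidence(cosine: float, phash_distance: int) -> str:
    """Combine the two measures; labels are illustrative only."""
    if cosine > COSINE_HIGH and phash_distance <= PHASH_CLOSE:
        return "strong_evidence_of_replication"
    if cosine > COSINE_HIGH:
        return "consistent_style_not_replication"
    return "not_flagged_by_cosine"
```

Cosine acts as the gate here and pHash only refines pairs that pass it, which mirrors the three bullet points rather than any tiering used in the classification script.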

F. Threshold Selection

Distribution-free approach (non-normal → percentiles)
KDE crossover = 0.838
Intra/Inter class distributions (Table + Figure)
Calibration via known-replication firm (key contribution):
- Deloitte Taiwan: domain knowledge confirms that all signatures are replicated
- Cosine mean = 0.980, 1st percentile = 0.908
- pHash ≤5: 58.75%
- Serves as the anchor point for threshold calibration
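
A rough sketch of the two threshold ingredients named above: an empirical percentile of the calibration group's scores, and the crossover point of two kernel density estimates. The outline does not say how either was computed, so the nearest-rank percentile convention, the fixed Gaussian bandwidth, the grid search between the group means, and all sample values are assumptions.

```python
import math
from typing import Callable, Sequence


def percentile_nearest_rank(values: Sequence[float], pct: float) -> float:
    """Empirical percentile (nearest-rank); makes no distributional assumption."""
    if not values or not 0 < pct <= 100:
        raise ValueError("need non-empty values and 0 < pct <= 100")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a Gaussian kernel density estimate with a fixed bandwidth."""
    scale = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x: float) -> float:
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / scale

    return density


def kde_crossover(low_group: Sequence[float], high_group: Sequence[float],
                  bandwidth: float = 0.02, steps: int = 2000) -> float:
    """Grid-search between the two group means for the point where the densities meet."""
    f_low, f_high = gaussian_kde(low_group, bandwidth), gaussian_kde(high_group, bandwidth)
    lo = sum(low_group) / len(low_group)
    hi = sum(high_group) / len(high_group)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda x: abs(f_low(x) - f_high(x)))


# Toy scores (hypothetical): a calibration group and two comparison groups.
calibration = [0.91, 0.95, 0.97, 0.98, 0.99]
anchor = percentile_nearest_rank(calibration, 20)
crossover = kde_crossover([0.70, 0.72, 0.74], [0.90, 0.92, 0.94])
```

With a real calibration group, a low percentile of its cosine scores gives a floor that almost all known-replication signatures exceed, which is the role the 1st-percentile value plays as an anchor.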

Double-blind note: do not write "Deloitte"; use "Firm A (a Big-4 firm known to use digital replication)" instead

IV. Experiments and Results (~2.5 pages)

A. Experimental Setup

Hardware/software environment
Definitions of evaluation metrics

B. Signature Detection Performance

Table: YOLO metrics (Precision, Recall, mAP)
VLM-YOLO agreement rate: 98.8%

C. Distribution Analysis

Figure: Intra vs Inter cosine similarity distributions
Figure: pHash distance distributions (intra vs inter)
Table: Distributional statistics
Normality tests → justify percentile-based thresholds

D. Calibration Group Analysis (key section)

Cosine/pHash distributions of "Firm A" (known replication)
Comparison against the non-Big-4 distribution
KDE crossover (Firm A vs non-Big-4) = 0.969
Figure: Firm A distribution vs overall distribution
This is the most persuasive section

E. Classification Results

Table: Overall verdict distribution (definite_copy / likely_copy / uncertain / genuine)
Cross-method agreement analysis
Key finding: Cosine-high ≠ pixel-identical
- 71,656 PDFs with Cosine > 0.95
- Only 3.4% also have SSIM > 0.95
- Only 0.4% are pixel-identical

F. Ablation Study (new; strengthens the AI contribution)

Feature backbone comparison: ResNet-50 vs VGG-16 vs EfficientNet-B0
- Compare intra/inter class separation (Cohen's d)
- Compute cost vs discriminative power trade-off
Single method vs dual method:
- Cosine only vs pHash only vs Cosine + pHash
- Use Firm A as the positive set to compute precision/recall
Threshold sensitivity:
- How classification results change under different cosine thresholds
- ROC-like curve (with Firm A as the positive set)
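
Two of the ablation measurements can be sketched with the standard library: Cohen's d for intra/inter class separation, and a threshold sweep that reports the share of each group flagged. The pooled-standard-deviation form of Cohen's d is one common convention and is assumed here; function names and sample values are illustrative.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def flag_rates(positive_scores: Sequence[float], other_scores: Sequence[float],
               thresholds: Sequence[float]) -> List[Tuple[float, float, float]]:
    """For each threshold: (threshold, share of positives flagged, share of others flagged)."""
    rows = []
    for t in thresholds:
        pos_rate = sum(s > t for s in positive_scores) / len(positive_scores)
        other_rate = sum(s > t for s in other_scores) / len(other_scores)
        rows.append((t, pos_rate, other_rate))
    return rows


# Toy data (hypothetical values).
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
curve = flag_rates([0.99, 0.97, 0.93], [0.96, 0.85, 0.80, 0.70], [0.90, 0.95])
```

With Firm A as the positive set, the second column is recall. The third column is not a false-positive rate, because the remaining firms are unlabeled and may contain real replication; this is why the curve is only ROC-like.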

V. Discussion (~1 page)

A. Replication vs Forgery: A Distinction That Matters

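Our problem is inherently simpler and more direct
No need to consider the presence of an impostor
Physical impossibility argument: the same person signing by hand cannot produce pixel-identical signatures every time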

B. The Gap Between Style Similarity and Digital Replication

81.4% likely_copy (Cosine) vs 2.8% definite_copy (pixel-level)
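Interpretation: most CPA signatures are highly consistent in style but are not digital replicas
Possible causes: use of signature pads, a fixed signing environment
Policy implication: relying on Cosine alone would seriously overestimate the replication rate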

C. The Value of a Known-Replication Calibration Group

Importance of a ground truth anchor for threshold calibration
Generalizable to other document forensics problems

D. Limitations

Condensed limitations (3-4 points)
No labeled ground truth for full dataset
Feature extractor not fine-tuned
Scan quality variation over 10 years
Regulatory/legal definition of "replication" varies

VI. Conclusion and Future Work (~0.5 page)

Conclusion

Summarize the pipeline, the scale, and the key findings
Emphasize the necessity of the dual method (Cosine alone is not enough)
Methodological contribution of the calibration group

Future Work

Fine-tuned signature-specific feature extractor
Temporal analysis (year-over-year trends)
Cross-country generalization
Integration with regulatory monitoring systems
Small-scale ground truth validation (100-200 PDFs)

Figures & Tables Budget (allocation under the 10-page limit)

#	Type	Content	Est. space
Fig 1	Pipeline	Full pipeline diagram	1/3 page
Fig 2	Distribution	Intra vs Inter cosine KDE	1/3 page
Fig 3	Distribution	pHash distance intra vs inter	1/4 page
Fig 4	Calibration	Firm A vs overall distribution	1/3 page
Fig 5	Ablation	Backbone comparison / threshold sensitivity	1/3 page
Table I	Data	Dataset summary	1/4 page
Table II	Detection	YOLO performance	1/6 page
Table III	Statistics	Distribution stats + tests	1/4 page
Table IV	Results	Classification verdicts	1/4 page
Table V	Ablation	Feature backbone comparison	1/4 page

Total figures/tables: ~3 pages → Text: ~7 pages → Feasible for 10-page limit
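
As a quick check of the budget, the space estimates in the table can be summed exactly. The values below are copied from the table; the variable names are illustrative.

```python
from fractions import Fraction

# Estimated space per item, in pages, copied from the budget table.
figure_space = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 4), Fraction(1, 3), Fraction(1, 3)]
table_space = [Fraction(1, 4), Fraction(1, 6), Fraction(1, 4), Fraction(1, 4), Fraction(1, 4)]

total = sum(figure_space) + sum(table_space)  # pages used by figures and tables
text_pages = 10 - total                       # pages left for text under the 10-page limit
```

The exact total is 2.75 pages, which leaves 7.25 pages for text and is consistent with the rounded estimate above.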

To-Do Checklist

Analyses to add (Ablation Study)

ResNet-50 vs VGG-16 vs EfficientNet-B0 feature comparison
Single method vs dual method precision/recall (with Firm A as positive set)
Threshold sensitivity curve

Figures and tables to prepare

Fig 1: Pipeline diagram (clean vector version)
Fig 4: Firm A calibration distribution (new figure)
Fig 5: Ablation results (new figure)
Translate all figures and tables into English

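Writing

Impact Statement (English version)
Abstract (English version)
Introduction
Related Work (needs a further literature search)
Methodology (condensed from v1)
Results (to be written from scratch)
Discussion (to be written from scratch)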
Conclusion

Submission preparation

Anonymize (Deloitte → Firm A; remove all identifying information)
IEEE LaTeX template
Format references (IEEE numbered style)
Similarity index < 20%
