pdf_signature_extraction

gbanyan/pdf_signature_extraction

Commit Graph

Author: gbanyan

SHA1: 9392f30aef

Add script 41: §IV-K full-dataset robustness comparison (Light)

Light §IV-K secondary analysis per v4.0 author choice (codex
round-22 open question 1). Reruns the K=3 mixture + Paper A
operational-rule per-CPA hand_frac on the full accountant dataset
(n = 686) and compares to the Big-4 primary scope (n = 437).

Results:

  Component drift Big-4 -> Full:
    C1 hand-leaning  |dcos| = 0.018, |ddh| = 2.0, |dwt| = 0.14
    C2 mixed         |dcos| = 0.002, |ddh| = 0.3, |dwt| = 0.02
    C3 replicated    |dcos| = 0.000, |ddh| = 0.0, |dwt| = 0.12
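
The per-component drift figures above are absolute differences between the Big-4 fit and the full-dataset fit. A minimal sketch of that arithmetic follows. It reads `dcos` and `ddh` as shifts in two location coordinates and `dwt` as the shift in mixture weight, which is an interpretation of the abbreviations and not something the commit states. The `Component` class, the `component_drift` helper, and the numeric inputs are hypothetical toy values, not the fitted parameters from script 41.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One mixture component (field meanings are assumptions, see lead-in)."""
    cos: float     # location on the first axis (assumed: cosine similarity)
    dh: float      # location on the second axis (assumed meaning of "dh")
    weight: float  # mixture weight of the component


def component_drift(a: Component, b: Component) -> dict:
    """Absolute parameter differences between two fits of the same component."""
    return {
        "dcos": abs(a.cos - b.cos),
        "ddh": abs(a.dh - b.dh),
        "dwt": abs(a.weight - b.weight),
    }


# Toy inputs for illustration only; NOT the values fitted by script 41.
toy_big4 = Component(cos=0.90, dh=10.0, weight=0.25)
toy_full = Component(cos=0.88, dh=12.5, weight=0.40)

toy_drift = component_drift(toy_big4, toy_full)
print({name: round(value, 3) for name, value in toy_drift.items()})
```

Matching components by label (C1, C2, C3) before differencing matters here: a mixture fit can return components in a different order at each scope, so the comparison is only meaningful once the ordering has been fixed.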

  Spearman rho (P_C1 vs paperA_hand_frac):
    Big-4:        +0.9627
    Full dataset: +0.9558
    |drift| = 0.0069
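
The Spearman comparison above can be sketched with the standard library alone. The code below is an illustrative reimplementation, not the code in script 41: `average_ranks`, `pearson`, and `spearman_rho` are hypothetical helper names, and the toy score lists are made up. Only the two rho values at the end are taken from the results reported above.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-CPA scores for illustration only (not project data).
toy_p_c1 = [0.10, 0.40, 0.35, 0.80]
toy_hand_frac = [0.20, 0.50, 0.30, 0.90]
toy_rho = spearman_rho(toy_p_c1, toy_hand_frac)

# Drift between the two rho values reported in this commit message.
rho_big4, rho_full = 0.9627, 0.9558
rho_drift = round(abs(rho_big4 - rho_full), 4)
print(toy_rho, rho_drift)
```

Because Spearman rho depends only on rank order, a small drift between scopes indicates that the ranking of CPAs by P_C1 tracks the ranking by hand_frac about equally well at both scopes, even when component locations move.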

Reading: K=3 component ordering and Spearman convergence are
preserved at full scope, supporting the v4.0 reproducibility
claim. Component locations and weights shift modestly because
mid/small-firm composition broadens C1 (hand-leaning) and reduces
C3 weight; this is expected since mid/small firms include
hand-leaning CPAs that the Big-4-primary scope deliberately
excludes. Crossings and component locations are NOT operationally
interchangeable between scopes; §IV-K reports them only as a
robustness cross-check.

The five-way moderate-confidence band is NOT re-evaluated here
(Light scope); §IV-J flags it as inherited from v3.x calibration
without v4-specific recalibration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-12 16:32:39 +08:00

1 Commits