pdf_signature_extraction

8 Commits

Include renames

Author	SHA1	Message	Date
gbanyanandClaude Opus 4.7	6ba128ded4	Apply codex round-25 final polish: §III v6 + §IV v3.2 Codex round 25 returned Minor Revision: round-24's empirical and cross-reference issues mostly CLOSED. Remaining items were all partner-facing cosmetic / internal-notes hygiene. §III v6 polish: 1. §III:11 v5 changelog reprint of real firm names removed ("real firm names 'EY' and 'KPMG'" -> "real firm names/aliases") -- this was a self-regression I introduced in v5 while documenting the v5 anonymisation fix. 2. §III:14 empirical anchor range updated: "Scripts 32-40" -> "Scripts 32-42" (includes Scripts 41 + 42). 3. New v6 changelog entry added under the draft note documenting the round-25 fixes. 4. Draft note version stamp refreshed: v5 -> v6. §IV v3.2 polish: 1. §IV draft note rewritten and version label corrected: "Draft v3" -> "Draft v3.2"; "post codex rounds 21-23" -> "post codex rounds 21-25". The v3 -> v3.1 -> v3.2 lineage is now recorded. 2. §IV close-out checklist item 2 rewritten to remove residual "Tables IV-XVIII" wording. v3.2 explicitly states: v4 table sequence is Tables V-XVIII plus Table XV-B; no v4 Table IV is printed; the inherited v3.20.0 Table IV (per-firm detection counts) remains a v3.x reference only. Verification: - Strict-case grep for KPMG / Deloitte / PwC / EY (with word boundaries) + Chinese firm names: ZERO matches in either file. Anonymisation is now complete throughout the manuscript body AND internal notes. Round 25 closure post-polish: Major: all CLOSED (round 24 Major 1 table numbering: now fully explicit V-XVIII + XV-B with v4 Table IV absent; Major 4 anonymisation: §III:11 leak removed) Minor: all CLOSED (weight drift 0.023 confirmed across 4 sites; cos <= 0.837 confirmed across 2 sites; n=686 provenance row confirmed) Editorial: 1 still PARTIAL (internal draft notes + Phase 3 close-out checklist remain in the files but explicitly marked "internal -- remove before submission"; these are author working artefacts intentionally retained until submission packaging) Phase 4 readiness: technically Yes; the §III/§IV technical content is converged across 5 codex review rounds. Internal notes will be stripped at submission packaging time. Ready to proceed to Phase 4 (Abstract/Intro/Discussion/Conclusion prose). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 22:36:16 +08:00
gbanyanandClaude Opus 4.7	6d2eddb6e8	Apply codex round-24 final cleanup: §III v5 + §IV v3.1 Codex round 24 returned Minor Revision: 3 Major CLOSED + 3 Major PARTIAL + 4 Minor CLOSED + 2 Minor PARTIAL + 4 Editorial CLOSED + 1 Editorial OPEN. All 7 narrow residual fixes were §III-side (I applied §IV fixes thoroughly in v3 but didn't mirror them to §III v4). §III v5 fixes: 1. Anonymisation leak repaired: - "held-out-EY fold" -> "held-out-Firm-D fold" (L71) - "Firms B (KPMG) and D (EY)" -> "Firms B and D" (L99) 2. K=3 LOOO weight drift 0.025 -> 0.023 at three sites (L71, L115, L173 provenance table). Matches Script 37 max C1 weight deviation and §IV v3 line 139. 3. §III-K positive-anchor paragraph cross-ref repaired: "v3.x inter-CPA negative anchor (§III-J inherited; Table X)" -> "(§IV-I, inheriting v3.20.0 §IV-F.1 Table X)". 4. §III-L five-way Likely-hand-signed band made inclusive: "Cosine below the all-pairs KDE crossover threshold." -> "Cosine at or below the all-pairs KDE crossover threshold (cos <= 0.837)." Matches Script 42 and §IV:19. 5. Open question 1's pointer changed from current §IV-F (which is Convergent Internal-Consistency Checks) to v3.20.0 Tables IX/XI/XII/XII-B + §IV-J descriptive proportions. 6. Provenance table: new row for full-dataset n=686 citing Script 41 fulldataset_report.md. 7. Draft-note header refreshed: v3 -> v5; v4 -> v5 etc.; "internal -- remove before submission" tag added. §IV v3.1 fixes: - Close-out checklist L262 stale "codex round 23" wording updated to "rounds 21-24 and before partner Jimmy review". - Close-out item 4 "in this v2" stale wording -> "in this v3". - New item 5 added: internal author notes (this checklist + §III cross-reference index + both files' draft-note headers) are author working artefacts and should be moved/stripped before partner / submission packaging. Round 24 finding summary post-v5/v3.1: Major: 3 CLOSED, 3 -> CLOSED (anonymisation + cross-ref + table numbering note residuals) Minor: 4 CLOSED, 2 -> CLOSED (weight drift 0.025 -> 0.023; low-cosine inclusivity cos <= 0.837) Editorial: 4 CLOSED, 1 PARTIAL (draft notes remain visible but explicitly marked as internal-only "remove before submission") Phase 4 readiness: pending decision on whether to do one more codex verification round (round 25) before drafting Abstract / Intro / Discussion / Conclusion prose. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 22:26:14 +08:00
gbanyanandClaude Opus 4.7	ce33156238	Apply codex round-23 corrections: §IV v3 + §III v4 Codex round 23 returned Major Revision on §IV v2: 6 Major + 6 Minor + 5 Editorial findings. Codex confirmed the spike-script provenance is mostly sound -- no scripts needed rerunning -- so v3 applies presentation-level fixes only. Decisions baked in: - Anonymisation: maintain Firm A-D pseudonyms throughout the manuscript body; remove (Deloitte) / (KPMG) / (PwC) / (EY) parentheticals from all v4 §IV tables. - Table numbering: v4 tables use fresh V-XVIII (plus Table XV-B); inherited v3.x tables are cited only as "v3.20.0 Table N" with the original v3 number, NOT renumbered into the v4 sequence. §IV v3 changes: 1. Detection denominator rewritten: 86,072 VLM-positive / 12 corrupted / 86,071 YOLO-processed / 85,042 with-detections / 182,328 signatures (matches v3.x §IV-B exact wording). 2. All v4 table labels stripped of "(revised:" / "(NEW:" prefixes; replaced with clean "Table N. <descriptor>." form. 3. Real firm names removed from all tables: 4 replace_all edits. 4. Line 211 MC-ordering claim removed: MC occupancy is no longer described as "consistent with the §III-K Spearman convergence" because MC fraction is not monotone in per-CPA hand-leaning ranking. New language: descriptive only, with Firm D / Firm B ordering counterexample stated. 5. Line 184 81.70% vs 82.46% qualified as "qualitative alignment, not like-for-like consistency check" (different units: per-signature class vs per-CPA hard cluster). 6. Line 43 BD-transition "histogram-resolution artefacts" softened to "scope-dependent and not used operationally"; no specific bin-width artefact claim without sensitivity sweep evidence. 7. K=3 LOOO C1 weight drift corrected: 0.025 -> 0.023 (matches Script 37 max deviation 0.0235 / rounded 0.023). 8. Seed coverage in §IV-A updated: "Scripts 32-42" (was "Scripts 32-41", missed Script 42). 9. Low-cosine cutoff inclusivity: cos < 0.837 -> cos <= 0.837 (matches Script 42 rule definition). 10. "round-22 Light scope" process note removed from manuscript prose in §IV-K. 11. §IV-L ablation pointer corrected: v3.20.0 §IV-I (was §IV-H.3); v3.20.0 Table XVIII clarified as different from v4 Table XVIII. 12. Line 75 "Component recovery verified across Scripts 35, 37, 38" rewritten: "the full-fit baseline is reproduced in Scripts 35, 37, 38" with explicit note that Script 37 LOOO fold-specific components differ by design. 13. Line 110 grammar: "This convergent-checks evidence" -> "These convergence checks". 14. Draft note marked "internal -- remove before submission". §III v4 changes (cross-reference cleanup): 1. Line 13 cross-reference repaired: "§IV-D, §IV-F, §IV-G" (which are now accountant-level v4 analyses) replaced with accurate signature-level references (§IV-J for five-way counts; §IV-I for inherited inter-CPA FAR). 2. Line 23 cross-reference repaired: "all §IV results except §IV-K" replaced with explicit list of v4-new vs inherited sub-sections. 3. Line 109 cross-reference repaired: moderate-band capture- rate evidence cited as "v3.20.0 Tables IX, XI, XII, XII-B" (was "§IV-F", which is now Convergent Internal-Consistency Checks, not capture-rate). 4. Line 131 "without recalibration" claim narrowed: §III-K's convergent-checks evidence is now scoped to the binary high-confidence rule only; the moderate-confidence band, style-consistency band, and document-level aggregation are retained by reference to v3.20.0 calibration, not claimed as v4.0-validated. Outstanding open questions: 3 procedural items remain (§IV table numbering finalisation, §IV-A-C content audit, Phase 4 prose); no methodology blockers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 17:03:33 +08:00
gbanyanandClaude Opus 4.7	453f1d8768	Phase 3 close-out: Script 42 + §IV draft v2 (Table XV filled) Script 42 tabulates the §III-L five-way per-signature classifier output on the Big-4 sub-corpus (n=150,442 signatures classified) and aggregates to document-level (n=75,233 unique PDFs) under the worst-case rule. Per-signature five-way overall (Table XV): HC 74,593 49.58% high-confidence non-hand-signed MC 39,817 26.47% moderate-confidence non-hand-signed HSC 314 0.21% high style consistency UN 35,480 23.58% uncertain LH 238 0.16% likely hand-signed Per-firm five-way (% within firm): Firm A (Deloitte) HC 81.70%, MC 10.76%, UN 7.42% Firm B (KPMG) HC 34.56%, MC 35.88%, UN 29.09% Firm C (PwC) HC 23.75%, MC 41.44%, UN 34.21% Firm D (EY) HC 24.51%, MC 29.33%, UN 45.65% Document-level (Table XV-B, NEW): HC 46,857 62.28% MC 19,667 26.14% HSC 167 0.22% UN 8,524 11.33% LH 18 0.02% Total 75,233 unique Big-4 PDFs (single-firm 74,854; mixed-firm 379) §IV v2 changes vs v1: - Table XV populated with Script 42 counts - Table XV-B (NEW): document-level worst-case counts - Per-firm five-way breakdown (% within firm) added - Per-firm document-level breakdown added - Document-level paragraph in §IV-J updated to reference Table XV-B - Phase 3 close-out checklist: item 1 (Table XV TBD) and item 4 (document-level counts) marked RESOLVED; remaining items reduced from 5 to 3 (renumbering, content audit, codex open-questions) The per-firm pattern is consistent with the §III-K Spearman-and- cluster ordering: Firm A's signatures concentrate in HC (81.7%), the three non-Firm-A firms have markedly lower HC and substantially higher Uncertain rates (29-46%), with Firm D having the highest Uncertain rate of the Big-4 -- consistent with the reverse-anchor score (§III-K Score 2) ranking Firm D fractionally above Firm C in the hand-leaning direction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 16:45:22 +08:00
gbanyanandClaude Opus 4.7	165b3ab384	Add Phase 3 §IV draft v1 (Big-4 reframe + light §IV-K robustness) Section IV expands from 8 sub-sections in v3.20.0 to 12 sub-sections (A through L) to mirror the §III-G..L lineage. Sub-section structure: A Experimental Setup (inherited) B Signature Detection Performance (inherited) C All-Pairs Intra-vs-Inter Class Distribution (inherited; corpus-wide) D Big-4 Accountant-Level Distributional Characterisation (NEW) - Table V revised: Big-4 dip-test - Table VI revised: BD/McCrary diagnostic E Big-4 K=2 / K=3 Mixture Fits (NEW) - Table VII revised: K=2 components + bootstrap CIs - Table VIII revised: K=3 components F Convergent Internal-Consistency Checks (NEW) - Table IX revised: 3-score per-CPA Spearman - Table X revised: per-firm summary - Table XI revised: per-signature Cohen kappa G Leave-One-Firm-Out Reproducibility (NEW) - Table XII revised: K=2 LOOO across 4 folds - Table XIII revised: K=3 LOOO H Pixel-Identity Positive-Anchor Miss Rate - Table XIV revised: 0% miss rate, n=262 I Inter-CPA Negative-Anchor FAR (inherited from v3.x §IV-F.1) J Five-Way Per-Signature + Document-Level Classification - Table XV: per-signature category counts (TBD; close-out task) - Table XVI NEW: firm x K=3 cluster cross-tab K Full-Dataset Robustness (NEW; light scope per author choice) - Table XVII NEW: K=3 component comparison Big-4 vs full - Table XVIII NEW: Spearman drift \|0.0069\| L Feature Backbone Ablation (inherited from v3.x §IV-H.3) 5 close-out items flagged at end of draft: per-signature category counts on Big-4 subset (Table XV), table renumbering, §IV-A-C content audit, document-level worst-case aggregation counts on Big-4 subset, codex round-22 open questions resolved (moderate-band inherited; firm anonymisation maintained; table numbering set provisionally). Empirical anchors: Scripts 32-41 on this branch. Script 41 (committed in previous commit) supplies the §IV-K Light scope numbers; all other tables draw from Scripts 32-40 already on the branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 16:35:37 +08:00
gbanyanandClaude Opus 4.7	c8c7656513	Apply codex round-22 corrections to §III v3 (Minor -> ready) Codex gpt-5.5 round 22 returned Minor Revision after v2 closed 3/5 Major findings fully and 2/5 partially. Five narrow fixes applied for v3: 1. Per-firm ranking unanimity corrected (v2:93). The reverse- anchor score ranks Firm D fractionally higher than Firm C (-0.7125 vs -0.7672); only Scores 1 and 3 rank Firm C highest. The unanimity claim was wrong; v3 prose now says all three agree on Firm A as most replication-dominated and on the non-Firm-A Big-4 as more hand-leaning, with a modest disagreement on Firm C vs D ordering. 2. "Smallest scope" / "any single firm" overclaim narrowed (v2:21, v2:43). Script 32 only tested Firm A alone, big4_non_A pooled, and all_non_A pooled -- not Firms B, C, D individually. v3 explicitly says "comparison scopes tested in Script 32" and notes single-firm dip tests for B, C, D were not separately computed. 3. K=3 hard label vs posterior in Spearman correctly attributed (v2:143). Script 38 uses the K=3 posterior P(C1), not the hard label, in the internal-consistency Spearman correlations. v3 §III-L now correctly says the hard label is for the §IV cluster cross-tabulation while the posterior is the continuous Score 1 in §III-K. 4. Provenance source for n=150,442 corrected (v2:17, v2:152). Script 39 directly reports this count in its per-signature K=3 fit; Script 38's report does not. v3 cites Script 39 for this number. 5. "Max fold-to-fold deviation" wording made precise (v2:65, v2:107). The $0.028$ value is the max absolute deviation from the across-fold mean (Script 36 stability summary), not the pairwise across-fold range (which is $0.0376 = 0.9756 - 0.9380$). v3 reports both statistics with explicit definitions. Also: draft note at top now records v2 (round-21) and v3 (round-22) revision lineage. Cross-reference index and open- question block retained as author working checklist (will be removed before manuscript submission per codex e7). Outstanding open questions reduced to 3 (codex round-22 view): - Five-way moderate-confidence band: validate in Big-4 specifically (Phase 3 §IV-F work) or document as inherited from v3.x? - Firm anonymisation policy in §IV-V (procedural) - §IV table numbering (procedural; defer until §IV done) Phase 2 §III draft is now Minor-Revision-quality. Ready for Phase 3 (Results regeneration §IV). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 16:26:02 +08:00
gbanyanandClaude Opus 4.7	62a22ceb83	Revise §III v4.0 draft per codex round-21 review (Major Revision -> v2) Codex gpt-5.5 xhigh review of v1 draft returned Major Revision with 5 Major findings + 7 Minor + editorial nits. v2 addresses all of them. Key v2 changes: 1. Primary classifier declared: inherited v3.x five-way per-signature box rule. K=3 mixture is demoted to accountant-level descriptive characterisation (Script 35 / Script 38 footing), explicitly NOT used to assign signature- or document-level labels. 2. §III-J reframed as "Mixture Model and Accountant-Level Characterisation" (was "Mixture Model and Operational Threshold Derivation"). K=3 LOOO P2_PARTIAL verdict surfaced in prose including the "not predictively useful as an operational classifier" interpretation from the Script 37 verdict legend. 3. §III-K renamed "Convergent Internal-Consistency Checks" (was "Convergent Validation") with explicit caveat that the three scores share underlying features and are not statistically independent measurements. 4. §III-H reverse-anchor paragraph rewritten: the directional error in v1 (the non-Big-4 reference described as a "more- replicated-population baseline") is corrected -- the reference is in fact in the LESS-replicated regime relative to Big-4, and the score measures deviation in the hand-leaning direction. 5. Pixel-identity metric renamed from "FAR" to "positive-anchor miss rate" with explicit conservative-subset caveat ("near-tautological for the box rule because byte-identical => cosine ~1 / dHash ~0"). 6. §III-L title changed to "Signature- and Document-Level Classification" (was "Per-Document Classification") and reorganised so the per-signature five-way rule + document-level worst-case aggregation are both clearly under this section. 7. Empirical slips corrected: - K=2 LOOO comparison: now correctly says "5.6x the stability tolerance 0.005" rather than "5.6x the bootstrap CI half-width"; full-Big-4 bootstrap half-width 0.0015 cited separately. - all-non-Firm-A dip: now correctly (0.998, 0.907), not "p > 0.99". - BD/McCrary: now narrowed to Big-4 scope (Script 34 null), with Script 32 dHash transitions for non-Big-4 subsets noted but not used as operational thresholds. - Firm A byte-identical "50 partners of 180 registered, 35 cross-year" -- now explicitly inherited from v3.x §IV-F.1 / Script 28 / Appendix B; provenance row in the new table flags this as inherited, not v4-regenerated. - "mid/small-firm tail actively pulling" -> "the full-sample and Big-4-only calibrations differ" (causal language softened). - $\Delta\text{BIC}$ sign convention: explicit "lower BIC is preferred; BIC(K=3) - BIC(K=2) = -3.48". 8. Editorial nits applied: - "failure rate" -> "box-rule hand-leaning rate" - "boundary moves modestly" -> "membership remains composition-sensitive" - "calibration uncertainty band +/- 5-13 pp" -> "observed absolute differences of 1.8-12.8 pp, with Firm C exceeding the 5 pp viability bar" - "strongest single methodology-validation signal" -> "strongest internal-consistency signal" - "the same component structure recovers" -> "a broadly similar three-component ordering recovers" - Cross-reference index marked as author checklist (remove before submission). 9. New provenance table at end of §III mapping every numerical claim to (script, source, direct/derived/inherited). 10. Open questions reduced from 5 to 3 (codex resolved questions 2, 3, 4 with concrete answers); remaining 3 are forward-looking (5-way moderate band, pseudonym consistency, table numbering). Also commits: paper/codex_review_gpt55_v4_round1.md (codex review artifact, 143 lines). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 15:49:59 +08:00
gbanyanandClaude Opus 4.7	a06e9456e6	Add Phase 2 §III-G..L methodology rewrite (v4.0 draft) Single consolidated draft of Section III sub-sections G through L, replacing the v3.20.0 §III-G..L block with the Big-4 reframe. Sub-sections (note: G/H/I/J/K/L written together to keep cross- references coherent; user originally requested G/I/J/L only but H rewrite and new K were required for cohesion): G Unit of Analysis and Scope -- accountant unit defined; Big-4 scope justified by within-pool homogeneity, dip-test multimodality, LOOO feasibility. H Reference Populations -- Firm A pivots from "calibration anchor" to "templated-end case study"; non-Big-4 added as reverse-anchor reference. I Distributional Characterisation -- dip-test multimodality at Big-4 level (p < 1e-4 both axes); BD/McCrary null as honest density-smoothness diagnostic. J Mixture Model and Operational Threshold Derivation -- K=2 vs K=3 fits reported; K=3 selected with rationale deferred to §III-K LOOO evidence. K Convergent Validation (NEW in v4.0) -- three-lens Spearman convergence (rho >= 0.879); per-signature K=3 fit (kappa = 0.870 vs per-CPA); K=2 LOOO UNSTABLE / K=3 LOOO PARTIAL; pixel-identity FAR 0% on 262 ground-truth signatures. L Per-Document Classification -- inherits v3.x five-way box rule for continuity; K=3 alternative output documented. Includes: cross-reference index, script-to-section evidence map (linking each empirical claim to the spike Script 32-40 commit), and 5 open questions flagged at the end for partner / reviewer review of this draft. Output: paper/v4/paper_a_methodology_v4_section_iii.md (single file replacing the v3.20.0 §III-G..L block on this branch only; v3.20.0 paper/paper_a_methodology_v3.md left untouched). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 15:15:36 +08:00