From 68689c9f9b23d1286441f512aacb36f94686f150 Mon Sep 17 00:00:00 2001
From: gbanyan
Date: Mon, 20 Apr 2026 21:57:16 +0800
Subject: [PATCH] Correct Firm A framing: replication-dominated, not pure

Interview evidence from multiple Firm A accountants confirms that MOST
use replication (stamping / firm-level e-signing) but a MINORITY may
still hand-sign. Firm A is therefore a "replication-dominated"
population, not a "pure" one.

This framing is consistent with:
- 92.5% of Firm A signatures exceed cosine 0.95 (majority replication)
- The long left tail (~7%) captures the minority hand-signers, not scan
  noise or preprocessing artifacts
- Hartigan dip test: Firm A cosine unimodal long-tail (p=0.17)
- Accountant-level GMM: of 180 Firm A accountants, 139 cluster in C1
  (high-replication) and 32 in C2 (middle band = minority hand-signers)

Updates docstrings and report text in Scripts 15, 16, 18, 19 to match.
Partner v3's "near-universal non-hand-signing" language corrected.
Script 19 regenerated with the updated text.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 signature_analysis/15_hartigan_dip_test.py    | 11 +++++++++++
 .../16_bd_mccrary_discontinuity.py            |  6 ++++--
 signature_analysis/18_accountant_mixture.py   |  8 ++++++++
 .../19_pixel_identity_validation.py           | 18 ++++++++++++++----
 4 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/signature_analysis/15_hartigan_dip_test.py b/signature_analysis/15_hartigan_dip_test.py
index b86e55c..f747ef3 100644
--- a/signature_analysis/15_hartigan_dip_test.py
+++ b/signature_analysis/15_hartigan_dip_test.py
@@ -10,6 +10,17 @@ Purpose:
     Prior finding (2026-04-16): signature-level distribution is unimodal
     long-tail; the story is that bimodality only emerges at the accountant level.
 
+Firm A framing (2026-04-20, corrected):
+    Interviews with multiple Firm A accountants confirm that MOST use
+    replication (stamping / firm-level e-signing) but do NOT exclude a
+    minority of hand-signers. Firm A is therefore a "replication-dominated"
+    population, NOT a "pure" one. This framing is consistent with:
+      - 92.5% of Firm A signatures exceed cosine 0.95
+      - The long left tail (7.5% below 0.95) captures the minority
+        hand-signers, not scan noise
+      - Script 18: of 180 Firm A accountants, 139 cluster in C1
+        (high-replication) and 32 in C2 (middle band = minority hand-signers)
+
 Tests:
     1. Firm A (Deloitte) cosine max-similarity -> expected UNIMODAL
     2. Firm A (Deloitte) independent min dHash -> expected UNIMODAL
diff --git a/signature_analysis/16_bd_mccrary_discontinuity.py b/signature_analysis/16_bd_mccrary_discontinuity.py
index 26cf530..8a8f657 100644
--- a/signature_analysis/16_bd_mccrary_discontinuity.py
+++ b/signature_analysis/16_bd_mccrary_discontinuity.py
@@ -306,8 +306,10 @@ def main():
         '* Multiple candidate transitions are ranked by total |Z| magnitude',
         '  on both sides of the boundary; the strongest is reported.',
         '* Absence of a significant transition is itself informative: it',
-        '  is consistent with a single generative mechanism (e.g. Firm A',
-        '  which is near-universally non-hand-signed).',
+        '  is consistent with a single dominant generative mechanism (e.g.',
+        '  Firm A, a replication-dominated population per interviews with',
+        '  multiple Firm A accountants -- most use replication, a minority',
+        '  may hand-sign).',
     ]
     md_path = OUT / 'bd_mccrary_report.md'
     md_path.write_text('\n'.join(md), encoding='utf-8')
diff --git a/signature_analysis/18_accountant_mixture.py b/signature_analysis/18_accountant_mixture.py
index d0617ed..93f95d8 100644
--- a/signature_analysis/18_accountant_mixture.py
+++ b/signature_analysis/18_accountant_mixture.py
@@ -19,6 +19,14 @@ The script:
     4. For the 2-component fit derives the natural threshold (crossing of
        marginal densities in cosine-mean and dhash-mean).
 
+Firm A framing note (2026-04-20, corrected):
+    Interviews with Firm A accountants confirm MOST use replication but a
+    MINORITY may hand-sign. Firm A is thus a "replication-dominated"
+    population, NOT pure. Empirically: of ~180 Firm A accountants, ~139
+    land in C1 (high-replication) and ~32 land in C2 (middle band) under
+    the 3-component fit. The C2 Firm A members are the interview-suggested
+    minority hand-signers.
+
 Output:
     reports/accountant_mixture/accountant_mixture_report.md
     reports/accountant_mixture/accountant_mixture_results.json
diff --git a/signature_analysis/19_pixel_identity_validation.py b/signature_analysis/19_pixel_identity_validation.py
index 7090059..e748192 100644
--- a/signature_analysis/19_pixel_identity_validation.py
+++ b/signature_analysis/19_pixel_identity_validation.py
@@ -11,9 +11,15 @@ occurring reference populations instead of manual labels:
       => absolute ground truth for replication.
 
   Positive anchor 2: Firm A (Deloitte) signatures
-      Interview + visual evidence establishes near-universal non-hand-
-      signing across 2013-2023 (see memories 2026-04-08, 2026-04-14).
-      We treat Firm A as a strong prior positive.
+      Interview evidence from multiple Firm A accountants confirms that
+      MOST use replication (stamping / firm-level e-signing) but a
+      MINORITY may still hand-sign. Firm A is therefore a
+      "replication-dominated" population (not a pure one). We use it as
+      a strong prior positive for the majority regime, while noting that
+      ~7% of Firm A signatures fall below cosine 0.95 consistent with
+      the minority hand-signers. This matches the long left tail
+      observed in the dip test (Script 15) and the Firm A members who
+      land in C2 (middle band) of the accountant-level GMM (Script 18).
 
   Negative anchor: signatures with cosine <= low threshold
       Pairs with very low cosine similarity cannot plausibly be pixel
@@ -354,7 +360,11 @@ def main():
         f'({int(neg_mask.sum()):,} signatures). Treated as',
         '  confirmed not-replicated.',
         f'* **Firm A anchor:** Deloitte ({int(firm_a_mask.sum()):,} signatures),',
-        '  near-universally non-hand-signed per partner interviews.',
+        '  a replication-dominated population per interviews with multiple',
+        '  Firm A accountants: most use replication (stamping / firm-level',
+        '  e-signing), but a minority may still hand-sign. Used as a strong',
+        '  prior positive for the majority regime, with the ~7% below',
+        '  cosine 0.95 reflecting the minority hand-signers.',
         '',
         '## Equal Error Rate (EER)',
         '',