Correct Firm A framing: replication-dominated, not pure

Interview evidence from multiple Firm A accountants confirms that MOST
use replication (stamping / firm-level e-signing) but a MINORITY may
still hand-sign. Firm A is therefore a "replication-dominated"
population, not a "pure" one.

This framing is consistent with:
- 92.5% of Firm A signatures exceed cosine 0.95 (majority replication)
- The long left tail (~7%) captures the minority hand-signers, not scan
  noise or preprocessing artifacts
- Hartigan dip test: Firm A cosine unimodal long-tail (p=0.17)
- Accountant-level GMM: of 180 Firm A accountants, 139 cluster in C1
  (high-replication) and 32 in C2 (middle band = minority hand-signers)

Updates docstrings and report text in Scripts 15, 16, 18, 19 to match.
Partner v3's "near-universal non-hand-signing" language corrected.
Script 19 regenerated with the updated text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
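
The two signature-level figures in the message are complements of one threshold count: the share above cosine 0.95 and the left tail at or below it. A minimal sketch of that computation follows; the helper name `tail_share` and the toy values are hypothetical and do not come from the scripts.

```python
def tail_share(cosines, threshold=0.95):
    """Return (share above threshold, share at or below threshold)."""
    if not cosines:
        raise ValueError("need at least one similarity value")
    above = sum(1 for c in cosines if c > threshold)
    share_above = above / len(cosines)
    return share_above, 1.0 - share_above


# Toy values, invented for illustration only (not Firm A data).
toy = [0.99, 0.98, 0.97, 0.96, 0.99, 0.98, 0.97, 0.91]
above, tail = tail_share(toy)
print(f"above 0.95: {above:.1%}, at or below: {tail:.1%}")
```

Applied to the reported 92.5% above 0.95, the same arithmetic leaves 7.5% in the tail, which the message rounds to ~7%.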
@@ -10,6 +10,17 @@ Purpose:
 Prior finding (2026-04-16): signature-level distribution is unimodal long-tail;
 the story is that bimodality only emerges at the accountant level.
 
+Firm A framing (2026-04-20, corrected):
+Interviews with multiple Firm A accountants confirm that MOST use
+replication (stamping / firm-level e-signing) but do NOT exclude a
+minority of hand-signers. Firm A is therefore a "replication-dominated"
+population, NOT a "pure" one. This framing is consistent with:
+- 92.5% of Firm A signatures exceed cosine 0.95
+- The long left tail (7.5% below 0.95) captures the minority
+hand-signers, not scan noise
+- Script 18: of 180 Firm A accountants, 139 cluster in C1
+(high-replication) and 32 in C2 (middle band = minority hand-signers)
+
 Tests:
 1. Firm A (Deloitte) cosine max-similarity -> expected UNIMODAL
 2. Firm A (Deloitte) independent min dHash -> expected UNIMODAL
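
The accountant-level counts quoted in this docstring can be turned into shares to show what "replication-dominated" means numerically. This is plain arithmetic on the figures stated in the commit, not output from Script 18; the constant names are hypothetical, and the source does not say where the accountants outside C1 and C2 fall.

```python
FIRM_A_TOTAL = 180   # Firm A accountants, as stated in the commit
C1_COUNT = 139       # high-replication cluster
C2_COUNT = 32        # middle band

c1_share = C1_COUNT / FIRM_A_TOTAL
c2_share = C2_COUNT / FIRM_A_TOTAL
# Accountants in neither C1 nor C2; the source gives no breakdown for them.
other = FIRM_A_TOTAL - C1_COUNT - C2_COUNT

print(f"C1: {c1_share:.1%}, C2: {c2_share:.1%}, other: {other}")
```

Roughly three quarters of the accountants sit in C1, which matches a dominant replication regime with a visible minority rather than a pure population.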
@@ -306,8 +306,10 @@ def main():
 '* Multiple candidate transitions are ranked by total |Z| magnitude',
 ' on both sides of the boundary; the strongest is reported.',
 '* Absence of a significant transition is itself informative: it',
-' is consistent with a single generative mechanism (e.g. Firm A',
-' which is near-universally non-hand-signed).',
+' is consistent with a single dominant generative mechanism (e.g.',
+' Firm A, a replication-dominated population per interviews with',
+' multiple Firm A accountants -- most use replication, a minority',
+' may hand-sign).',
 ]
 md_path = OUT / 'bd_mccrary_report.md'
 md_path.write_text('\n'.join(md), encoding='utf-8')
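
The report text above says candidate transitions are ranked by total |Z| magnitude on both sides of the boundary. One way to read that rule is sketched below, assuming "total" means the sum of the two absolute Z values; the `Transition` type, its field names, and the toy numbers are hypothetical, and any significance filtering the script applies is not shown.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    boundary: float   # candidate threshold location
    z_left: float     # Z statistic for the bin just below the boundary
    z_right: float    # Z statistic for the bin just above the boundary


def strongest_transition(candidates):
    """Pick the candidate with the largest |z_left| + |z_right|, or None."""
    if not candidates:
        return None
    return max(candidates, key=lambda t: abs(t.z_left) + abs(t.z_right))


# Toy candidates, invented for illustration only.
toy = [
    Transition(0.90, -2.1, 2.4),
    Transition(0.95, -3.0, 3.5),
    Transition(0.97, -1.2, 0.8),
]
best = strongest_transition(toy)
print(best)
```

Returning `None` for an empty candidate list mirrors the "absence of a significant transition" case, which the report treats as informative rather than as a failure.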
@@ -19,6 +19,14 @@ The script:
 4. For the 2-component fit derives the natural threshold (crossing of
 marginal densities in cosine-mean and dhash-mean).
 
+Firm A framing note (2026-04-20, corrected):
+Interviews with Firm A accountants confirm MOST use replication but a
+MINORITY may hand-sign. Firm A is thus a "replication-dominated"
+population, NOT pure. Empirically: of ~180 Firm A accountants, ~139
+land in C1 (high-replication) and ~32 land in C2 (middle band) under
+the 3-component fit. The C2 Firm A members are the interview-suggested
+minority hand-signers.
+
 Output:
 reports/accountant_mixture/accountant_mixture_report.md
 reports/accountant_mixture/accountant_mixture_results.json
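
Step 4 of this docstring derives a threshold from the crossing of the marginal densities of a 2-component fit. A minimal one-dimensional sketch of that idea is given below, assuming Gaussian components and using bisection between the two means; the function names and toy parameters are hypothetical, and the script's own procedure may differ.

```python
import math


def weighted_pdf(x, weight, mean, std):
    """Density of one mixture component, scaled by its mixing weight."""
    z = (x - mean) / std
    return weight * math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def density_crossing(comp_lo, comp_hi, tol=1e-9):
    """Find x between the two means where the weighted densities are equal.

    Each component is a (weight, mean, std) tuple; comp_lo has the smaller mean.
    """
    lo, hi = comp_lo[1], comp_hi[1]
    if not lo < hi:
        raise ValueError("comp_lo must have the smaller mean")

    def diff(x):
        return weighted_pdf(x, *comp_lo) - weighted_pdf(x, *comp_hi)

    # Bisection needs the low component to dominate at its own mean
    # and the high component to dominate at its own mean.
    if diff(lo) <= 0 or diff(hi) >= 0:
        raise ValueError("no sign change between the means")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy parameters, invented for illustration only.
threshold = density_crossing((0.5, 0.80, 0.05), (0.5, 0.98, 0.05))
print(round(threshold, 4))
```

With equal weights and equal spreads the crossing sits at the midpoint of the means; unequal weights or spreads shift it toward the weaker or narrower component.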
@@ -11,9 +11,15 @@ occurring reference populations instead of manual labels:
 => absolute ground truth for replication.
 
 Positive anchor 2: Firm A (Deloitte) signatures
-Interview + visual evidence establishes near-universal non-hand-
-signing across 2013-2023 (see memories 2026-04-08, 2026-04-14).
-We treat Firm A as a strong prior positive.
+Interview evidence from multiple Firm A accountants confirms that
+MOST use replication (stamping / firm-level e-signing) but a
+MINORITY may still hand-sign. Firm A is therefore a
+"replication-dominated" population (not a pure one). We use it as
+a strong prior positive for the majority regime, while noting that
+~7% of Firm A signatures fall below cosine 0.95 consistent with
+the minority hand-signers. This matches the long left tail
+observed in the dip test (Script 15) and the Firm A members who
+land in C2 (middle band) of the accountant-level GMM (Script 18).
 
 Negative anchor: signatures with cosine <= low threshold
 Pairs with very low cosine similarity cannot plausibly be pixel
@@ -354,7 +360,11 @@ def main():
 f'({int(neg_mask.sum()):,} signatures). Treated as',
 ' confirmed not-replicated.',
 f'* **Firm A anchor:** Deloitte ({int(firm_a_mask.sum()):,} signatures),',
-' near-universally non-hand-signed per partner interviews.',
+' a replication-dominated population per interviews with multiple',
+' Firm A accountants: most use replication (stamping / firm-level',
+' e-signing), but a minority may still hand-sign. Used as a strong',
+' prior positive for the majority regime, with the ~7% below',
+' cosine 0.95 reflecting the minority hand-signers.',
 '',
 '## Equal Error Rate (EER)',
 '',
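
The report section that follows these lines is titled Equal Error Rate. For readers unfamiliar with the metric, here is a minimal sketch of an EER threshold sweep over a positive and a negative anchor set. The function name, the "score >= threshold means replicated" convention, and the toy scores are assumptions for illustration; the script's actual EER code is not part of this diff.

```python
def equal_error_rate(pos_scores, neg_scores):
    """Return (threshold, far, frr) where |FAR - FRR| is smallest.

    A score >= threshold is classified as replicated.
    FAR: share of negatives classified as replicated.
    FRR: share of positives classified as not replicated.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both anchor sets must be non-empty")
    best = None
    # Every observed score is tried as a candidate threshold.
    for t in sorted(set(pos_scores) | set(neg_scores)):
        far = sum(s >= t for s in neg_scores) / len(neg_scores)
        frr = sum(s < t for s in pos_scores) / len(pos_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, far, frr)
    _, threshold, far, frr = best
    return threshold, far, frr


# Toy scores, invented for illustration only.
pos = [0.99, 0.98, 0.97, 0.96, 0.90]
neg = [0.60, 0.70, 0.80, 0.85, 0.97]
threshold, far, frr = equal_error_rate(pos, neg)
print(threshold, far, frr)
```

Under the replication-dominated framing, some Firm A signatures in the positive anchor may be genuine hand-signatures, so part of the false rejection rate measured against that anchor may reflect the anchor's impurity rather than classifier error.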