Paper A v3.20.0: partner Jimmy 2026-04-27 review + DOCX rendering overhaul
Substantive content (addresses partner Jimmy's 2026-04-27 review of v3.19.1): Must-fix items (6/6): - §III-F SSIM/pixel rejection rewritten from first principles (design-level argument from luminance/contrast/structure local-window product, not the prior empirical 0.70 result) - Table VI restructured by population × method; added missing Firm A logit-Gaussian-2 0.999 row; KDE marked undefined (unimodal), BD/McCrary marked bin-unstable (Appendix A) - Tables IX / XI / §IV-F.3 dHash 5/8/15 inconsistency resolved: ≤8 demoted from "operational dual" to "calibration-fold-adjacent reference"; the actual classifier rule cos>0.95 AND dH≤15 = 92.46% added throughout - New Fig. 4 (yearly per-firm best-match cosine, 5 lines, 2013-2023, Firm A on top); script 30_yearly_big4_comparison.py - Tables XIV / XV extended with top-20% (94.8%) and top-30% (81.3%) brackets - §III-K reframed P7.5 from "round-number lower-tail boundary" to operating point; new Table XII-B (cosine-FAR-capture tradeoff at 5 thresholds: 0.9407 / 0.945 / 0.95 / 0.977 / 0.985) Nice-to-have items (3/3): - Table XII expanded to 6-cut classifier sensitivity grid (0.940-0.985) - Defensive parentheticals (84,386 vs 85,042; 30,226 vs 30,222) moved to table notes; cut "invite reviewer skepticism" and "non-load-bearing" Codex 3-pass verification cleanup: - Stale 0.973/0.977/0.979 references unified on canonical 0.977 (Firm A Beta-2 forced-fit crossing from beta_mixture_results.json) - dHash≤8 wording corrected to P95-adjacent (P95 = 9, ≤8 is the integer immediately below) instead of misleading "rounded down" - Table XII-B prose corrected: per-segment qualification of "non-Firm-A capture falls faster" (true on 0.95→0.977 segment but contracts on 0.977→0.985 segment); arithmetic now from exact counts Within-year analyses removed: - Within-year ranking robustness check (Class A) was added in nice-to-have pass but contradicts v3.14 A2-removal stance; removed from §IV-G.2 + the Appendix B provenance row - Within-CPA future-work disclosures (Class B) removed from Discussion limitation #5 and Conclusion future-work paragraph; subsequent limitations renumbered Sixth → Fifth, Seventh → Sixth DOCX rendering pipeline overhaul (paper/export_v3.py): Critical fix - every v3 DOCX since v3.0 was shipping WITHOUT TABLES: strip_comments() was wholesale-deleting HTML comments, but every numerical table is wrapped in <!-- TABLE X: ... -->, so the table body was deleted alongside the wrapper. Now unwraps TABLE comments (emit synthetic __TABLE_CAPTION__: marker + table body) while still stripping non-TABLE editorial comments. Result: 19 tables now render in the DOCX. Other rendering fixes: - LaTeX → Unicode conversion (50+ token replacements: Greek alphabet, ≤≥, ×·≈, →↔⇒, etc.); \frac/\sqrt linearisation; TeX brace tricks ({=}, {,}) - Math-context-scoped sub/superscript via PUA sentinels (/): no more underscore-eating in identifiers like signature_analysis - Display equations rendered via matplotlib mathtext to PNG (3 equations: cosine sim, mixture crossing, BD/McCrary Z statistic), embedded as numbered equation blocks (1), (2), (3); content-addressed cache at paper/equations/ (gitignored, regenerable) - Manual numbered/bulleted list rendering with hanging indent (replaces python-docx style="List Number" which silently drops the number prefix when no numbering definition is bound) - Markdown blockquote (> ...) defensively stripped - Pandoc footnote ([^name]) markers no longer leak (inlined at source) - Heading text cleaned of LaTeX residue + PUA sentinels - File paths in body text (signature_analysis/X.py, reports/Y.json) trimmed to "(reproduction artifact in Appendix B)" pointers New leak linter: paper/lint_paper_v3.py - two-pass markdown source + rendered DOCX leak detector; auto-runs at end of export_v3.py. Script changes: - 21_expanded_validation.py: added 0.9407, 0.977, 0.985 to canonical FAR threshold list so Table XII-B is reproducible from persisted JSON - 30_yearly_big4_comparison.py: NEW; generates Fig. 4 + per-firm yearly data (writes to reports/figures/ and reports/firm_yearly_comparison/) - 31_within_year_ranking_robustness.py: NEW; supports the within-year robustness check (no longer cited in paper but kept as repo-internal due-diligence artifact) Partner handoff DOCX shipped to ~/Downloads/Paper_A_IEEE_Access_Draft_v3.20.0_20260505.docx (536 KB: 19 tables + 4 figures + 3 equation images). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
+138
-56
@@ -102,17 +102,30 @@ The three diagnostics agree that per-signature similarity does not form a clean
|
||||
Table VI summarises the signature-level threshold-estimator outputs for cross-method comparison.
|
||||
|
||||
<!-- TABLE VI: Signature-Level Threshold-Estimator Summary
|
||||
| Method | Cosine | dHash |
|
||||
|--------|--------|-------|
|
||||
| All-pairs KDE crossover (Section IV-C) | 0.837 | — |
|
||||
| Beta-2 EM crossing (Firm A; forced fit, BIC prefers $K{=}3$) | 0.977 | — |
|
||||
| logit-GMM-2 crossing (Full sample; forced fit) | 0.980 | — |
|
||||
| BD/McCrary transition (diagnostic; bin-unstable, Appendix A) | 0.985 | 2.0 |
|
||||
| Firm A whole-sample P7.5 (operational anchor; Section III-K) | 0.95 | — |
|
||||
| Firm A whole-sample dHash_indep P75 (operational $\leq 5$ band lower edge) | — | 4 |
|
||||
| Firm A whole-sample dHash_indep ceiling (style-consistency boundary) | — | 15 |
|
||||
| Firm A calibration-fold cosine P5 (held-out validation; Section IV-F.2) | 0.9407 | — |
|
||||
| Firm A calibration-fold dHash_indep P95 (held-out validation) | — | 9 |
|
||||
| Population | Method | Cosine threshold | dHash threshold | Status |
|
||||
|------------|--------|------------------|-----------------|--------|
|
||||
| **Threshold estimators (signature-level distributional fits)** | | | | |
|
||||
| Firm A signature-level | KDE antimode + Hartigan dip (Section III-I.1) | undefined | — | unimodal at $\alpha=0.05$ ($p=0.169$); antimode not defined for unimodal data |
|
||||
| Firm A signature-level | Beta-2 EM crossing (Section III-I.2) | 0.977 | — | forced fit; BIC strongly prefers $K{=}3$ ($\Delta\text{BIC} = 381$) |
|
||||
| Firm A signature-level | logit-Gaussian-2 crossing (robustness check) | 0.999 | — | forced fit; sharply inconsistent with Beta-2 crossing—reflects parametric-form sensitivity |
|
||||
| Full-sample signature-l. | KDE antimode + Hartigan dip | (multiple modes) | — | multimodal ($p<0.001$); KDE crossover at full-sample is dominated by between-firm heterogeneity |
|
||||
| Full-sample signature-l. | Beta-2 EM crossing | no crossing | — | forced fit; component densities do not cross over $[0,1]$ under recovered parameters |
|
||||
| Full-sample signature-l. | logit-Gaussian-2 crossing | 0.980 | — | forced fit; BIC strongly prefers $K{=}3$ ($\Delta\text{BIC} = 10{,}175$) |
|
||||
| **Density-smoothness diagnostics (not threshold estimators)** | | | | |
|
||||
| Firm A signature-level | BD/McCrary candidate transition (Section III-I.3) | 0.985 (bin 0.005)| 2.0 (bin 1) | bin-unstable across $\{0.003, 0.005, 0.010, 0.015\}$ (Appendix A); transition lies *inside* the non-hand-signed mode |
|
||||
| Full-sample signature-l. | BD/McCrary candidate transition | 0.985 (bin 0.005) | 2.0 (bin 1) | bin-unstable across $\{0.003, 0.005, 0.010, 0.015\}$ (Appendix A) |
|
||||
| **Reference: between-class KDE (different unit of analysis)** | | | | |
|
||||
| All-pairs intra/inter (pair-level; Section IV-C) | KDE crossover | 0.837 | — | reference point for the Uncertain/Likely-hand-signed boundary in the operational classifier |
|
||||
| **Operational classifier anchors and percentile cross-references** | | | | |
|
||||
| Firm A whole-sample | P7.5 (operational anchor; Section III-K) | 0.95 | — | operational cosine cut for the five-way classifier |
|
||||
| Firm A whole-sample | dHash$_\text{indep}$ P75 | — | 4 | informs the $\leq 5$ high-confidence band edge in the classifier |
|
||||
| Firm A whole-sample | dHash$_\text{indep}$ style-consistency ceiling | — | 15 | operational $> 15$ style-consistency boundary |
|
||||
| Firm A calibration fold (70%) | cosine P5 (Section IV-F.2) | 0.9407 | — | calibration-fold cross-reference; held-out fold reports rates at this cut |
|
||||
| Firm A calibration fold (70%) | dHash$_\text{indep}$ P95 | — | 9 | calibration-fold cross-reference (Tables IX and XI report rates at the rounded $\leq 8$ cut for continuity) |
|
||||
|
||||
Read this table by *population × method*: each row reports one method applied to one population.
|
||||
The first three blocks (threshold estimators; density-smoothness diagnostics; between-class KDE) are *characterisation* outputs; the bottom block is the operational anchor set used by the classifier of Section III-K.
|
||||
The disagreement between Firm A Beta-2 (0.977) and Firm A logit-Gaussian-2 (0.999) is the parametric-form sensitivity referenced in the prose of Section IV-D.3; it cannot be resolved from the data because BIC rejects the underlying $K{=}2$ assumption itself.
|
||||
-->
|
||||
|
||||
Non-hand-signed replication quality is therefore best read as a continuous spectrum produced by firm-specific reproduction technologies (administrative stamping in early years, firm-level e-signing later) acting on a common stored exemplar.
|
||||
@@ -126,20 +139,30 @@ Table IX reports the proportion of Firm A signatures crossing each candidate thr
|
||||
<!-- TABLE IX: Firm A Whole-Sample Capture Rates (consistency check, NOT external validation)
|
||||
| Rule | Firm A rate | k / N |
|
||||
|------|-------------|-------|
|
||||
| cosine > 0.837 (all-pairs KDE crossover) | 99.93% | 60,408 / 60,448 |
|
||||
| cosine > 0.9407 (calibration-fold P5) | 95.15% | 57,518 / 60,448 |
|
||||
| cosine > 0.945 (calibration-fold P5 rounded) | 94.02% | 56,836 / 60,448 |
|
||||
| cosine > 0.95 | 92.51% | 55,922 / 60,448 |
|
||||
| dHash_indep ≤ 5 (whole-sample upper-tail of mode) | 84.20% | 50,897 / 60,448 |
|
||||
| dHash_indep ≤ 8 | 95.17% | 57,527 / 60,448 |
|
||||
| dHash_indep ≤ 15 (style-consistency boundary) | 99.83% | 60,348 / 60,448 |
|
||||
| cosine > 0.95 AND dHash_indep ≤ 8 (operational dual) | 89.95% | 54,370 / 60,448 |
|
||||
| **Cosine-only marginal rates** | | |
|
||||
| cosine > 0.837 (all-pairs KDE crossover) | 99.93% | 60,408 / 60,448 |
|
||||
| cosine > 0.9407 (calibration-fold P5) | 95.15% | 57,518 / 60,448 |
|
||||
| cosine > 0.945 (calibration-fold P5 rounded) | 94.02% | 56,836 / 60,448 |
|
||||
| cosine > 0.95 (operational; whole-sample Firm A P7.5) | 92.51% | 55,922 / 60,448 |
|
||||
| **dHash-only marginal rates** | | |
|
||||
| dHash_indep ≤ 5 (operational high-confidence cap) | 84.20% | 50,897 / 60,448 |
|
||||
| dHash_indep ≤ 8 (calibration-fold P95 rounded) | 95.17% | 57,527 / 60,448 |
|
||||
| dHash_indep ≤ 15 (operational style-consistency boundary) | 99.83% | 60,348 / 60,448 |
|
||||
| **Operational classifier dual rules (Section III-K)** | | |
|
||||
| cosine > 0.95 AND dHash_indep ≤ 5 (high-confidence non-hand-signed) | 81.70% | 49,389 / 60,448 |
|
||||
| cosine > 0.95 AND 5 < dHash_indep ≤ 15 (moderate-confidence) | 10.76% | 6,503 / 60,448 |
|
||||
| cosine > 0.95 AND dHash_indep ≤ 15 (combined non-hand-signed) | 92.46% | 55,892 / 60,448 |
|
||||
| **Calibration-fold-adjacent cross-reference (not the operational classifier rule)** | | |
|
||||
| cosine > 0.95 AND dHash_indep ≤ 8 | 89.95% | 54,370 / 60,448 |
|
||||
|
||||
All rates computed exactly from the full Firm A sample (N = 60,448 signatures); per-rule counts and codes are available in the supplementary materials.
|
||||
The two operational dHash cuts ($\leq 5$ for the high-confidence cap and $\leq 15$ for the style-consistency boundary) come from the classifier definition in Section III-K and are the rules used by the five-way classifier of Tables XII and XVII; the dHash $\leq 8$ row is *not* an operational classifier rule but a calibration-fold-adjacent reference (Section IV-F.2 calibration-fold dHash P95 = 9; we report the $\leq 8$ rate as the integer-valued threshold immediately below P95, included here so that Firm A capture in the calibration-fold-P95 neighbourhood can be read off the same table).
|
||||
-->
|
||||
|
||||
Table IX is a whole-sample consistency check rather than an external validation: the thresholds 0.95, dHash median, and dHash 95th percentile are themselves anchored to the whole-sample Firm A distribution described in Section III-K (the 70/30 calibration-fold thresholds of Table XI are separate and slightly different, e.g., calibration-fold cosine P5 = 0.9407 rather than the whole-sample heuristic 0.95).
|
||||
The dual rule cosine $> 0.95$ AND dHash $\leq 8$ captures 89.95% of Firm A, a value that is consistent with the dip-test-confirmed unimodal-long-tail shape of Firm A's per-signature cosine distribution (Section IV-D.1) and the 92.5% / 7.5% signature-level split (Section III-H).
|
||||
Table IX is a whole-sample consistency check rather than an external validation: the cosine cut $0.95$ and the operational dHash band edges ($\leq 5$ high-confidence cap and $\leq 15$ style-consistency boundary) are themselves anchored to the whole-sample Firm A distribution described in Section III-K (the 70/30 calibration-fold thresholds of Table XI are separate and slightly different, e.g., calibration-fold cosine P5 = 0.9407 rather than the whole-sample heuristic 0.95).
|
||||
The operational dual rule used by the five-way classifier of Section III-K---cosine $> 0.95$ AND $\text{dHash}_\text{indep} \leq 15$ (the union of the high-confidence and moderate-confidence non-hand-signed buckets)---captures 92.46% of Firm A; the high-confidence component alone (cosine $> 0.95$ AND $\text{dHash}_\text{indep} \leq 5$) captures 81.70%.
|
||||
For continuity with prior calibration-fold reporting (Section IV-F.2 reports the calibration-fold rate at the calibration-fold-P95-adjacent cut $\text{dHash}_\text{indep} \leq 8$), Table IX also lists the cosine $> 0.95$ AND $\text{dHash}_\text{indep} \leq 8$ rate of 89.95%; this is *not* the operational classifier rule but a cross-reference value.
|
||||
Both operational rates are consistent with the dip-test-confirmed unimodal-long-tail shape of Firm A's per-signature cosine distribution (Section IV-D.1) and the 92.5% / 7.5% signature-level split (Section III-H).
|
||||
Section IV-F.2 reports the corresponding rates on the 30% Firm A hold-out fold, which provides the external check these whole-sample rates cannot.
|
||||
|
||||
## F. Pixel-Identity, Inter-CPA, and Held-Out Firm A Validation
|
||||
@@ -149,7 +172,7 @@ We report three validation analyses corresponding to the anchors of Section III-
|
||||
### 1) Pixel-Identity Positive Anchor with Inter-CPA Negative Anchor
|
||||
|
||||
Of the 182,328 extracted signatures, 310 have a same-CPA nearest match that is byte-identical after crop and normalization (pixel-identical-to-closest = 1); these form the byte-identity positive anchor---a pair-level proof of image reuse that serves as conservative ground truth for non-hand-signed signatures, subject to the source-template edge case discussed in Section V-G.
|
||||
Within Firm A specifically, 145 of these byte-identical signatures are distributed across 50 distinct partners (of 180 registered Firm A partners), with 35 of the byte-identical pairs spanning different fiscal years; this Firm A decomposition is reproduced by `signature_analysis/28_byte_identity_decomposition.py` and reported in `reports/byte_identity_decomp/byte_identity_decomposition.json` (Appendix B).
|
||||
Within Firm A specifically, 145 of these byte-identical signatures are distributed across 50 distinct partners (of 180 registered Firm A partners), with 35 of the byte-identical pairs spanning different fiscal years; reproduction artifact for this Firm A decomposition is listed in Appendix B.
|
||||
As the gold-negative anchor we sample 50,000 i.i.d. random cross-CPA signature pairs from the full 168,755-signature matched corpus (inter-CPA cosine: mean $= 0.763$, $P_{95} = 0.886$, $P_{99} = 0.915$, max $= 0.992$).
|
||||
Because the positive and negative anchor populations are constructed from different sampling units (byte-identical same-CPA pairs vs random inter-CPA pairs), their relative prevalence in the combined anchor set is arbitrary, and precision / $F_1$ / recall therefore have no meaningful population interpretation.
|
||||
We accordingly report FAR with Wilson 95% confidence intervals against the large inter-CPA negative anchor in Table X.
|
||||
@@ -163,8 +186,8 @@ We do not report an Equal Error Rate: EER is meaningful only when the positive a
|
||||
| 0.900 | 0.0250 | [0.0237, 0.0264] |
|
||||
| 0.945 (calibration-fold P5 rounded) | 0.0008 | [0.0006, 0.0011] |
|
||||
| 0.950 (whole-sample Firm A P7.5; operational cut) | 0.0005 | [0.0003, 0.0007] |
|
||||
| 0.973 (signature-level Beta/KDE forced-fit reference) | 0.0002 | [0.0001, 0.0004] |
|
||||
| 0.979 (signature-level Beta-2 forced-fit crossing) | 0.0001 | [0.0001, 0.0003] |
|
||||
| 0.977 (Firm A Beta-2 forced-fit crossing; Section IV-D) | 0.00014 | [0.00007, 0.00029] |
|
||||
| 0.985 (BD/McCrary candidate transition; Appendix A) | 0.00004 | [0.00001, 0.00015] |
|
||||
|
||||
Table note: We do not include FRR against the byte-identical positive anchor as a column here: the byte-identical subset has cosine $\approx 1$ by construction, so FRR against that subset is trivially $0$ at every threshold below $1$ and carries no biometric information beyond verifying that the threshold does not exceed $1$. The conservative-subset FRR role of the byte-identical anchor is instead discussed qualitatively in Section V-F.
|
||||
-->
|
||||
@@ -193,17 +216,17 @@ Table XI reports both calibration-fold and held-out-fold capture rates with Wils
|
||||
| dHash_indep ≤ 8 | 94.84% [94.63%, 95.04%] | 96.13% [95.82%, 96.43%] | -6.45 | <0.001 | 42,788/45,116 | 14,739/15,332 |
|
||||
| dHash_indep ≤ 9 (calib-fold P95) | 96.65% [96.48%, 96.81%] | 97.48% [97.22%, 97.71%] | -5.07 | <0.001 | 43,604/45,116 | 14,945/15,332 |
|
||||
| dHash_indep ≤ 15 | 99.83% [99.79%, 99.87%] | 99.84% [99.77%, 99.89%] | -0.31 | 0.754 n.s. | 45,040/45,116 | 15,308/15,332 |
|
||||
| cosine > 0.95 AND dHash_indep ≤ 8 | 89.40% [89.12%, 89.68%] | 91.54% [91.09%, 91.97%] | -7.60 | <0.001 | 40,335/45,116 | 14,035/15,332 |
|
||||
| cosine > 0.95 AND dHash_indep ≤ 8 (calibration-fold P95-adjacent reference; P95 = 9) | 89.40% [89.12%, 89.68%] | 91.54% [91.09%, 91.97%] | -7.60 | <0.001 | 40,335/45,116 | 14,035/15,332 |
|
||||
| cosine > 0.95 AND dHash_indep ≤ 15 (operational classifier rule, Section III-K) | 92.09% [91.84%, 92.34%] | 93.56% [93.16%, 93.93%] | -5.93 | <0.001 | 41,548/45,116 | 14,344/15,332 |
|
||||
|
||||
Calibration-fold thresholds: Firm A cosine median = 0.9862, P1 = 0.9067, P5 = 0.9407; dHash_indep median = 2, P95 = 9. Counts and z/p values are reproducible from the supplementary materials (fixed random seed).
|
||||
-->
|
||||
|
||||
Table XI reports both calibration-fold and held-out-fold capture rates with Wilson 95% CIs and a two-proportion $z$-test.
|
||||
We report fold-versus-fold comparisons rather than fold-versus-whole-sample comparisons, because the whole-sample rate is a weighted average of the two folds and therefore cannot, in general, fall inside the Wilson CI of either fold when the folds differ in rate; the correct generalization reference is the calibration fold, which produced the thresholds.
|
||||
|
||||
Under this proper test the two extreme rules agree across folds (cosine $> 0.837$ and $\text{dHash}_\text{indep} \leq 15$; both $p > 0.7$).
|
||||
The operationally relevant rules in the 85–95% capture band differ between folds by 1–5 percentage points ($p < 0.001$ given the $n \approx 45\text{k}/15\text{k}$ fold sizes).
|
||||
Both folds nevertheless sit in the same replication-dominated regime: every calibration-fold rate in the 85–99% range has a held-out counterpart in the 87–99% range, and the operational dual rule cosine $> 0.95$ AND $\text{dHash}_\text{indep} \leq 8$ captures 89.40% of the calibration fold and 91.54% of the held-out fold.
|
||||
Both folds nevertheless sit in the same replication-dominated regime: every calibration-fold rate in the 85–99% range has a held-out counterpart in the 87–99% range, and the calibration-fold-adjacent reference rule cosine $> 0.95$ AND $\text{dHash}_\text{indep} \leq 8$ (the integer cut immediately below the calibration-fold dHash P95 of 9) captures 89.40% of the calibration fold and 91.54% of the held-out fold; the operational classifier rule cosine $> 0.95$ AND $\text{dHash}_\text{indep} \leq 15$ used by the five-way classifier of Section III-K captures still higher rates in both folds (calibration 92.09%, 41,548 / 45,116; held-out 93.56%, 14,344 / 15,332).
|
||||
The modest fold gap is consistent with within-Firm-A heterogeneity in replication intensity: the random 30% CPA sample evidently contained proportionally more high-replication CPAs.
|
||||
We therefore interpret the held-out fold as confirming the qualitative finding (Firm A is strongly replication-dominated across both folds) while cautioning that exact rates carry fold-level sampling noise that a single 30% split cannot eliminate; the threshold-independent partner-ranking analysis (Section IV-G.2) is the cross-check that is robust to this fold variance.
|
||||
|
||||
@@ -214,25 +237,79 @@ We report a sensitivity check in which this round-number cut is replaced by the
|
||||
Table XII reports the five-way classifier output under each cut.
|
||||
|
||||
<!-- TABLE XII: Classifier Sensitivity to the Operational Cosine Cut (All-Sample Five-Way Output, N = 168,740 signatures)
|
||||
| Category | cos > 0.95 count (%) | cos > 0.945 count (%) | Δ count |
|
||||
|--------------------------------------------|----------------------|-----------------------|---------|
|
||||
| High-confidence non-hand-signed | 76,984 (45.62%) | 79,278 (46.98%) | +2,294 |
|
||||
| Moderate-confidence non-hand-signed | 43,906 (26.02%) | 50,001 (29.63%) | +6,095 |
|
||||
| High style consistency | 546 ( 0.32%) | 665 ( 0.39%) | +119 |
|
||||
| Uncertain | 46,768 (27.72%) | 38,260 (22.67%) | -8,508 |
|
||||
| Likely hand-signed | 536 ( 0.32%) | 536 ( 0.32%) | +0 |
|
||||
| Cosine cut | High-confidence | Moderate-confidence | High style consistency | Uncertain | Likely hand-signed |
|
||||
|------------|-----------------|---------------------|------------------------|-----------|--------------------|
|
||||
| cos > 0.940 | 81,069 (48.04%) | 55,308 (32.78%) | 801 (0.47%) | 31,026 (18.39%) | 536 (0.32%) |
|
||||
| cos > 0.945 | 79,278 (46.98%) | 50,001 (29.63%) | 665 (0.39%) | 38,260 (22.67%) | 536 (0.32%) |
|
||||
| cos > 0.950 (operational) | 76,984 (45.62%) | 43,906 (26.02%) | 546 (0.32%) | 46,768 (27.72%) | 536 (0.32%) |
|
||||
| cos > 0.960 | 70,250 (41.63%) | 29,450 (17.45%) | 288 (0.17%) | 68,216 (40.43%) | 536 (0.32%) |
|
||||
| cos > 0.970 | 60,247 (35.70%) | 14,865 ( 8.81%) | 117 (0.07%) | 92,975 (55.10%) | 536 (0.32%) |
|
||||
| cos > 0.985 | 37,368 (22.15%) | 2,231 ( 1.32%) | 10 (0.01%) | 128,595 (76.21%) | 536 (0.32%) |
|
||||
|
||||
The dHash band edges ($\leq 5$ for high-confidence, $5 < \text{dHash}_\text{indep} \leq 15$ for moderate-confidence, $> 15$ for style) are held fixed across the grid; only the cosine cut varies. The Likely-hand-signed count is invariant across the grid because it depends only on the all-pairs KDE crossover cosine $= 0.837$.
|
||||
-->
|
||||
|
||||
At the aggregate firm-level, the operational dual rule cos $> 0.95$ AND $\text{dHash}_\text{indep} \leq 8$ captures 89.95% of whole Firm A under the 0.95 cut and 91.14% under the 0.945 cut---a shift of 1.19 percentage points.
|
||||
At the aggregate firm-level, the calibration-fold-adjacent reference dual rule cos $> 0.95$ AND $\text{dHash}_\text{indep} \leq 8$ captures 89.95% of whole Firm A under the 0.95 cut and 91.14% under the 0.945 cut---a shift of 1.19 percentage points.
|
||||
The operational classifier rule cos $> 0.95$ AND $\text{dHash}_\text{indep} \leq 15$ used by the five-way classifier of Section III-K captures 92.46% under the 0.95 cut and 93.97% under the 0.945 cut---a shift of 1.51 percentage points.
|
||||
|
||||
Reading the wider grid in Table XII: the High-confidence and Moderate-confidence shares shift by less than 5 percentage points across the 0.940-0.950 neighbourhood, while pushing the cosine cut to 0.970 or 0.985 produces qualitatively different classifier behaviour (Moderate-confidence collapses from 26.02% at $0.95$ to 8.81% at $0.97$ and 1.32% at $0.985$, with the displaced mass landing in Uncertain rather than reclassifying out of the corpus).
|
||||
The classifier output is therefore robust to small (~0.005-cosine) perturbations of the operational cut but not to wholesale reanchoring at the threshold-estimator outputs of Section IV-D, which is consistent with our reading that those outputs are not classifier thresholds.
|
||||
At the per-signature categorization level, replacing 0.95 by 0.945 reclassifies 8,508 signatures (5.04% of the corpus) out of the Uncertain band; 6,095 of them migrate to Moderate-confidence non-hand-signed, 2,294 to High-confidence non-hand-signed, and 119 to High style consistency.
|
||||
The Likely-hand-signed category is unaffected because it depends only on the fixed all-pairs KDE crossover cosine $= 0.837$.
|
||||
The High-confidence non-hand-signed share grows from 45.62% to 46.98%.
|
||||
|
||||
We interpret this sensitivity pattern as indicating that the classifier's aggregate and high-confidence output is robust to the choice of operational cut within a 0.005-cosine neighbourhood of the Firm A P7.5 anchor, and that the movement is concentrated at the Uncertain/Moderate-confidence boundary.
|
||||
The paper therefore retains cos $> 0.95$ as the primary operational cut for transparency (round-number P7.5 of the whole-sample Firm A reference distribution) and reports the 0.945 results as a sensitivity check rather than as a deployed alternative.
|
||||
|
||||
To make the operating-point selection (Section III-K) auditable rather than presented as a single fixed value, Table XII-B reports the capture-vs-FAR tradeoff over the candidate threshold grid spanning the calibration-fold P5 (0.9407), its rounded value (0.945), the operational anchor (0.95), the Firm A Beta-2 forced-fit crossing from Section IV-D.3 (0.977), and the BD/McCrary candidate transition from Section IV-D.2 (0.985).
|
||||
For each grid point we report Firm A capture (under both the cosine-only marginal and the operational dual rule cos $> t$ AND $\text{dHash}_\text{indep} \leq 15$ used by the five-way classifier of Section III-K), non-Firm-A capture (the cosine-only marginal in the 108,292 non-Firm-A matched signatures), and inter-CPA FAR with Wilson 95% CI against the 50,000-pair anchor of Section IV-F.1.
|
||||
|
||||
<!-- TABLE XII-B: Cosine-Threshold Tradeoff: Capture vs Inter-CPA FAR
|
||||
| Cosine cut t | Firm A capture (cos > t) | Firm A capture (cos > t AND dHash_indep ≤ 15) | Non-Firm-A capture (cos > t) | Inter-CPA FAR | Inter-CPA FAR Wilson 95% CI |
|
||||
|--------------|--------------------------|------------------------------------------------|------------------------------|---------------|------------------------------|
|
||||
| 0.9407 (calibration-fold P5) | 95.15% (57,518/60,448) | 95.09% (57,482/60,448) | 72.68% (78,710/108,292) | 0.00126 | [0.00099, 0.00161] |
|
||||
| 0.945 (calibration-fold P5 rounded) | 94.02% (56,836/60,448) | 93.97% (56,804/60,448) | 67.51% (73,108/108,292) | 0.00082 | [0.00061, 0.00111] |
|
||||
| 0.95 (whole-sample Firm A P7.5; **operational cut**) | **92.51%** (55,922/60,448) | **92.46%** (55,892/60,448) | 60.50% (65,514/108,292) | **0.00050** | [0.00034, 0.00074] |
|
||||
| 0.977 (Firm A Beta-2 forced-fit crossing) | 74.53% (45,050/60,448) | 74.51% (45,038/60,448) | 13.14% (14,233/108,292) | 0.00014 | [0.00007, 0.00029] |
|
||||
| 0.985 (BD/McCrary candidate transition) | 55.27% (33,409/60,448) | 55.26% (33,406/60,448) | 5.73% (6,200/108,292) | 0.00004 | [0.00001, 0.00015] |
|
||||
|
||||
Inter-CPA FAR computed against 50,000 i.i.d. inter-CPA pairs (random seed 42, reproducing the anchor of Section IV-F.1 / Table X). Capture and FAR percentages are exact ratios of the displayed integer counts; gap arithmetic in the surrounding prose is computed from those exact counts and rounded to two decimal places. The dual-rule column is the operational classifier rule of Section III-K; for cuts above the dHash-15 saturation point (Firm A dHash$_\text{indep}$ $> 15$ rate is only 0.17%, Table IX), the dual-rule and cosine-only columns coincide to within the dHash$_\text{indep}$ $> 15$ residual.
|
||||
-->
|
||||
|
||||
Reading Table XII-B, three patterns motivate the choice of $0.95$ as the operating point.
|
||||
First, *Firm A capture* on the operational dual rule decays smoothly from 95.09% at $t = 0.9407$ to 55.26% at $t = 0.985$.
|
||||
Relaxing the cut from $0.95$ to $0.945$ buys 1.51 percentage points of additional Firm A capture, and to $0.9407$ buys 2.63 percentage points; tightening from $0.95$ to $0.977$ costs 17.96 percentage points and to $0.985$ costs 37.20 percentage points.
|
||||
The selected cut at $0.95$ is the strictest cut on this grid at which Firm A capture remains above $90\%$ on the operational dual rule.
|
||||
Second, *inter-CPA FAR* is small in absolute terms across the entire candidate grid ($0.00126$ at $0.9407$, falling to $0.00004$ at $0.985$): under any of these operating points the classifier's specificity against random cross-CPA pairs is in the per-mille range or better, so FAR alone does not determine the choice.
|
||||
The marginal FAR cost of relaxing from $0.95$ to $0.945$ is $+0.00032$ ($25 \to 41$ false positives per 50,000 pairs) and to $0.9407$ is $+0.00076$ ($25 \to 63$); the marginal FAR savings from tightening to $0.977$ and $0.985$ are $-0.00036$ and $-0.00046$ respectively.
|
||||
The FAR savings from going stricter are small in absolute terms compared with the corresponding Firm A capture loss, which makes $0.95$ a balanced operating point on this grid rather than a uniquely optimal one.
|
||||
Third, *non-Firm-A capture* (the cosine-only marginal in the 108,292 non-Firm-A signatures) decays from 67.51% at $0.945$ to 60.50% at $0.95$, 13.14% at $0.977$, and 5.73% at $0.985$.
|
||||
The Firm-A-minus-non-Firm-A gap widens with strictness through $0.977$ and then contracts (22.41 percentage points at $0.9407$; 26.46 at $0.945$; 31.97 at $0.95$; 61.36 at $0.977$; 49.54 at $0.985$): on the $0.95 \to 0.977$ segment non-Firm-A capture falls faster than Firm A capture in absolute terms ($-47.35$ vs $-17.96$ percentage points), so the widening is dominated by non-Firm-A removal rather than by an intrinsic property of Firm A; on the $0.977 \to 0.985$ segment Firm A capture falls faster than non-Firm-A's already-low residual, so the gap contracts.
|
||||
We do *not* read the gap pattern as evidence for a particular cut; it is reported here as cross-firm replication heterogeneity rather than as a selection criterion.
|
||||
The operating point at $0.95$ is therefore a defensible---not unique---selection in this neighbourhood, motivated by (i) keeping Firm A capture above $90\%$ on the operational dual rule, (ii) achieving an FAR of $0.0005$ at which marginal further savings from tightening are small relative to the corresponding capture loss, and (iii) preserving the interpretive transparency of the whole-sample Firm A P7.5 reading.
|
||||
It is *not* derived from the threshold-estimator outputs of Section IV-D, which the data do not support as classifier thresholds.
|
||||
|
||||
The paper therefore retains cos $> 0.95$ as the primary operational cut and reports the 0.945 result of Table XII as a sensitivity check rather than as a deployed alternative; downstream document-level rates (Table XVII) and intra-report agreement (Table XVI) are robust to moderate cutoff shifts within the 0.945--0.95 neighbourhood as long as the same cutoff is applied uniformly across firms.
|
||||
|
||||
## G. Additional Firm A Benchmark Validation
|
||||
|
||||
Before presenting the three threshold-robust analyses, Fig. 4 summarises the per-firm yearly per-signature best-match cosine distribution that motivates them.
|
||||
The left panel reports the mean per-signature best-match cosine within each firm bucket and fiscal year (a threshold-free statistic); the right panel reports the share of each firm-bucket-year with per-signature best-match cosine $\geq 0.95$ (the operational cut of Section III-K).
|
||||
Both panels show Firm A above the other Big-4 firms in every year of the 2013-2023 sample, with non-Big-4 firms below all four Big-4 firms throughout, and the cross-firm ordering is stable across the sample period.
|
||||
The mean-cosine separation between Firm A and the other Big-4 firms is on the order of 0.02-0.04 throughout the sample (e.g., 2013: Firm A $0.9733$ vs Firm B $0.9498$, Firm C $0.9464$, Firm D $0.9395$, Non-Big-4 $0.9227$; 2023: $0.9860$ vs $0.9668$, $0.9662$, $0.9525$, $0.9346$); the share-above-0.95 separation is wider (2013: Firm A $87.2\%$ vs $61.8\%$, $56.2\%$, $38.5\%$, $27.5\%$).
|
||||
This visual is the most direct cross-firm evidence in the paper that Firm A's high-similarity behaviour is firm-specific rather than corpus-wide; the three subsections below decompose this gap along three threshold-free or threshold-robust dimensions.
|
||||
|
||||
<!-- FIGURE 4: Per-firm yearly per-signature best-match cosine
|
||||
File: reports/figures/fig_yearly_big4_comparison.png (and .pdf)
|
||||
Generated by: signature_analysis/30_yearly_big4_comparison.py
|
||||
Caption: Per-firm yearly per-signature best-match cosine, 2013-2023.
|
||||
(a) Mean per-signature best-match cosine by firm bucket and fiscal year
|
||||
(threshold-free). (b) Share of per-signature best-match cosine $\geq 0.95$
|
||||
(operational cut of Section III-K). Five lines: Firm A, B, C, D, Non-Big-4.
|
||||
Firm A is above the other Big-4 firms in every year; Non-Big-4 is below all
|
||||
four Big-4 firms in every year. Per-firm signature counts and exact values
|
||||
are in `reports/firm_yearly_comparison/firm_yearly_comparison.{json,md}`.
|
||||
-->
|
||||
|
||||
The capture rates of Section IV-E are an *internal* consistency check: they ask "how much of Firm A does our threshold capture?", but the threshold was itself derived from Firm A's percentiles, so a high capture rate is not surprising.
|
||||
To go beyond this circular check, we report three further analyses, each chosen so that the *informative quantity* does not depend on the threshold's absolute value:
|
||||
|
||||
@@ -277,33 +354,38 @@ We test this prediction directly.
|
||||
For each auditor-year (CPA $\times$ fiscal year) with at least 5 signatures we compute the mean best-match cosine similarity across the year's signatures, yielding 4,629 auditor-years across 2013-2023.
|
||||
Firm A accounts for 1,287 of these (27.8% baseline share).
|
||||
Table XIV reports per-firm occupancy of the top $K\%$ of the ranked distribution.
|
||||
The per-signature best-match cosine underlying each auditor-year mean is taken over the full same-CPA pool (Section III-G) and may match against signatures from other fiscal years, so the auditor-year mean reflects the year's signatures' position within the CPA's full-sample similarity structure rather than purely within-year similarity; a within-year-restricted sensitivity replication is a natural robustness check and is left to future work.
|
||||
The per-signature best-match cosine underlying each auditor-year mean is taken over the full same-CPA pool (Section III-G), consistent with the unit-of-analysis framing in Section III-G.
|
||||
|
||||
<!-- TABLE XIV: Top-K Similarity Rank Occupancy by Firm (pooled 2013-2023)
|
||||
| Top-K | k in bucket | Firm A | Firm B | Firm C | Firm D | Non-Big-4 | Firm A share |
|
||||
|-------|-------------|--------|--------|--------|--------|-----------|--------------|
|
||||
| 10% | 462 | 443 | 2 | 3 | 0 | 14 | 95.9% |
|
||||
| 20% | 925 | 877 | 9 | 14 | 2 | 23 | 94.8% |
|
||||
| 25% | 1,157 | 1,043 | 32 | 23 | 9 | 50 | 90.1% |
|
||||
| 30% | 1,388 | 1,129 | 105 | 52 | 25 | 77 | 81.3% |
|
||||
| 50% | 2,314 | 1,220 | 473 | 273 | 102 | 246 | 52.7% |
|
||||
-->
|
||||
|
||||
Firm A occupies 95.9% of the top 10% and 90.1% of the top 25% of auditor-years by similarity, against its baseline share of 27.8%---a concentration ratio of 3.5$\times$ at the top decile and 3.2$\times$ at the top quartile.
|
||||
Firm A occupies 95.9% of the top 10%, 94.8% of the top 20%, 90.1% of the top 25%, and 81.3% of the top 30% of auditor-years by similarity, against its baseline share of 27.8%---a concentration ratio of $3.5\times$ at the top decile, $3.4\times$ at the top quintile, and $2.9\times$ at the top tercile.
|
||||
Firm A's share decays monotonically as the bracket widens (95.9% $\to$ 94.8% $\to$ 90.1% $\to$ 81.3% $\to$ 52.7% across top-10/20/25/30/50%), and only at the top 50% does its share approach its baseline; the over-representation is therefore concentrated in the very top of the distribution rather than spread uniformly through the upper half.
|
||||
Year-by-year (Table XV), the top-10% Firm A share ranges from 88.4% (2020) to 100% (2013, 2014, 2017, 2018, 2019), showing that the concentration is stable across the sample period.
|
||||
|
||||
<!-- TABLE XV: Firm A Share of Top-10% Similarity by Year
|
||||
| Year | N auditor-years | Top-10% k | Firm A in top-10% | Firm A share | Firm A baseline |
|
||||
|------|-----------------|-----------|-------------------|--------------|-----------------|
|
||||
| 2013 | 324 | 32 | 32 | 100.0% | 32.4% |
|
||||
| 2014 | 399 | 39 | 39 | 100.0% | 27.8% |
|
||||
| 2015 | 394 | 39 | 38 | 97.4% | 27.7% |
|
||||
| 2016 | 413 | 41 | 39 | 95.1% | 26.2% |
|
||||
| 2017 | 415 | 41 | 41 | 100.0% | 27.2% |
|
||||
| 2018 | 434 | 43 | 43 | 100.0% | 26.5% |
|
||||
| 2019 | 429 | 42 | 42 | 100.0% | 27.0% |
|
||||
| 2020 | 430 | 43 | 38 | 88.4% | 27.7% |
|
||||
| 2021 | 450 | 45 | 44 | 97.8% | 28.7% |
|
||||
| 2022 | 467 | 46 | 43 | 93.5% | 28.3% |
|
||||
| 2023 | 474 | 47 | 46 | 97.9% | 27.4% |
|
||||
<!-- TABLE XV: Firm A Share of Top-K Similarity by Year (K = 10%, 20%, 30%)
|
||||
| Year | N auditor-years | Top-10% share | Top-20% share | Top-30% share | Firm A baseline |
|
||||
|------|-----------------|---------------|---------------|---------------|-----------------|
|
||||
| 2013 | 324 | 100.0% (32/32) | 98.4% (63/64) | 89.7% (87/97) | 32.4% |
|
||||
| 2014 | 399 | 100.0% (39/39) | 98.7% (78/79) | 82.4% (98/119) | 27.8% |
|
||||
| 2015 | 394 | 97.4% (38/39) | 96.2% (75/78) | 84.7% (100/118) | 27.7% |
|
||||
| 2016 | 413 | 95.1% (39/41) | 96.3% (79/82) | 81.3% (100/123) | 26.2% |
|
||||
| 2017 | 415 | 100.0% (41/41) | 97.6% (81/83) | 83.9% (104/124) | 27.2% |
|
||||
| 2018 | 434 | 100.0% (43/43) | 97.7% (84/86) | 80.0% (104/130) | 26.5% |
|
||||
| 2019 | 429 | 100.0% (42/42) | 97.6% (83/85) | 78.9% (101/128) | 27.0% |
|
||||
| 2020 | 430 | 88.4% (38/43) | 91.9% (79/86) | 76.0% (98/129) | 27.7% |
|
||||
| 2021 | 450 | 97.8% (44/45) | 96.7% (87/90) | 81.5% (110/135) | 28.7% |
|
||||
| 2022 | 467 | 93.5% (43/46) | 95.7% (89/93) | 84.3% (118/140) | 28.3% |
|
||||
| 2023 | 474 | 97.9% (46/47) | 94.7% (89/94) | 83.8% (119/142) | 27.4% |
|
||||
|
||||
Per-cell entries are "share (k_FirmA / k_total)". Top-25% and top-50% pooled values are reported in Table XIV; per-year top-25/50 columns are omitted from this table to reduce visual width but are reproducible from the supplementary materials.
|
||||
-->
|
||||
|
||||
This over-representation is consistent with firm-wide non-hand-signing practice at Firm A and is not derived from any threshold we subsequently calibrate.
|
||||
@@ -339,8 +421,7 @@ We note that this test uses the calibrated classifier of Section III-K rather th
|
||||
|
||||
## H. Classification Results
|
||||
|
||||
Table XVII presents the final classification results under the dual-descriptor framework with Firm A-calibrated thresholds for 84,386 documents.
|
||||
The document count (84,386) differs from the 85,042 documents with any YOLO detection (Table III) because 656 documents have no signature whose extracted handwriting could be matched to a registered CPA name (every such signature has `assigned_accountant IS NULL` in the database, typically because the auditor's report page deviates from the standard two-signature layout or the OCRed printed CPA name was not present in the registry); the per-document classifier requires at least one CPA-matched signature so that a same-CPA best-match similarity exists, and these documents are therefore excluded from the classification reported here.
|
||||
Table XVII presents the final classification results under the dual-descriptor framework with Firm A-calibrated thresholds for 84,386 documents (656 documents excluded from the 85,042-document YOLO-detection cohort because no signature on the document could be matched to a registered CPA; see Table XVII note).
|
||||
We emphasize that the document-level proportions below reflect the *worst-case aggregation rule* of Section III-K: a report carrying one stamped signature and one hand-signed signature is labeled with the most-replication-consistent of the two signature-level verdicts.
|
||||
Document-level rates therefore represent the share of reports in which *at least one* signature is non-hand-signed rather than the share in which *both* are; the intra-report agreement analysis of Section IV-G.3 (Table XVI) reports how frequently the two co-signers share the same signature-level label within each firm, so that readers can judge what fraction of the non-hand-signed document-level share corresponds to fully non-hand-signed reports versus mixed reports.
|
||||
|
||||
@@ -354,6 +435,7 @@ Document-level rates therefore represent the share of reports in which *at least
|
||||
| Likely hand-signed | 47 | 0.1% | 4 | 0.0% |
|
||||
|
||||
Per the worst-case aggregation rule of Section III-K, a document with two signatures inherits the most-replication-consistent of the two signature-level labels.
|
||||
The 84,386-document cohort excludes 656 documents (relative to the 85,042 YOLO-detected cohort of Table III) for which no signature could be matched to a registered CPA: the per-document classifier requires at least one CPA-matched signature so that a same-CPA best-match similarity is defined. The exclusion is definitional rather than discretionary; typical causes are auditor's-report-page formats deviating from the standard two-signature layout, or OCR returning a printed CPA name not present in the registry.
|
||||
-->
|
||||
|
||||
Within the 71,656 documents exceeding cosine $0.95$, the dHash dimension stratifies them into three distinct populations:
|
||||
@@ -366,7 +448,7 @@ A cosine-only classifier would treat all 71,656 identically; the dual-descriptor
|
||||
|
||||
96.9% of Firm A's documents fall into the high- or moderate-confidence non-hand-signed categories, 0.6% into high-style-consistency, and 2.5% into uncertain.
|
||||
This pattern is consistent with the replication-dominated framing: the large majority is captured by non-hand-signed rules, while the small residual is consistent with the within-firm heterogeneity implied by the dip-test-confirmed unimodal-long-tail shape of Firm A's per-signature cosine distribution (Section IV-D.1) and the 7.5% signature-level left tail (Section III-H).
|
||||
The near-zero "likely hand-signed" rate (4 of 30,226 Firm A documents, 0.013%; the 30,226 count here is documents with at least one Firm A signer under the 84,386-document classification cohort, which differs from the 30,222 single-firm two-signer subset in Table XVI by 4 reports) indicates that the within-firm heterogeneity implied by the 7.5% signature-level left tail (Section IV-D) does not project into the lowest-cosine document-level category under the dual-descriptor rules; it is absorbed instead into the uncertain or high-style-consistency categories at this threshold set.
|
||||
The near-zero "likely hand-signed" rate (4 of 30,226 Firm A documents, 0.013%; the 30,226 denominator is documents with at least one Firm A signer under the 84,386-document classification cohort, which differs from the 30,222 single-firm two-signer subset of Table XVI by 4 mixed-firm reports excluded from the firm-level intra-report comparison) indicates that the within-firm heterogeneity implied by the 7.5% signature-level left tail (Section IV-D) does not project into the lowest-cosine document-level category under the dual-descriptor rules; it is absorbed instead into the uncertain or high-style-consistency categories at this threshold set.
|
||||
We note that because the non-hand-signed thresholds are themselves calibrated to Firm A's empirical percentiles (Section III-H), these rates are an internal consistency check rather than an external validation; the held-out Firm A validation of Section IV-F.2 is the corresponding external check.
|
||||
|
||||
### 2) Cross-Firm Comparison of Dual-Descriptor Convergence
|
||||
@@ -374,7 +456,7 @@ We note that because the non-hand-signed thresholds are themselves calibrated to
|
||||
Among the 65,514 non-Firm-A signatures with per-signature best-match cosine $> 0.95$, 42.12% have $\text{dHash}_\text{indep} \leq 5$, compared to 88.32% of the 55,922 Firm A signatures meeting the same cosine condition---a $\sim 2.1\times$ difference that the structural-verification layer makes visible.
|
||||
The Firm A denominator (55,922) matches Table IX exactly: both Table IX and the cross-firm decomposition define Firm A membership via the CPA registry (`accountants.firm`), and the cross-firm analysis additionally requires a non-null independent-min dHash record, which all 55,922 Firm A cosine-eligible signatures have in the current database.
|
||||
This cross-firm gap is consistent with firm-wide non-hand-signing practice at Firm A versus partner-specific or per-engagement replication at other firms; it complements the partner-level ranking (Section IV-G.2) and intra-report consistency (Section IV-G.3) findings.
|
||||
Counts and percentages are reproduced by `signature_analysis/28_byte_identity_decomposition.py` and reported in `reports/byte_identity_decomp/byte_identity_decomposition.json` (see Appendix B for the table-to-script provenance map).
|
||||
Reproduction artifact for these counts is listed in Appendix B.
|
||||
|
||||
## I. Ablation Study: Feature Backbone Comparison
|
||||
|
||||
|
||||
Reference in New Issue
Block a user