From 128a91433f3a9908cc9cd00d6951da92af214405 Mon Sep 17 00:00:00 2001
From: gbanyan <gbanyan.huang@gmail.com>
Date: Thu, 14 May 2026 18:02:35 +0800
Subject: [PATCH] Apply Phase 5 round-5 provenance patches from codex round-9
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes the two factual / provenance issues codex round-9 caught in
the round-4 fixes. Text-only patches; no script reruns.

Patch A — N1 wording corrected: §IV-M.4 line 325 had said the 379
mixed-firm PDFs "resolve to Firm C as the majority firm" (propagated
from Opus round-2's incorrect inference from reading the Script 45
source). Codex DB-verified all 379 are actually 1:1 Firm C / Firm D
ties, assigned to Firm C only because `np.argmax` over `np.unique`'s
alphabetically-sorted firm counts returns the first-sorted firm on
ties. Corrected to the actual tie-break explanation.

Patch B — N2 Table XXVII row 1 narrowed: composition-decomposition
row's untested-assumption cell previously claimed "within-firm dip
tests on every firm with n >= 500 (Script 39c) corroborate absence
of within-population bimodality." Codex verified Script 39c on raw
dHash actually REJECTS unimodality in all 10 firms (integer ties);
only Big-4 per-firm jittered (Script 39d) and Big-4 pooled
centred+jittered (Script 39e) are emitted. Narrowed to those two
diagnostics — no overreach to non-Big-4 jittered evidence.

Patch C — §III line 59 + provenance table line 382: replaced the
unreproducible $[0.71, 1.00]$ non-Big-4 jittered-dHash range with
codex's read-only verified range $[0.38, 1.00]$, attributed as a
"codex-verified read-only spike on Script 39c substrate." The
qualitative claim (0/10 non-Big-4 firms reject after jitter) is
preserved and confirmed by codex's independent rerun; only the
specific manuscript range was unverifiable from the committed
script reports.

Verification:
- `rg -n "majority firm |nine-tool|9 tools"` in paper/v4/ returns
  0 matches in published prose; only 2 matches in internal
  strip-at-splice text (Phase 4 draft note + §III internal
  checklist).
- All Script 39c citations now technically accurate (cosine for
  per-firm; codex-verified for jittered-dHash spike).
- Abstract still 247 words.

Phase 5 convergence: 3/3 reviewers in Accept/Minor band remains
intact. With these factual corrections applied, the manuscript text
is now consistent with the committed script outputs. Remaining
work: splice-time strip of internal notes / checklists, then
proceed to Phase 6 partner Jimmy review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 paper/v4/paper_a_methodology_v4_section_iii.md | 6 +++---
 paper/v4/paper_a_results_v4_section_iv.md      | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/paper/v4/paper_a_methodology_v4_section_iii.md b/paper/v4/paper_a_methodology_v4_section_iii.md
index 16eeddd..f0d5f39 100644
--- a/paper/v4/paper_a_methodology_v4_section_iii.md
+++ b/paper/v4/paper_a_methodology_v4_section_iii.md
@@ -56,7 +56,7 @@ This section characterises the joint distribution of accountant-level descriptor
 
 *Within-firm signature-level dip (Scripts 39b, 39c).* Repeating the dip test at the signature level inside each individual Big-4 firm (Script 39b) and inside each individual non-Big-4 firm with $\geq 500$ signatures (Script 39c) yields a consistent picture. The cosine marginal *fails* to reject unimodality in every single firm tested — all four Big-4 firms ($p_{\text{cos}} \in \{0.176, 0.991, 0.551, 0.976\}$ for Firms A through D; Script 39b) and ten non-Big-4 firms with $\geq 500$ signatures ($p_{\text{cos}} \in [0.59, 0.99]$; Script 39c). The raw dHash marginal *does* reject unimodality in every firm tested ($p < 5 \times 10^{-4}$ in all $14$ firms), but the raw dHash values are integer-valued in $\{0, 1, \ldots, 64\}$, leaving open the possibility of an integer-tie artefact.
 
-*Integer-jitter robustness (Scripts 39d, 39e).* Adding independent uniform jitter $\sim \mathrm{U}[-0.5, +0.5]$ to break exact dHash ties and re-running the dip test on the perturbed signature cloud (5 seeds, $n_{\text{boot}} = 2000$; Script 39d) eliminates the dHash within-firm rejection in every Big-4 firm tested (Firm A jittered $p_{\text{median}} = 0.999$; B $0.996$; C $0.999$; D $0.9995$; $0$/$5$ seeds reject at $\alpha = 0.05$ in any firm). All ten non-Big-4 firms similarly fail to reject after jitter ($p \in [0.71, 1.00]$). The pooled-Big-4 dHash dip *does* survive jitter alone ($p_{\text{median}} = 0$, $5$/$5$ seeds reject), but Firm A's mean dHash ($2.73$) is substantially below Firms B/C/D's ($6.46$, $7.39$, $7.21$) — a between-firm location shift. Script 39e applies a 2 \times 2 factorial correction (firm-mean centring $\times$ integer jitter) on the Big-4 pooled dHash:
+*Integer-jitter robustness (Scripts 39d, 39e).* Adding independent uniform jitter $\sim \mathrm{U}[-0.5, +0.5]$ to break exact dHash ties and re-running the dip test on the perturbed signature cloud (5 seeds, $n_{\text{boot}} = 2000$; Script 39d) eliminates the dHash within-firm rejection in every Big-4 firm tested (Firm A jittered $p_{\text{median}} = 0.999$; B $0.996$; C $0.999$; D $0.9995$; $0$/$5$ seeds reject at $\alpha = 0.05$ in any firm). A codex-verified read-only spike applying the same jitter procedure to the ten non-Big-4 firms with $\geq 500$ signatures (Script 39c substrate) likewise yields no rejection ($0$/$10$ firms reject at $\alpha = 0.05$; per-firm median-$p$ range $[0.38, 1.00]$). The pooled-Big-4 dHash dip *does* survive jitter alone ($p_{\text{median}} = 0$, $5$/$5$ seeds reject), but Firm A's mean dHash ($2.73$) is substantially below Firms B/C/D's ($6.46$, $7.39$, $7.21$) — a between-firm location shift. Script 39e applies a 2 \times 2 factorial correction (firm-mean centring $\times$ integer jitter) on the Big-4 pooled dHash:
 
 | Condition | Firm-mean centred | Integer jitter | Median dip $p$ |  Reject at $\alpha = 0.05$ |
 |---|---|---|---|---|
@@ -319,7 +319,7 @@ In place of supervised validation, v4.0 adopts a **multi-tool collection of part
 
 | Tool | What it measures | Untested assumption |
 |---|---|---|
-| Composition decomposition (§III-I.4; Scripts 39b–39e) | Whether descriptor multimodality is within-population (mechanism) or between-group (composition + integer artefact); $p_{\text{median}} = 0.35$ under joint firm-mean centring + integer-tie jitter | Integer-tie jitter is unbiased over the descriptor support; within-firm dip tests on every firm with $n \geq 500$ (Script 39c) corroborate the absence of within-population bimodality |
+| Composition decomposition (§III-I.4; Scripts 39b–39e) | Whether descriptor multimodality is within-population (mechanism) or between-group (composition + integer artefact); $p_{\text{median}} = 0.35$ under joint firm-mean centring + integer-tie jitter | Integer-tie jitter and firm-mean centring are unbiased over the descriptor support; corroborated by Big-4 per-firm jitter (Script 39d; per-firm dHash rejection disappears under jitter at every Big-4 firm) and Big-4 pooled centred + jittered ($n_{\text{seeds}} = 5$; Script 39e) |
 | Per-comparison inter-CPA coincidence rate (§III-L.1; Script 40b) | Pair-level specificity proxy under a random-pair negative anchor | Inter-CPA pairs are negative (i.e., not template-related); partially violated by within-firm sharing (§III-L.4) |
 | Pool-normalised per-signature ICCR (§III-L.2; Script 43) | Deployed-rule specificity proxy at per-signature unit, accounting for pool size | Same as above + that pool replacement preserves the negative-anchor property |
 | Document-level ICCR (§III-L.3; Script 45) | Operational alarm rate proxy at per-document unit under three alarm definitions | Same as above |
@@ -379,7 +379,7 @@ The table below lists the principal numerical claims and their data-source scrip
 | Within-firm signature-level dip $p_{\text{cos}}$ Big-4 (A/B/C/D) | $0.176 / 0.991 / 0.551 / 0.976$ | Script 39b per-firm | direct, $n_{\text{boot}} = 2000$ |
 | Within-firm signature-level dip $p_{\text{cos}}$ non-Big-4 (10 firms, range) | $[0.59, 0.99]$ | Script 39c per-firm | direct, firms with $\geq 500$ signatures |
 | Within-firm jittered-dHash dip $p$ Big-4 (5 seeds, median) A/B/C/D | $0.999 / 0.996 / 0.999 / 0.9995$ | Script 39d multi-seed | uniform jitter $[-0.5, +0.5]$ |
-| Within-firm jittered-dHash dip $p$ non-Big-4 (5 seeds, range across 10 firms) | $[0.71, 1.00]$ | Script 39d / 39c | uniform jitter $[-0.5, +0.5]$ |
+| Within-firm jittered-dHash dip $p$ non-Big-4 (5 seeds, range across 10 firms) | $[0.38, 1.00]$ ($0/10$ firms reject) | codex-verified read-only spike on Script 39c substrate | uniform jitter $[-0.5, +0.5]$ |
 | Big-4 pooled dHash dip $p$ raw / jittered (seed median) | $< 5 \times 10^{-4}$ / $< 5 \times 10^{-4}$ | Script 39d | jitter alone does not eliminate Big-4 pooled rejection |
 | Big-4 pooled dHash dip $p$ firm-centred + jittered (5-seed median) | $0.35$ | Script 39e 2×2 factorial | both corrections eliminate rejection ($0/5$ seeds at $\alpha = 0.05$) |
 | Big-4 firm-centred signature-level cos dip $p$ | $0.597$ | codex round-30 verification on Script 43 substrate | independent verification |
diff --git a/paper/v4/paper_a_results_v4_section_iv.md b/paper/v4/paper_a_results_v4_section_iv.md
index f3173ab..13e896a 100644
--- a/paper/v4/paper_a_results_v4_section_iv.md
+++ b/paper/v4/paper_a_results_v4_section_iv.md
@@ -322,7 +322,7 @@ Decile trend is broadly monotone in pool size with two minor reversals (decile 5
 | D2 (operational) | HC + MC | $0.3375$ | $[0.3342, 0.3409]$ |
 | D3 | HC + MC + HSC | $0.3384$ | $[0.3351, 0.3418]$ |
 
-Per-firm D2 document-level ICCR: Firm A $0.6201$ ($n = 30{,}226$); Firm B $0.1600$ ($n = 17{,}127$); Firm C $0.1635$ ($n = 19{,}501$); Firm D $0.0863$ ($n = 8{,}379$). The Firm C denominator $n = 19{,}501$ exceeds Table XIX's single-firm Firm C count of $19{,}122$ by exactly the $379$ mixed-firm PDFs: Script 45 assigns each multi-firm PDF to its mode-of-firms (majority firm), and all $379$ mixed-firm PDFs in the Big-4 sub-corpus resolve to Firm C as the majority firm. The four per-firm denominators here therefore sum to the full $75{,}233$, whereas Table XIX's per-firm rows sum to $74{,}854 = 75{,}233 - 379$.
+Per-firm D2 document-level ICCR: Firm A $0.6201$ ($n = 30{,}226$); Firm B $0.1600$ ($n = 17{,}127$); Firm C $0.1635$ ($n = 19{,}501$); Firm D $0.0863$ ($n = 8{,}379$). The Firm C denominator $n = 19{,}501$ exceeds Table XIX's single-firm Firm C count of $19{,}122$ by exactly the $379$ mixed-firm PDFs: all $379$ are $1{:}1$ Firm C / Firm D mixed-firm documents, and Script 45's mode-of-firms implementation (`np.argmax` over `np.unique`'s alphabetically-sorted firm counts) returns the first-sorted firm on ties, which assigns these tied documents to Firm C rather than to Firm D. The four per-firm denominators here therefore sum to the full $75{,}233$, whereas Table XIX's per-firm rows sum to $74{,}854 = 75{,}233 - 379$.
 
 ### M.5 Firm heterogeneity logistic regression and cross-firm hit matrix (Script 44)