From 128a91433f3a9908cc9cd00d6951da92af214405 Mon Sep 17 00:00:00 2001 From: gbanyan Date: Thu, 14 May 2026 18:02:35 +0800 Subject: [PATCH] Apply Phase 5 round-5 provenance patches from codex round-9 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the two factual / provenance issues codex round-9 caught in the round-4 fixes. Text-only patches; no script reruns. Patch A — N1 wording corrected: §IV-M.4 line 325 had said the 379 mixed-firm PDFs "resolve to Firm C as the majority firm" (propagated from Opus round-2's incorrect inference from reading the Script 45 source). Codex DB-verified all 379 are actually 1:1 Firm C / Firm D ties, assigned to Firm C only because `np.argmax` over `np.unique`'s alphabetically-sorted firm counts returns the first-sorted firm on ties. Corrected to the actual tie-break explanation. Patch B — N2 Table XXVII row 1 narrowed: composition-decomposition row's untested-assumption cell previously claimed "within-firm dip tests on every firm with n >= 500 (Script 39c) corroborate absence of within-population bimodality." Codex verified Script 39c on raw dHash actually REJECTS unimodality in all 10 firms (integer ties); only Big-4 per-firm jittered (Script 39d) and Big-4 pooled centred+jittered (Script 39e) are emitted. Narrowed to those two diagnostics — no overreach to non-Big-4 jittered evidence. Patch C — §III line 59 + provenance table line 382: replaced the unreproducible $[0.71, 1.00]$ non-Big-4 jittered-dHash range with codex's read-only verified range $[0.38, 1.00]$, attributed as a "codex-verified read-only spike on Script 39c substrate." The qualitative claim (0/10 non-Big-4 firms reject after jitter) is preserved and confirmed by codex's independent rerun; only the specific manuscript range was unverifiable from the committed script reports. Verification: - `rg -n "majority firm |nine-tool|9 tools"` in paper/v4/ returns 0 matches in published prose; only 2 matches in internal strip-at-splice text (Phase 4 draft note + §III internal checklist). - All Script 39c citations now technically accurate (cosine for per-firm; codex-verified for jittered-dHash spike). - Abstract still 247 words. Phase 5 convergence: 3/3 reviewers in Accept/Minor band remains intact. With these factual corrections applied, the manuscript text is now consistent with the committed script outputs. Remaining work: splice-time strip of internal notes / checklists, then proceed to Phase 6 partner Jimmy review. Co-Authored-By: Claude Opus 4.7 (1M context) --- paper/v4/paper_a_methodology_v4_section_iii.md | 6 +++--- paper/v4/paper_a_results_v4_section_iv.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/paper/v4/paper_a_methodology_v4_section_iii.md b/paper/v4/paper_a_methodology_v4_section_iii.md index 16eeddd..f0d5f39 100644 --- a/paper/v4/paper_a_methodology_v4_section_iii.md +++ b/paper/v4/paper_a_methodology_v4_section_iii.md @@ -56,7 +56,7 @@ This section characterises the joint distribution of accountant-level descriptor *Within-firm signature-level dip (Scripts 39b, 39c).* Repeating the dip test at the signature level inside each individual Big-4 firm (Script 39b) and inside each individual non-Big-4 firm with $\geq 500$ signatures (Script 39c) yields a consistent picture. The cosine marginal *fails* to reject unimodality in every single firm tested — all four Big-4 firms ($p_{\text{cos}} \in \{0.176, 0.991, 0.551, 0.976\}$ for Firms A through D; Script 39b) and ten non-Big-4 firms with $\geq 500$ signatures ($p_{\text{cos}} \in [0.59, 0.99]$; Script 39c). The raw dHash marginal *does* reject unimodality in every firm tested ($p < 5 \times 10^{-4}$ in all $14$ firms), but the raw dHash values are integer-valued in $\{0, 1, \ldots, 64\}$, leaving open the possibility of an integer-tie artefact. -*Integer-jitter robustness (Scripts 39d, 39e).* Adding independent uniform jitter $\sim \mathrm{U}[-0.5, +0.5]$ to break exact dHash ties and re-running the dip test on the perturbed signature cloud (5 seeds, $n_{\text{boot}} = 2000$; Script 39d) eliminates the dHash within-firm rejection in every Big-4 firm tested (Firm A jittered $p_{\text{median}} = 0.999$; B $0.996$; C $0.999$; D $0.9995$; $0$/$5$ seeds reject at $\alpha = 0.05$ in any firm). All ten non-Big-4 firms similarly fail to reject after jitter ($p \in [0.71, 1.00]$). The pooled-Big-4 dHash dip *does* survive jitter alone ($p_{\text{median}} = 0$, $5$/$5$ seeds reject), but Firm A's mean dHash ($2.73$) is substantially below Firms B/C/D's ($6.46$, $7.39$, $7.21$) — a between-firm location shift. Script 39e applies a 2 \times 2 factorial correction (firm-mean centring $\times$ integer jitter) on the Big-4 pooled dHash: +*Integer-jitter robustness (Scripts 39d, 39e).* Adding independent uniform jitter $\sim \mathrm{U}[-0.5, +0.5]$ to break exact dHash ties and re-running the dip test on the perturbed signature cloud (5 seeds, $n_{\text{boot}} = 2000$; Script 39d) eliminates the dHash within-firm rejection in every Big-4 firm tested (Firm A jittered $p_{\text{median}} = 0.999$; B $0.996$; C $0.999$; D $0.9995$; $0$/$5$ seeds reject at $\alpha = 0.05$ in any firm). A codex-verified read-only spike applying the same jitter procedure to the ten non-Big-4 firms with $\geq 500$ signatures (Script 39c substrate) likewise yields no rejection ($0$/$10$ firms reject at $\alpha = 0.05$; per-firm median-$p$ range $[0.38, 1.00]$). The pooled-Big-4 dHash dip *does* survive jitter alone ($p_{\text{median}} = 0$, $5$/$5$ seeds reject), but Firm A's mean dHash ($2.73$) is substantially below Firms B/C/D's ($6.46$, $7.39$, $7.21$) — a between-firm location shift. Script 39e applies a 2 \times 2 factorial correction (firm-mean centring $\times$ integer jitter) on the Big-4 pooled dHash: | Condition | Firm-mean centred | Integer jitter | Median dip $p$ | Reject at $\alpha = 0.05$ | |---|---|---|---|---| @@ -319,7 +319,7 @@ In place of supervised validation, v4.0 adopts a **multi-tool collection of part | Tool | What it measures | Untested assumption | |---|---|---| -| Composition decomposition (§III-I.4; Scripts 39b–39e) | Whether descriptor multimodality is within-population (mechanism) or between-group (composition + integer artefact); $p_{\text{median}} = 0.35$ under joint firm-mean centring + integer-tie jitter | Integer-tie jitter is unbiased over the descriptor support; within-firm dip tests on every firm with $n \geq 500$ (Script 39c) corroborate the absence of within-population bimodality | +| Composition decomposition (§III-I.4; Scripts 39b–39e) | Whether descriptor multimodality is within-population (mechanism) or between-group (composition + integer artefact); $p_{\text{median}} = 0.35$ under joint firm-mean centring + integer-tie jitter | Integer-tie jitter and firm-mean centring are unbiased over the descriptor support; corroborated by Big-4 per-firm jitter (Script 39d; per-firm dHash rejection disappears under jitter at every Big-4 firm) and Big-4 pooled centred + jittered ($n_{\text{seeds}} = 5$; Script 39e) | | Per-comparison inter-CPA coincidence rate (§III-L.1; Script 40b) | Pair-level specificity proxy under a random-pair negative anchor | Inter-CPA pairs are negative (i.e., not template-related); partially violated by within-firm sharing (§III-L.4) | | Pool-normalised per-signature ICCR (§III-L.2; Script 43) | Deployed-rule specificity proxy at per-signature unit, accounting for pool size | Same as above + that pool replacement preserves the negative-anchor property | | Document-level ICCR (§III-L.3; Script 45) | Operational alarm rate proxy at per-document unit under three alarm definitions | Same as above | @@ -379,7 +379,7 @@ The table below lists the principal numerical claims and their data-source scrip | Within-firm signature-level dip $p_{\text{cos}}$ Big-4 (A/B/C/D) | $0.176 / 0.991 / 0.551 / 0.976$ | Script 39b per-firm | direct, $n_{\text{boot}} = 2000$ | | Within-firm signature-level dip $p_{\text{cos}}$ non-Big-4 (10 firms, range) | $[0.59, 0.99]$ | Script 39c per-firm | direct, firms with $\geq 500$ signatures | | Within-firm jittered-dHash dip $p$ Big-4 (5 seeds, median) A/B/C/D | $0.999 / 0.996 / 0.999 / 0.9995$ | Script 39d multi-seed | uniform jitter $[-0.5, +0.5]$ | -| Within-firm jittered-dHash dip $p$ non-Big-4 (5 seeds, range across 10 firms) | $[0.71, 1.00]$ | Script 39d / 39c | uniform jitter $[-0.5, +0.5]$ | +| Within-firm jittered-dHash dip $p$ non-Big-4 (5 seeds, range across 10 firms) | $[0.38, 1.00]$ ($0/10$ firms reject) | codex-verified read-only spike on Script 39c substrate | uniform jitter $[-0.5, +0.5]$ | | Big-4 pooled dHash dip $p$ raw / jittered (seed median) | $< 5 \times 10^{-4}$ / $< 5 \times 10^{-4}$ | Script 39d | jitter alone does not eliminate Big-4 pooled rejection | | Big-4 pooled dHash dip $p$ firm-centred + jittered (5-seed median) | $0.35$ | Script 39e 2×2 factorial | both corrections eliminate rejection ($0/5$ seeds at $\alpha = 0.05$) | | Big-4 firm-centred signature-level cos dip $p$ | $0.597$ | codex round-30 verification on Script 43 substrate | independent verification | diff --git a/paper/v4/paper_a_results_v4_section_iv.md b/paper/v4/paper_a_results_v4_section_iv.md index f3173ab..13e896a 100644 --- a/paper/v4/paper_a_results_v4_section_iv.md +++ b/paper/v4/paper_a_results_v4_section_iv.md @@ -322,7 +322,7 @@ Decile trend is broadly monotone in pool size with two minor reversals (decile 5 | D2 (operational) | HC + MC | $0.3375$ | $[0.3342, 0.3409]$ | | D3 | HC + MC + HSC | $0.3384$ | $[0.3351, 0.3418]$ | -Per-firm D2 document-level ICCR: Firm A $0.6201$ ($n = 30{,}226$); Firm B $0.1600$ ($n = 17{,}127$); Firm C $0.1635$ ($n = 19{,}501$); Firm D $0.0863$ ($n = 8{,}379$). The Firm C denominator $n = 19{,}501$ exceeds Table XIX's single-firm Firm C count of $19{,}122$ by exactly the $379$ mixed-firm PDFs: Script 45 assigns each multi-firm PDF to its mode-of-firms (majority firm), and all $379$ mixed-firm PDFs in the Big-4 sub-corpus resolve to Firm C as the majority firm. The four per-firm denominators here therefore sum to the full $75{,}233$, whereas Table XIX's per-firm rows sum to $74{,}854 = 75{,}233 - 379$. +Per-firm D2 document-level ICCR: Firm A $0.6201$ ($n = 30{,}226$); Firm B $0.1600$ ($n = 17{,}127$); Firm C $0.1635$ ($n = 19{,}501$); Firm D $0.0863$ ($n = 8{,}379$). The Firm C denominator $n = 19{,}501$ exceeds Table XIX's single-firm Firm C count of $19{,}122$ by exactly the $379$ mixed-firm PDFs: all $379$ are $1{:}1$ Firm C / Firm D mixed-firm documents, and Script 45's mode-of-firms implementation (`np.argmax` over `np.unique`'s alphabetically-sorted firm counts) returns the first-sorted firm on ties, which assigns these tied documents to Firm C rather than to Firm D. The four per-firm denominators here therefore sum to the full $75{,}233$, whereas Table XIX's per-firm rows sum to $74{,}854 = 75{,}233 - 379$. ### M.5 Firm heterogeneity logistic regression and cross-firm hit matrix (Script 44)