Phase 6 round-4 codex-review fixes: blocker + 2 majors + minors

Codex round-3 verification (commit 9e68f2e) flagged remaining items:
BLOCKER:
- Appendix A Table A.I was inside an HTML comment but visible prose at L1268
and L1275 referenced it as if it rendered. Un-commented Table A.I (same fix
pattern as Tables I-IV in round-3).
MAJOR:
- Table XXVII (diagnostic summary) appeared in §III-M at L593 before Tables
III-XXVI in §IV, breaking sequential IEEE-style numbering. Moved the table
to Appendix A as Table A.II (new §A.2 "Diagnostic Summary"). §III-M now
references "Appendix A Table A.II" instead of hosting the table inline;
§VI Conclusion contribution (8) updated similarly. Appendix A heading
generalised to "Supplementary Diagnostic Detail" with §A.1 (BD/McCrary)
and §A.2 (Diagnostic Summary).
- §III-M L614 rate-definition conflation: the sentence "per-signature and
per-document rates ($0.11$ and $0.34$ respectively under the deployed
any-pair HC + MC alarm)" mixed unit labels. Rewrote to label each rate by
its actual unit and source table: per-signature any-pair HC ICCR ($0.11$;
Table XXII) and per-document HC+MC alarm-rate ICCR ($0.34$; Table XXIII).
MINORS:
- L400 §III-L stale ref retargeted: "operational signature-level classifier
of §III-L" -> "of §III-H.1 calibrated in §III-L".
- L663 §III-L stale ref retargeted: "KDE crossover used in per-document
classification (Section III-L)" -> "... per-signature classification
(Section III-H.1)" (unit label corrected together with the section ref).
- L1146 Conclusion "corpus-universal" overstated generality -> "across the
tested eligible scopes" (non-Big-4 evidence is cosine-axis only, not full
calibration scope).
- L1024 Results §IV-M.4 footnote: np.argmax / np.unique implementation
detail replaced with "alphabetically-ordered tie-break ... full
implementation detail in the supplementary materials" (less
script-internal prose).
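For reference, the tie-break that the shortened footnote summarises can be sketched as follows. This is a stdlib-only illustration, not Script 45 itself; `mode_of_firms` is a hypothetical name, and the logic mirrors the behaviour the original footnote describes (`np.unique` returns sorted unique values, and `np.argmax` returns the first maximum).

```python
from collections import Counter

def mode_of_firms(firms):
    """Most frequent firm; ties resolve to the alphabetically first firm."""
    counts = Counter(firms)
    ordered = sorted(counts)       # alphabetical order, like np.unique's output
    best = max(counts.values())
    # First firm in sorted order that reaches the maximum count,
    # matching np.argmax's first-maximum rule.
    return next(firm for firm in ordered if counts[firm] == best)

tied = mode_of_firms(["Firm D", "Firm C"])             # 1:1 mixed-firm document
clear = mode_of_firms(["Firm D", "Firm D", "Firm C"])  # no tie
```

Under this rule a 1:1 Firm C / Firm D document is assigned to Firm C regardless of the order in which the signatures appear.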
Artefacts:
- Combined manuscript regenerated: paper_a_v4_combined.md, 1316 lines.
- Final main-text table sequence: I, II, III, ..., XXVI (26 tables, all
sequential in rendered order). Appendix tables: A.I, A.II.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -32,7 +32,7 @@ The Big-4 subset of the detection output yields 150,442 signatures with both des
## C. All-Pairs Intra-vs-Inter Class Distribution Analysis
Fig. 2 presents the cosine similarity distributions computed over the full set of *pairwise comparisons* under two groupings: intra-class (all signature pairs belonging to the same CPA) and inter-class (signature pairs from different CPAs).
-This all-pairs analysis is a different unit from the per-signature best-match statistics used in Sections IV-D onward; we report it first because it supplies the reference point for the KDE crossover used in per-document classification (Section III-L).
+This all-pairs analysis is a different unit from the per-signature best-match statistics used in Sections IV-D onward; we report it first because it supplies the reference point for the KDE crossover used in per-signature classification (Section III-H.1).
Table IV summarizes the distributional statistics.
**Table IV.** Cosine Similarity Distribution Statistics.
@@ -393,7 +393,7 @@ Decile trend is broadly monotone in pool size with two minor reversals (decile 5
| D2 (operational) | HC + MC | $0.3375$ | $[0.3342, 0.3409]$ |
| D3 | HC + MC + HSC | $0.3384$ | $[0.3351, 0.3418]$ |
-Per-firm D2 document-level ICCR: Firm A $0.6201$ ($n = 30{,}226$); Firm B $0.1600$ ($n = 17{,}127$); Firm C $0.1635$ ($n = 19{,}501$); Firm D $0.0863$ ($n = 8{,}379$). The Firm C denominator $n = 19{,}501$ exceeds Table XVI's single-firm Firm C count of $19{,}122$ by exactly the $379$ mixed-firm PDFs: all $379$ are $1{:}1$ Firm C / Firm D mixed-firm documents, and Script 45's mode-of-firms implementation (`np.argmax` over `np.unique`'s alphabetically-sorted firm counts) returns the first-sorted firm on ties, which assigns these tied documents to Firm C rather than to Firm D. The four per-firm denominators here therefore sum to the full $75{,}233$, whereas Table XVI's per-firm rows sum to $74{,}854 = 75{,}233 - 379$.
+Per-firm D2 document-level ICCR: Firm A $0.6201$ ($n = 30{,}226$); Firm B $0.1600$ ($n = 17{,}127$); Firm C $0.1635$ ($n = 19{,}501$); Firm D $0.0863$ ($n = 8{,}379$). The Firm C denominator $n = 19{,}501$ exceeds Table XVI's single-firm Firm C count of $19{,}122$ by exactly the $379$ mixed-firm PDFs (all $379$ are $1{:}1$ Firm C / Firm D documents that an alphabetically-ordered tie-break assigns to Firm C; full implementation detail in the supplementary materials). The four per-firm denominators here therefore sum to the full $75{,}233$, whereas Table XVI's per-firm rows sum to $74{,}854 = 75{,}233 - 379$.
### M.5 Firm heterogeneity logistic regression and cross-firm hit matrix (Script 44)
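The denominator bookkeeping quoted in the §IV-M.4 hunk above can be sanity-checked with a few lines of arithmetic. The figures are copied from the diff; the variable names are hypothetical.

```python
# Per-firm D2 denominators as quoted in the §IV-M.4 paragraph.
per_firm_n = {
    "Firm A": 30_226,
    "Firm B": 17_127,
    "Firm C": 19_501,
    "Firm D": 8_379,
}

firm_c_single_firm = 19_122  # Table XVI single-firm Firm C count
mixed_cd_documents = 379     # 1:1 Firm C / Firm D PDFs assigned to Firm C

total_documents = sum(per_firm_n.values())
firm_c_excess = per_firm_n["Firm C"] - firm_c_single_firm
table_xvi_total = total_documents - mixed_cd_documents
```

The check confirms that the four denominators sum to 75,233, that the Firm C excess equals the 379 mixed-firm PDFs, and that removing them leaves the 74,854 reported for Table XVI's per-firm rows.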