gbanyan ce33156238 Apply Codex round-23 corrections: §IV v3 + §III v4
Codex round 23 returned Major Revision on §IV v2: 6 Major + 6
Minor + 5 Editorial findings. Codex confirmed the spike-script
provenance is mostly sound -- no scripts needed rerunning -- so
v3 applies presentation-level fixes only.

Decisions baked in:
  - Anonymisation: maintain Firm A-D pseudonyms throughout the
    manuscript body; remove (Deloitte) / (KPMG) / (PwC) / (EY)
    parentheticals from all v4 §IV tables.
  - Table numbering: v4 tables use fresh numbers V-XVIII (plus
    Table XV-B); inherited v3.x tables are cited only as
    "v3.20.0 Table N" with the original v3 number, NOT
    renumbered into the v4 sequence.

§IV v3 changes:
  1. Detection denominator rewritten: 86,072 VLM-positive / 12
     corrupted / 86,071 YOLO-processed / 85,042 with-detections /
     182,328 signatures (matches v3.x §IV-B exact wording).
  2. All v4 table labels stripped of "(revised:" / "(NEW:"
     prefixes; replaced with clean "Table N. <descriptor>." form.
  3. Real firm names removed from all tables: 4 replace_all edits.
  4. Line 211 MC-ordering claim removed: MC occupancy is no longer
     described as "consistent with the §III-K Spearman convergence"
     because MC fraction is not monotone in per-CPA hand-leaning
     ranking. New language: descriptive only, with Firm D / Firm B
     ordering counterexample stated.
  5. Line 184: the 81.70% vs 82.46% comparison is now qualified
     as "qualitative alignment, not like-for-like consistency
     check" (different units: per-signature class vs per-CPA
     hard cluster).
  6. Line 43 BD-transition "histogram-resolution artefacts"
     softened to "scope-dependent and not used operationally";
     no specific bin-width artefact is claimed without
     sensitivity-sweep evidence.
  7. K=3 LOOO C1 weight drift corrected: 0.025 -> 0.023 (matches
     Script 37 max deviation 0.0235 / rounded 0.023).
  8. Seed coverage in §IV-A updated: "Scripts 32-42" (was
     "Scripts 32-41", which missed Script 42).
  9. Low-cosine cutoff inclusivity: cos < 0.837 -> cos <= 0.837
     (matches Script 42 rule definition).
  10. "round-22 Light scope" process note removed from
      manuscript prose in §IV-K.
  11. §IV-L ablation pointer corrected: v3.20.0 §IV-I (was
      §IV-H.3); v3.20.0 Table XVIII clarified as different from
      v4 Table XVIII.
  12. Line 75 "Component recovery verified across Scripts 35,
      37, 38" rewritten: "the full-fit baseline is reproduced
      in Scripts 35, 37, 38" with explicit note that Script 37
      LOOO fold-specific components differ by design.
  13. Line 110 grammar: "This convergent-checks evidence" ->
      "These convergence checks".
  14. Draft note marked "internal -- remove before submission".
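
Illustration for item 9 (a minimal sketch, not Script 42's actual
code, which is not reproduced in this commit): the strict and
inclusive forms of the low-cosine rule differ only for a score
that sits exactly on the 0.837 cutoff. The function name and the
sample scores below are hypothetical.

```python
# Cutoff value quoted in item 9; everything else here is illustrative.
LOW_COSINE_CUTOFF = 0.837


def is_low_cosine(cos: float, inclusive: bool = True) -> bool:
    """Return True when a cosine score falls in the low-cosine band.

    inclusive=True  -> cos <= cutoff (the form item 9 adopts)
    inclusive=False -> cos <  cutoff (the earlier manuscript wording)
    """
    if inclusive:
        return cos <= LOW_COSINE_CUTOFF
    return cos < LOW_COSINE_CUTOFF


# Hypothetical scores just below, exactly on, and just above the cutoff.
scores = [0.8369, 0.837, 0.8371]
inclusive_flags = [is_low_cosine(s) for s in scores]
strict_flags = [is_low_cosine(s, inclusive=False) for s in scores]

# Only the boundary score changes class between the two forms.
changed = [s for s, a, b in zip(scores, inclusive_flags, strict_flags) if a != b]
print(changed)
```

The two forms agree everywhere except on the boundary itself, so
the manuscript wording only matters for scores exactly equal to
the cutoff.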

§III v4 changes (cross-reference cleanup):
  1. Line 13 cross-reference repaired: "§IV-D, §IV-F, §IV-G"
     (which are now accountant-level v4 analyses) replaced with
     accurate signature-level references (§IV-J for five-way
     counts; §IV-I for inherited inter-CPA FAR).
  2. Line 23 cross-reference repaired: "all §IV results except
     §IV-K" replaced with explicit list of v4-new vs inherited
     sub-sections.
  3. Line 109 cross-reference repaired: moderate-band capture-
     rate evidence cited as "v3.20.0 Tables IX, XI, XII, XII-B"
     (was "§IV-F", which is now Convergent Internal-Consistency
     Checks, not capture-rate).
  4. Line 131 "without recalibration" claim narrowed: §III-K's
     convergent-checks evidence is now scoped to the binary
     high-confidence rule only; the moderate-confidence band,
     style-consistency band, and document-level aggregation
     are retained by reference to v3.20.0 calibration, not
     claimed as v4.0-validated.

Open questions: 3 procedural items remain (§IV
table numbering finalisation, §IV-A-C content audit, Phase 4
prose); no methodology blockers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:03:33 +08:00