Commit Graph

9 Commits

Author SHA1 Message Date
gbanyan d3ddf746f4 Apply Phase 5 round-4 fixes from Opus round-2 N1-N4
Closes the substantive net-new findings Opus round-2 surfaced. All
fixes are structural or disclosure improvements; no empirical
content changes.

N1 — Denominator inconsistency disclosure: §IV-M.4 per-firm D2 ICCR
   listing (line 325) now explains the $n = 19{,}501$ Firm C
   denominator versus §IV-J Table XIX's single-firm-only $19{,}122$.
   The 379 mixed-firm PDFs all resolve to Firm C under Script 45's
   mode-of-firms (majority firm) tie-break — empirically Firm C is
   the majority firm in every mixed-firm PDF, not a tie-break
   artefact. Footnote reconciles both totals (75,233 vs 74,854).

N2 — §III-M validation table completeness: composition-decomposition
   diagnostic (§III-I.4; Scripts 39b–39e) — the foundational v4
   evidence cited in Abstract / §I item 4 / §VI item 1 — added as
   the first row of the §III-M validation table. Updated:
   - §I item 8 (Phase 4 line 57): "nine partial-evidence
     diagnostics" → "ten partial-evidence diagnostics (§III-M
     Table XXVII)"
   - §VI item 8 (Phase 4 line 147): "nine-tool unsupervised-
     validation collection (§III-M)" → "ten-tool unsupervised-
     validation collection (§III-M Table XXVII)"
   - Phase 4 internal draft note still says "nine-tool" but is
     internal-strip-at-splice; deliberately not edited.

N3 — Table number assigned: §III-M validation table is now
   Table XXVII (continues sequential numbering after §IV-M.6's
   Table XXVI). Caption: "Ten-tool unsupervised-validation
   collection with disclosed untested assumptions."

N4 — Cross-firm hit matrix assumption row rewritten: replaced the
   "None — direct descriptive observation" understatement with the
   actual dependency disclosure — same-pair joint event yields
   97.0–99.96% within-firm at all four firms versus any-pair
   76.7–98.8% — plus the §IV-M.4 mode-of-firms tie-break
   cross-reference.

Net result: all three substantive Opus round-2 net-new findings
plus N4 closed. N5 (firm-dependent within-firm violation in §V-H)
and N6 (§IV-I stub cross-reference) deferred as low-priority
optional copy-edits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 17:49:39 +08:00
gbanyan 4a6f9c5c98 Apply Phase 5 round-3 splice-blocker fixes from codex round-8
Closes the three concrete splice blockers codex round-8 surfaced
in the post-round-2 drafts, plus the binary-collapse terminology
residue. No empirical changes.

- Abstract trimmed 261 -> 247 words (3 under IEEE Access <=250
  target). Cut "technically trivial and visually invisible,"
  (S1 motivational redundancy) and the within-firm-rate
  parenthetical "(Firm A 98.8%; Firms B/C/D 76.7-83.7%)" plus
  "between" connector; preserved the corrected 77-99% any-pair
  headline so the M3 substance survives.

- §IV-J Table XV sample-size footnote (line 177) corrected:
  round-2 misclassified §IV-M.5 as descriptor-complete n=150,442;
  Script 44 / Tables XXIV-XXV actually use vector-complete
  n=150,453, same as §IV-M.2 Table XXI (Script 40b) and §IV-M.3
  Table XXII (Script 43). New footnote distinguishes
  descriptor-complete (§IV-D through §IV-J) from
  vector/pair-recomputed (§IV-M.2/M.3/M.5; Scripts 40b/43/44).

- §IV-I (line 161) stale cross-reference: "§IV-M Table XVI" was
  the K=3 firm cross-tab (descriptive), not the v4-new ICCR
  calibration. Replaced with "§IV-M Tables XXI-XXVI" — the full
  ICCR calibration block. Pre-existing error exposed by the
  round-2 cascade.

- §III line 131 + §IV Table XI line 104 binary-collapse label:
  "replicated vs not-replicated" -> "replication-dominated vs
  less-replication-dominated" for consistency with the K=3
  descriptor-position framing. "Replicated class" preserved
  where it refers to byte-identical positive-anchor ground
  truth (§III-K.4, §IV-H lines 143/153/155).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 17:17:30 +08:00
gbanyan b884d39544 Apply Phase 5 round-2 fixes from Opus M1-M4 + Gemini Table XV footnote
Addresses round-1 findings from all three AI reviewers in a single
pass. Substantive empirical content unchanged; fixes are factual
corrections, terminology consistency, and table-numbering hygiene.

Opus M3 (Abstract-level factual misstatement): "98-100% of inter-CPA
collisions within source firm" repeated in Abstract / §I body / §I
item 6 / §V-C / §V-G limitation 2 / §VI item 4 / §VI Future Work
conflated the same-pair joint rate (97.0-99.96%) with the any-pair
deployed rule rate (76.7-98.8% across Firms A/B/C/D — Firm A 98.8,
B 76.7, C 83.7, D 77.4 from Table XXV). Replaced with the actual
any-pair range and explicit same-pair sub-range. Removed §V-C's
"regardless of which Big-4 firm is the source" — within-firm
concentration is firm-dependent.

Opus M1 (§IV K=3 mechanism-label reversion): §IV silently regressed
to v3.x "C1 hand-leaning / C2 mixed / C3 replicated" naming that
§III-J line 90 explicitly retires post-composition-decomposition.
Replaced in Tables IX/X/XIV/XVI/XVII column headers and §IV-F /
§IV-H / §IV-J / §IV-K prose. New convention matches §III-J:
- C1 (hand-leaning) -> C1 (low-cos / high-dHash)
- C2 (mixed) -> C2 (central)
- C3 (replicated) -> C3 (high-cos / low-dHash)
- "hand-leaning rate" -> "less-replication-dominated rate"
"Replicated class" retained where it refers to byte-identical
ground truth (line 143/153 — actual byte-level reuse, not K=3
mechanism inference).

Opus M4 (§V duplicate G heading): Phase 4 prose §V had "G.
Pixel-Identity..." at line 105 and "G. Limitations" at line 109.
Renamed second heading to "H. Limitations".

Opus M2 + Gemini Table XV-B (table-numbering cascade): Renamed
Table XV-B to Table XIX, then cascaded XIX -> XX -> ... -> XXV ->
XXVI to keep sequential integer numbering. Cross-reference at
§IV-J also updated. No cross-refs to these tables exist outside §IV
(verified by grep against §III + Phase 4 prose).

Gemini sample-size footnote (Table XV): expanded the source note
to explicitly explain the 150,442 (descriptor-complete) vs 150,453
(vector-complete) distinction across §IV sub-sections and point
back to §III-G sample-size reconciliation.

§III prose softening (lines 99, 283): "nearly all (98%)" framing
that read the Firm A rate as representative of all four Big-4 firms
replaced with the per-firm any-pair / same-pair breakdown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 16:57:19 +08:00
gbanyan 9604b273c0 Apply codex round-7 Phase 5 copy-edit fixes + refresh STATE.md
Mechanical copy-edit closing the OPEN/PARTIAL items from
paper/codex_review_gpt55_v4_round7.md; substantive empirical
content unchanged. Manuscript-splice items (strip internal draft
notes, update stale abstract-count note) deferred to final splice.

- Phase 4 prose §V-G + §III-K methodology: "candidate classifiers"
  -> "candidate checks" (closes round-7 m13 + Spot-check 3 wording leak)
- Phase 4 prose §II: remove placeholder caveat sentence at the LOOO
  paragraph (closes round-7 M6 + A4)
- References v3: add [42] Stone 1974, [43] Geisser 1975, [44] Vehtari
  et al. 2017 (44 entries; was 41) — backs the §II LOOO addition
- Round-7 review: add row-count clarification note (11 Major / 15
  Minor labelled rows vs. the prompt's 9/12 tally)
- STATE.md: refresh from stale Phase-2 snapshot to current Phase 5
  status — Phases 1-4 complete; codex rounds 1-7 closed at Minor
  Revision; pending Gemini + Opus rounds + round-2/3 convergence

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 14:21:59 +08:00
gbanyan b33e20d479 Rewrite Phase 4 prose v3: Abstract / §I / §V / §VI to match §III v7
Major Phase 4 prose update aligning narrative with the §III v7
anchor-based ICCR framework (codex rounds 29-34):

- Abstract (247 words, under 250 limit): replaced K=3 mixture +
  natural-threshold framing with composition decomposition +
  multi-level ICCR + firm heterogeneity. Positioning as
  specificity-proxy-anchored screening framework.

- §I Introduction:
  * Methodological-design paragraph rewritten (no natural threshold;
    multi-level reporting; per-firm stratification; unsupervised
    disclosure)
  * Two new paragraphs documenting composition decomposition
    overturning distributional path, and anchor-based three-unit
    ICCR calibration
  * Firm heterogeneity + within-firm collision concentration as
    central findings
  * Contribution list rewritten (8 items): composition decomposition
    disproves natural threshold (NEW #4); multi-level ICCR
    calibration (NEW #5); firm heterogeneity quantification (NEW #6);
    K=3 demoted to descriptive partition (#7); multi-tool validation
    ceiling positioning (#8)

- §V Discussion:
  * §V-B retitled "composition-driven multimodality"; 2x2 factorial
    decomposition reported
  * §V-C Firm A reframed: position contrast + within-firm collision
    pattern, not "templated-end calibration anchor"
  * §V-D K=2/K=3 reframed as descriptive firm-compositional
    partitions (no "mechanism boundary" language)
  * §V-E three-score convergence reinterpreted as descriptor-position
    ranking, not hand-leaning mechanism ranking
  * §V-F (new title) Anchor-based multi-level calibration with all
    three units of analysis
  * §V-G expanded to 9 v4-specific limitations (no signature-level
    ground truth; assumption-violation; scope; conservative-subset;
    inherited rule components; deployed-rate excess not TPR; A1
    stipulation; K=3 composition sensitivity; no partner-level
    mechanism attribution) plus 5 inherited limitations

- §VI Conclusion: 8-point contribution list mirroring §I; 4 future
  work directions including within-firm collision-mechanism
  disambiguation and audit-quality companion analysis.

- Header draft-note updated to v3 (post codex rounds 26-34);
  Phase 4 v2 changelog moved to CHANGELOG.md placeholder.

Companion to §III v7 commit 723a3f6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:10:04 +08:00
gbanyan 6db5d635f5 Apply codex round-27 narrow fixes; Phase 4 prose v2.1
Codex round 27 returned Minor Revision: 10/11 Major + 14/15 Minor
CLOSED. Two narrow residuals applied:

  1. §V-F line 99 'all three candidate classifiers' replaced with
     'all three candidate checks' with explicit enumeration
     (the inherited box rule, the K=3 hard label, and the
     prevalence-calibrated reverse-anchor cut). Keeps the K=3
     hard label explicitly descriptive rather than operational.

  2. Close-out checklist's stale '~235 words' abstract claim
     updated to the verified 243-244 word count.

Deferred to manuscript-assembly time (not blockers for Phase 5
cross-AI peer review):
  - §II [42]-[44] citation finalisation (placeholders are
    transparent in the current draft state).
  - Internal draft notes and close-out checklists (these
    explicitly help reviewers track the convergence cycle).
  - Manuscript-level lint pass (last step before submission
    packaging).

Closure summary across 7 codex rounds (21-27):
  - Empirical: ALL Major + Minor findings CLOSED on the
    §III/§IV/Phase 4 substantive content.
  - Packaging: 2 OPEN items (§II citations, internal notes)
    intentionally deferred to manuscript-assembly time.

Phase 5 readiness: substantively YES. The §III v6 + §IV v3.2 +
Phase 4 v2.1 is converged for cross-AI peer review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EOF
2026-05-13 00:15:35 +08:00
gbanyan 918d55154a Abstract trim: 253 -> 245 words (within IEEE Access 250-word target)
Six minor edits to reduce word count:
- 'a YOLOv11 detector localizes signatures' -> 'YOLOv11 localizes
  signatures'
- 'filed in Taiwan over 2013-2023' -> 'Taiwan audit reports
  (2013-2023)'
- 'statistical analysis is scoped to the Big-4 sub-corpus
  (437 CPAs, 150,442 signatures)' -> 'analysis is scoped to the
  Big-4 sub-corpus (437 CPAs; 150,442 signatures)'
- 'Wilson 95% upper bound 1.45%' -> 'Wilson upper bound 1.45%'
- 'cross-scope check (n = 686) preserves the K=3 + box-rule
  Spearman convergence with drift 0.007' -> 'check (n = 686)
  preserves the K=3 + box-rule Spearman convergence (drift
  0.007)'

All numerical anchors preserved. Phase 4 prose v2 now within
IEEE Access 250-word abstract limit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EOF
2026-05-12 23:57:01 +08:00
gbanyan 10c82fd446 Apply codex round-26 corrections to Phase 4 prose v2
Codex round 26 returned Major Revision on Phase 4 v1: 9 Major
findings + 12 Minor + reviewer-attack vulnerabilities. v2
applies all flagged corrections.

Abstract changes:
  - "Three independent feature-derived scores" -> "Three
    feature-derived scores ... not statistically independent
    because all three are functions of the same descriptor
    pair". Names the operational output as the inherited
    five-way classifier.
  - Trimmed from 277 to ~245 words to stay within IEEE Access
    250-word limit while keeping all numerical anchors.

§I Introduction:
  - Line 29 cross-ref §III-D -> §III-G through §III-J
    (§III-D was wrong; the methodology lives in §III-G/I/J).
  - Big-4 scope claim narrowed: "neither any single firm pooled
    alone nor the broader full-dataset variant rejects" -> "none
    of the narrower comparison scopes tested in Script 32
    rejects" with explicit enumeration (Firm A pooled alone;
    Firms B+C+D pooled; all non-Firm-A pooled).
  - "Three independent feature-derived scores" -> "Three
    feature-derived scores ... not statistically independent".
  - Contribution 4 "not at narrower scopes" -> "not in the
    narrower comparison scopes tested".
  - Contribution 8 "demonstrating pipeline reproducibility at
    multiple scopes" -> narrowed to "K=3 + box-rule
    rank-convergence reproduces at full n=686; does not
    re-validate operational thresholds / LOOO / five-way / pixel
    identity at the broader scope".
  - "external validation" softened to "annotation-free
    validation" in methodological-safeguards paragraph.
  - "(5)–(8)" pipeline stage list updated with corrected
    section references.
  - "Published box rule" -> "inherited Paper A box rule".
  - Added Big-4 pixel-identity per-firm breakdown (145/8/107/2)
    in §I body for completeness.

§II Related Work:
  - Replaced placeholder with explicit defer-to-master statement:
    v3.20.0 §II is inherited substantively unchanged in the master
    manuscript; only the LOOO addition is reproduced here.
  - "[add citation]" replaced with placeholder references
    [42] Stone 1974, [43] Geisser 1975, [44] Vehtari et al. 2017
    explicitly marked as draft references to be finalised at
    copy-edit time.
  - LOOO addition reframed: composition-sensitivity band on the
    mixture characterisation, not on the operational classifier.

§V Discussion:
  - §V-B "v4.0 inherits and confirms" softened to "v4.0 inherits
    this signature-level reading and remains consistent with it
    (no signature-level diagnostic was newly run in v4)".
  - §V-B "some CPAs are templated, some are hand-leaning, some
    are mixed" rewritten as component-membership wording: "some
    CPAs' observed signatures place their per-CPA means in the
    templated/mixed/hand-leaning region of the descriptor plane".
  - §V-B within-CPA unimodality explanation softened from
    "produces" to "can be jointly consistent" with explicit
    §III-G cross-ref.
  - §V-C Firm A byte-level provenance: 145 pixel-identical
    signatures verified in Script 40; 50 partners / 35 cross-year
    explicitly inherited from v3 / Script 28 not regenerated in
    v4 spikes.
  - §V-C "anchors §IV-H's positive-anchor miss-rate" -> "is the
    largest of the four Big-4 subsets, with full anchor pooling
    Firm A 145, Firm B 8, Firm C 107, Firm D 2".
  - §V-E "published box rule" -> "inherited Paper A box rule";
    "produce the same per-CPA ranking" -> "broadly concordant
    rankings, with residual non-Firm-A disagreement".
  - §V-G limitations expanded from 7 to 12 items: restored the
    5 v3.20.0 inherited limitations (transferred ImageNet
    features, HSV stamp-removal artifacts, longitudinal scan
    confounds, source-exemplar misattribution, legal
    interpretation).
  - §V-G scope limitation: removed unsupported "narrower or
    broader scopes" full-dataset dip-test claim.

§VI Conclusion:
  - Names operational output: "inherited Paper A five-way
    per-signature classifier with worst-case document-level
    aggregation".
  - "Cross-scope pipeline reproducibility" -> "K=3 + box-rule
    rank-convergence reproduces at full n=686; does not
    re-validate operational thresholds, LOOO, five-way classifier,
    or pixel-identity at the broader scope".
  - Future-work direction 3 explicitly qualifies the within-Big-4
    contrast as "accountant-level descriptive features of the K=3
    mixture, not validated mechanism-level claims and not
    currently linked to audit-quality outcomes".

Round 26 closure post-v2:
  - All 9 Major findings: CLOSED in v2 prose body.
  - All 12 Minor findings: CLOSED in v2 prose body.
  - Phase 5 readiness: should now move from Partial to Yes
    pending codex round 27 verification.

Provenance: codex round-26 confirmed 17/17 numerical claims in
Phase 4 v1 (only finding #5, the scope-test wording, was an
overclaim rather than a numerical error). v2 keeps all confirmed
numerics and narrows only the scope-test wording.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 23:50:09 +08:00
gbanyan e36c49d2d8 Add Phase 4 prose draft v1 (Abstract + I + II + V + VI)
Phase 4 first-pass draft replacing the v3.20.0 Abstract,
§I Introduction, §II Related Work, §V Discussion, and §VI
Conclusion blocks with the Big-4 reframed v4.0 prose. Single
consolidated file at paper/v4/paper_a_prose_v4_phase4.md.

Structure:
  Abstract  (~235 words, IEEE Access target <= 250)
  §I Introduction  (8-item contributions list updated for v4)
  §II Related Work  (mostly inherited; LOOO citation added)
  §V Discussion  (7 sub-sections: A-G covering distinct-problem
                  framing, accountant-level multimodality,
                  Firm A as templated-end case study, K=2
                  firm-mass conflation, K=3 reproducible shape,
                  three-score internal-consistency, pixel-
                  identity + inter-CPA validation, limitations)
  §VI Conclusion + Future Work  (4 future directions)

Key reframing decisions baked into the prose:
  - Abstract leads with Big-4 scope + dip-test multimodality +
    K=3 reproducibility + three-score convergence + 0% miss
    rate + full-dataset robustness.
  - §I positions the Big-4 sub-corpus scope as the
    methodologically privileged calibration unit ("smallest
    tested scope at which a finite-mixture model is
    statistically supportable").
  - §I-Contribution-4: Big-4 scope as substantive methodological
    finding (was v3.x "percentile-anchored operational
    threshold").
  - §I-Contribution-5: K=3 mixture as descriptive (was v3.x
    "distributional characterisation" framing).
  - §I-Contribution-6: three-score convergent internal-
    consistency (NEW in v4).
  - §I-Contribution-8: full-dataset robustness as light
    secondary scope (NEW in v4).
  - §V-D: explicit "K=2 is firm-mass driven; K=3 is
    reproducible in shape" framing — preempts the LOOO
    reviewer attack vector codex round 23 first flagged.
  - §V-G Limitations: seven explicit limitations including no
    signature-level hand-signed ground truth, pixel-identity
    conservative subset, MC band not separately v4-validated.
  - §VI Future Work: four directions including a Paper B
    placeholder for audit-quality companion analysis.

The technical §III v6 + §IV v3.2 are the foundation; this Phase
4 draft aligns the narrative with the codex-converged
methodology and results.

6 close-out items flagged at end of file (word-count check,
contribution count, LOOO citation, limitations grouping, Paper B
cross-ref, draft note stripping).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 22:46:19 +08:00