128a91433f3a9908cc9cd00d6951da92af214405
9 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
d3ddf746f4 |
Apply Phase 5 round-4 fixes from Opus round-2 N1-N4
Closes the substantive net-new findings Opus round-2 surfaced. All
fixes are structural or disclosure improvements; no empirical
content changes.
N1 — Denominator inconsistency disclosure: §IV-M.4 per-firm D2 ICCR
listing (line 325) now explains the $n = 19{,}501$ Firm C
denominator versus §IV-J Table XIX's single-firm-only $19{,}122$.
The 379 mixed-firm PDFs all resolve to Firm C under Script 45's
mode-of-firms (majority firm) tie-break — empirically Firm C is
the majority firm in every mixed-firm PDF, not a tie-break
artefact. Footnote reconciles both totals (75,233 vs 74,854).
N2 — §III-M validation table completeness: composition-decomposition
diagnostic (§III-I.4; Scripts 39b–39e) — the foundational v4
evidence cited in Abstract / §I item 4 / §VI item 1 — added as
the first row of the §III-M validation table. Updated:
- §I item 8 (Phase 4 line 57): "nine partial-evidence
diagnostics" → "ten partial-evidence diagnostics (§III-M
Table XXVII)"
- §VI item 8 (Phase 4 line 147): "nine-tool unsupervised-
validation collection (§III-M)" → "ten-tool unsupervised-
validation collection (§III-M Table XXVII)"
- Phase 4 internal draft note still says "nine-tool" but is
internal-strip-at-splice; deliberately not edited.
N3 — Table number assigned: §III-M validation table is now
Table XXVII (continues sequential numbering after §IV-M.6's
Table XXVI). Caption: "Ten-tool unsupervised-validation
collection with disclosed untested assumptions."
N4 — Cross-firm hit matrix assumption row rewritten: replaced the
"None — direct descriptive observation" understatement with the
actual dependency disclosure — same-pair joint event yields
97.0–99.96% within-firm at all four firms versus any-pair
76.7–98.8% — plus the §IV-M.4 mode-of-firms tie-break
cross-reference.
Net result: all three substantive Opus round-2 net-new findings
plus N4 closed. N5 (firm-dependent within-firm violation in §V-H)
and N6 (§IV-I stub cross-reference) deferred as low-priority
optional copy-edits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4a6f9c5c98 |
Apply Phase 5 round-3 splice-blocker fixes from codex round-8
Closes the three concrete splice blockers codex round-8 surfaced in the post-round-2 drafts, plus the binary-collapse terminology residue. No empirical changes. - Abstract trimmed 261 -> 247 words (3 under IEEE Access <=250 target). Cut "technically trivial and visually invisible," (S1 motivational redundancy) and the within-firm-rate parenthetical "(Firm A 98.8%; Firms B/C/D 76.7-83.7%)" plus "between" connector; preserved the corrected 77-99% any-pair headline so the M3 substance survives. - §IV-J Table XV sample-size footnote (line 177) corrected: round-2 misclassified §IV-M.5 as descriptor-complete n=150,442; Script 44 / Tables XXIV-XXV actually use vector-complete n=150,453, same as §IV-M.2 Table XXI (Script 40b) and §IV-M.3 Table XXII (Script 43). New footnote distinguishes descriptor-complete (§IV-D through §IV-J) from vector/pair-recomputed (§IV-M.2/M.3/M.5; Scripts 40b/43/44). - §IV-I (line 161) stale cross-reference: "§IV-M Table XVI" was the K=3 firm cross-tab (descriptive), not the v4-new ICCR calibration. Replaced with "§IV-M Tables XXI-XXVI" — the full ICCR calibration block. Pre-existing error exposed by the round-2 cascade. - §III line 131 + §IV Table XI line 104 binary-collapse label: "replicated vs not-replicated" -> "replication-dominated vs less-replication-dominated" for consistency with the K=3 descriptor-position framing. "Replicated class" preserved where it refers to byte-identical positive-anchor ground truth (§III-K.4, §IV-H lines 143/153/155). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b884d39544 |
Apply Phase 5 round-2 fixes from Opus M1-M4 + Gemini Table XV footnote
Addresses round-1 findings from all three AI reviewers in a single pass. Substantive empirical content unchanged; fixes are factual corrections, terminology consistency, and table-numbering hygiene. Opus M3 (Abstract-level factual misstatement): "98-100% of inter-CPA collisions within source firm" repeated in Abstract / §I body / §I item 6 / §V-C / §V-G limitation 2 / §VI item 4 / §VI Future Work conflated the same-pair joint rate (97.0-99.96%) with the any-pair deployed rule rate (76.7-98.8% across Firms A/B/C/D — Firm A 98.8, B 76.7, C 83.7, D 77.4 from Table XXV). Replaced with the actual any-pair range and explicit same-pair sub-range. Removed §V-C's "regardless of which Big-4 firm is the source" — within-firm concentration is firm-dependent. Opus M1 (§IV K=3 mechanism-label reversion): §IV silently regressed to v3.x "C1 hand-leaning / C2 mixed / C3 replicated" naming that §III-J line 90 explicitly retires post-composition-decomposition. Replaced in Tables IX/X/XIV/XVI/XVII column headers and §IV-F / §IV-H / §IV-J / §IV-K prose. New convention matches §III-J: - C1 (hand-leaning) -> C1 (low-cos / high-dHash) - C2 (mixed) -> C2 (central) - C3 (replicated) -> C3 (high-cos / low-dHash) - "hand-leaning rate" -> "less-replication-dominated rate" "Replicated class" retained where it refers to byte-identical ground truth (line 143/153 — actual byte-level reuse, not K=3 mechanism inference). Opus M4 (§V duplicate G heading): Phase 4 prose §V had "G. Pixel-Identity..." at line 105 and "G. Limitations" at line 109. Renamed second heading to "H. Limitations". Opus M2 + Gemini Table XV-B (table-numbering cascade): Renamed Table XV-B to Table XIX, then cascaded XIX -> XX -> ... -> XXV -> XXVI to keep sequential integer numbering. Cross-reference at §IV-J also updated. No cross-refs to these tables exist outside §IV (verified by grep against §III + Phase 4 prose). Gemini sample-size footnote (Table XV): expanded the source note to explicitly explain the 150,442 (descriptor-complete) vs 150,453 (vector-complete) distinction across §IV sub-sections and point back to §III-G sample-size reconciliation. §III prose softening (lines 99, 283): "nearly all (98%)" framing that read the Firm A rate as representative of all four Big-4 firms replaced with the per-firm any-pair / same-pair breakdown. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9604b273c0 |
Apply codex round-7 Phase 5 copy-edit fixes + refresh STATE.md
Mechanical copy-edit closing the OPEN/PARTIAL items from paper/codex_review_gpt55_v4_round7.md; substantive empirical content unchanged. Manuscript-splice items (strip internal draft notes, update stale abstract-count note) deferred to final splice. - Phase 4 prose §V-G + §III-K methodology: "candidate classifiers" -> "candidate checks" (closes round-7 m13 + Spot-check 3 wording leak) - Phase 4 prose §II: remove placeholder caveat sentence at the LOOO paragraph (closes round-7 M6 + A4) - References v3: add [42] Stone 1974, [43] Geisser 1975, [44] Vehtari et al. 2017 (44 entries; was 41) — backs the §II LOOO addition - Round-7 review: add row-count clarification note (11 Major / 15 Minor labelled rows vs. the prompt's 9/12 tally) - STATE.md: refresh from stale Phase-2 snapshot to current Phase 5 status — Phases 1-4 complete; codex rounds 1-7 closed at Minor Revision; pending Gemini + Opus rounds + round-2/3 convergence Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b33e20d479 |
Rewrite Phase 4 prose v3: Abstract / §I / §V / §VI to match §III v7
Major Phase 4 prose update aligning narrative with the §III v7
anchor-based ICCR framework (codex rounds 29-34):
- Abstract (247 words, under 250 limit): replaced K=3 mixture +
natural-threshold framing with composition decomposition +
multi-level ICCR + firm heterogeneity. Positioning as
specificity-proxy-anchored screening framework.
- §I Introduction:
* Methodological-design paragraph rewritten (no natural threshold;
multi-level reporting; per-firm stratification; unsupervised
disclosure)
* Two new paragraphs documenting composition decomposition
overturning distributional path, and anchor-based three-unit
ICCR calibration
* Firm heterogeneity + within-firm collision concentration as
central findings
* Contribution list rewritten (8 items): composition decomposition
disproves natural threshold (NEW #4); multi-level ICCR
calibration (NEW #5); firm heterogeneity quantification (NEW #6);
K=3 demoted to descriptive partition (#7); multi-tool validation
ceiling positioning (#8)
- §V Discussion:
* §V-B retitled "composition-driven multimodality"; 2x2 factorial
decomposition reported
* §V-C Firm A reframed: position contrast + within-firm collision
pattern, not "templated-end calibration anchor"
* §V-D K=2/K=3 reframed as descriptive firm-compositional
partitions (no "mechanism boundary" language)
* §V-E three-score convergence reinterpreted as descriptor-position
ranking, not hand-leaning mechanism ranking
* §V-F (new title) Anchor-based multi-level calibration with all
three units of analysis
* §V-G expanded to 9 v4-specific limitations (no signature-level
ground truth; assumption-violation; scope; conservative-subset;
inherited rule components; deployed-rate excess not TPR; A1
stipulation; K=3 composition sensitivity; no partner-level
mechanism attribution) plus 5 inherited limitations
- §VI Conclusion: 8-point contribution list mirroring §I; 4 future
work directions including within-firm collision-mechanism
disambiguation and audit-quality companion analysis.
- Header draft-note updated to v3 (post codex rounds 26-34);
Phase 4 v2 changelog moved to CHANGELOG.md placeholder.
Companion to §III v7 commit
|
||
|
|
6db5d635f5 |
Apply codex round-27 narrow fixes; Phase 4 prose v2.1
Codex round 27 returned Minor Revision: 10/11 Major + 14/15 Minor
CLOSED. Two narrow residuals applied:
1. §V-F line 99 'all three candidate classifiers' replaced with
'all three candidate checks' with explicit enumeration
(the inherited box rule, the K=3 hard label, and the
prevalence-calibrated reverse-anchor cut). Keeps the K=3
hard label explicitly descriptive rather than operational.
2. Close-out checklist's stale '~235 words' abstract claim
updated to the verified 243-244 word count.
Deferred to manuscript-assembly time (not blockers for Phase 5
cross-AI peer review):
- §II [42]-[44] citation finalisation (placeholders are
transparent in the current draft state).
- Internal draft notes and close-out checklists (these
explicitly help reviewers track the convergence cycle).
- Manuscript-level lint pass (last step before submission
packaging).
Closure summary across 7 codex rounds (21-27):
- Empirical: ALL Major + Minor findings CLOSED on the
§III/§IV/Phase 4 substantive content.
- Packaging: 2 OPEN items (§II citations, internal notes)
intentionally deferred to manuscript-assembly time.
Phase 5 readiness: substantively YES. The §III v6 + §IV v3.2 +
Phase 4 v2.1 is converged for cross-AI peer review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EOF
|
||
|
|
918d55154a |
Abstract trim: 253 -> 245 words (within IEEE Access 250-word target)
Six minor edits to reduce word count: - 'a YOLOv11 detector localizes signatures' -> 'YOLOv11 localizes signatures' - 'filed in Taiwan over 2013-2023' -> 'Taiwan audit reports (2013-2023)' - 'statistical analysis is scoped to the Big-4 sub-corpus (437 CPAs, 150,442 signatures)' -> 'analysis is scoped to the Big-4 sub-corpus (437 CPAs; 150,442 signatures)' - 'Wilson 95% upper bound 1.45%' -> 'Wilson upper bound 1.45%' - 'cross-scope check (n = 686) preserves the K=3 + box-rule Spearman convergence with drift 0.007' -> 'check (n = 686) preserves the K=3 + box-rule Spearman convergence (drift 0.007)' All numerical anchors preserved. Phase 4 prose v2 now within IEEE Access 250-word abstract limit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> EOF |
||
|
|
10c82fd446 |
Apply codex round-26 corrections to Phase 4 prose v2
Codex round 26 returned Major Revision on Phase 4 v1: 9 Major
findings + 12 Minor + reviewer-attack vulnerabilities. v2
applies all flagged corrections.
Abstract changes:
- "Three independent feature-derived scores" -> "Three
feature-derived scores ... not statistically independent
because all three are functions of the same descriptor
pair". Names the operational output as the inherited
five-way classifier.
- Trimmed from 277 to ~245 words to stay within IEEE Access
250-word limit while keeping all numerical anchors.
§I Introduction:
- Line 29 cross-ref §III-D -> §III-G through §III-J
(§III-D was wrong; the methodology lives in §III-G/I/J).
- Big-4 scope claim narrowed: "neither any single firm pooled
alone nor the broader full-dataset variant rejects" -> "none
of the narrower comparison scopes tested in Script 32
rejects" with explicit enumeration (Firm A pooled alone;
Firms B+C+D pooled; all non-Firm-A pooled).
- "Three independent feature-derived scores" -> "Three
feature-derived scores ... not statistically independent".
- Contribution 4 "not at narrower scopes" -> "not in the
narrower comparison scopes tested".
- Contribution 8 "demonstrating pipeline reproducibility at
multiple scopes" -> narrowed to "K=3 + box-rule
rank-convergence reproduces at full n=686; does not
re-validate operational thresholds / LOOO / five-way / pixel
identity at the broader scope".
- "external validation" softened to "annotation-free
validation" in methodological-safeguards paragraph.
- "(5)–(8)" pipeline stage list updated with corrected
section references.
- "Published box rule" -> "inherited Paper A box rule".
- Added Big-4 pixel-identity per-firm breakdown (145/8/107/2)
in §I body for completeness.
§II Related Work:
- Replaced placeholder with explicit defer-to-master statement:
v3.20.0 §II is inherited substantively unchanged in the master
manuscript; only the LOOO addition is reproduced here.
- "[add citation]" replaced with placeholder references
[42] Stone 1974, [43] Geisser 1975, [44] Vehtari et al. 2017
explicitly marked as draft references to be finalised at
copy-edit time.
- LOOO addition reframed: composition-sensitivity band on the
mixture characterisation, not on the operational classifier.
§V Discussion:
- §V-B "v4.0 inherits and confirms" softened to "v4.0 inherits
this signature-level reading and remains consistent with it
(no signature-level diagnostic was newly run in v4)".
- §V-B "some CPAs are templated, some are hand-leaning, some
are mixed" rewritten as component-membership wording: "some
CPAs' observed signatures place their per-CPA means in the
templated/mixed/hand-leaning region of the descriptor plane".
- §V-B within-CPA unimodality explanation softened from
"produces" to "can be jointly consistent" with explicit
§III-G cross-ref.
- §V-C Firm A byte-level provenance: 145 pixel-identical
signatures verified in Script 40; 50 partners / 35 cross-year
explicitly inherited from v3 / Script 28 not regenerated in
v4 spikes.
- §V-C "anchors §IV-H's positive-anchor miss-rate" -> "is the
largest of the four Big-4 subsets, with full anchor pooling
Firm A 145, Firm B 8, Firm C 107, Firm D 2".
- §V-E "published box rule" -> "inherited Paper A box rule";
"produce the same per-CPA ranking" -> "broadly concordant
rankings, with residual non-Firm-A disagreement".
- §V-G limitations expanded from 7 to 12 items: restored the
5 v3.20.0 inherited limitations (transferred ImageNet
features, HSV stamp-removal artifacts, longitudinal scan
confounds, source-exemplar misattribution, legal
interpretation).
- §V-G scope limitation: removed unsupported "narrower or
broader scopes" full-dataset dip-test claim.
§VI Conclusion:
- Names operational output: "inherited Paper A five-way
per-signature classifier with worst-case document-level
aggregation".
- "Cross-scope pipeline reproducibility" -> "K=3 + box-rule
rank-convergence reproduces at full n=686; does not
re-validate operational thresholds, LOOO, five-way classifier,
or pixel-identity at the broader scope".
- Future-work direction 3 explicitly qualifies the within-Big-4
contrast as "accountant-level descriptive features of the K=3
mixture, not validated mechanism-level claims and not
currently linked to audit-quality outcomes".
Round 26 closure post-v2:
- All 9 Major findings: CLOSED in v2 prose body.
- All 12 Minor findings: CLOSED in v2 prose body.
- Phase 5 readiness: should now move from Partial to Yes
pending codex round 27 verification.
Provenance: codex round-26 confirmed 17/17 numerical claims in
Phase 4 v1 (only finding #5, the scope-test wording, was an
overclaim rather than a numerical error). v2 keeps all confirmed
numerics and narrows only the scope-test wording.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
e36c49d2d8 |
Add Phase 4 prose draft v1 (Abstract + I + II + V + VI)
Phase 4 first-pass draft replacing the v3.20.0 Abstract,
§I Introduction, §II Related Work, §V Discussion, and §VI
Conclusion blocks with the Big-4 reframed v4.0 prose. Single
consolidated file at paper/v4/paper_a_prose_v4_phase4.md.
Structure:
Abstract (~235 words, IEEE Access target <= 250)
§I Introduction (8-item contributions list updated for v4)
§II Related Work (mostly inherited; LOOO citation added)
§V Discussion (7 sub-sections: A-G covering distinct-problem
framing, accountant-level multimodality,
Firm A as templated-end case study, K=2
firm-mass conflation, K=3 reproducible shape,
three-score internal-consistency, pixel-
identity + inter-CPA validation, limitations)
§VI Conclusion + Future Work (4 future directions)
Key reframing decisions baked into the prose:
- Abstract leads with Big-4 scope + dip-test multimodality +
K=3 reproducibility + three-score convergence + 0% miss
rate + full-dataset robustness.
- §I positions the Big-4 sub-corpus scope as the
methodologically privileged calibration unit ("smallest
tested scope at which a finite-mixture model is
statistically supportable").
- §I-Contribution-4: Big-4 scope as substantive methodological
finding (was v3.x "percentile-anchored operational
threshold").
- §I-Contribution-5: K=3 mixture as descriptive (was v3.x
"distributional characterisation" framing).
- §I-Contribution-6: three-score convergent internal-
consistency (NEW in v4).
- §I-Contribution-8: full-dataset robustness as light
secondary scope (NEW in v4).
- §V-D: explicit "K=2 is firm-mass driven; K=3 is
reproducible in shape" framing — preempts the LOOO
reviewer attack vector codex round 23 first flagged.
- §V-G Limitations: seven explicit limitations including no
signature-level hand-signed ground truth, pixel-identity
conservative subset, MC band not separately v4-validated.
- §VI Future Work: four directions including a Paper B
placeholder for audit-quality companion analysis.
The technical §III v6 + §IV v3.2 are the foundation; this Phase
4 draft aligns the narrative with the codex-converged
methodology and results.
6 close-out items flagged at end of file (word-count check,
contribution count, LOOO citation, limitations grouping, Paper B
cross-ref, draft note stripping).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|