Codex (gpt-5.4) second-round review recommended 'minor revision'. This
commit addresses all issues flagged in that review.
## Structural fixes
- dHash calibration inconsistency (codex #1, most important):
Clarified in Section III-L that the <=5 and <=15 dHash cutoffs come
from the whole-sample Firm A cosine-conditional dHash distribution
(median=5, P95=15), not from the calibration-fold independent-minimum
dHash distribution (median=2, P95=9), which we report elsewhere as
descriptive anchors. Added an explicit note on the two dHash
conventions and how they relate.
- Section IV-H framing (codex #2):
Renamed "Firm A Benchmark Validation: Threshold-Independent Evidence"
to "Additional Firm A Benchmark Validation" and clarified in the
section intro that H.1 uses a fixed 0.95 cutoff, H.2 is fully
threshold-free, H.3 uses the calibrated classifier. H.3's concluding
sentence now says "the substantive evidence lies in the cross-firm
gap" rather than claiming the test is threshold-free.
- Table XVI 93,979 typo fixed (codex #3):
Corrected to 84,354 total (83,970 same-firm + 384 mixed-firm).
- Held-out Firm A denominator 124+54=178 vs 180 (codex #4):
Added explicit note that 2 CPAs were excluded due to disambiguation
ties in the CPA registry.
- Table VIII duplication (codex #5):
Removed the duplicate accountant-level-only Table VIII comment; the
comprehensive cross-level Table VIII subsumes it. Text now says
"accountant-level rows of Table VIII (below)".
- Anonymization broken in Tables XIV-XVI (codex #6):
Replaced "Deloitte"/"KPMG"/"PwC"/"EY" with "Firm A"/"Firm B"/"Firm C"/
"Firm D" across Tables XIV, XV, XVI. Table and caption language
updated accordingly.
- Table X unit mismatch (codex #7):
Dropped precision, recall, F1 columns. Table now reports FAR
(against the inter-CPA negative anchor) with Wilson 95% CIs and FRR
(against the byte-identical positive anchor). III-K and IV-G.1 text
updated to justify the change.
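
For reference, a minimal sketch of a Wilson 95% score interval of the kind
now reported alongside FAR in Table X. The function name and the example
counts are hypothetical; they are not taken from the paper's code or data.

```python
import math


def wilson_interval(errors: int, trials: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion errors/trials.

    z defaults to the two-sided 95% normal quantile.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= errors <= trials:
        raise ValueError("errors must lie in [0, trials]")
    p = errors / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point round-off at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: 0 false accepts in 100 negative-anchor pairs.
far_lo, far_hi = wilson_interval(0, 100)
```

Unlike the normal-approximation (Wald) interval, the Wilson interval keeps a
non-zero upper bound when zero errors are observed, which matters for
low FAR values.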
## Sentence-level fixes
- "three independent statistical methods" in Methodology III-A ->
"three methodologically distinct statistical methods".
- "three independent methods" in Conclusion -> "three methodologically
distinct methods".
- Abstract "~0.006 converging" now explicitly acknowledges that
BD/McCrary produces no significant accountant-level discontinuity.
- Conclusion: same acknowledgment added.
- Discussion limitation sentence "BD/McCrary should be interpreted at
the accountant level for threshold-setting purposes" rewritten to
reflect the v3.3 result that, at the accountant level, BD/McCrary is a
diagnostic rather than a threshold estimator.
- III-H "two analyses" -> "three analyses" (H.1 longitudinal stability,
H.2 partner ranking, H.3 intra-report consistency).
- Related Work White 1982 overclaim rewritten: "consistent estimators
of the pseudo-true parameter that minimizes KL divergence" replaces
"guarantees asymptotic recovery".
- III-J "behavior is close to discrete" -> "practice is clustered".
- IV-D.2 pivot sentence "discreteness of individual behavior yields
bimodality" -> "aggregation over signatures reveals clustered (though
not sharply discrete) patterns".
Target journal remains IEEE Access.
Output: Paper_A_IEEE_Access_Draft_v3.docx (395 KB).
Codex v3.2 review saved to paper/codex_review_gpt54_v3_2.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>