Paper A v3.9: resolve codex round-8 regressions (Table XV baseline + cross-refs)

Codex round-8 (paper/codex_review_gpt54_v3_8.md) dissented from
Gemini's Accept and gave Minor Revision because of two real
numerical/consistency issues Gemini's round-7 review missed. This
commit fixes both.

Table XV per-year Firm A baseline-share column corrected
- All 11 yearly values resynced to the authoritative
  reports/partner_ranking/partner_ranking_report.md (per-year
  Deloitte baseline share column):
    2013: 26.2% -> 32.4%  (largest error; codex's test case)
    2014: 27.1% -> 27.8%
    2015: 27.2% -> 27.7%
    2016: 27.4% -> 26.2%
    2017: 27.9% -> 27.2%
    2018: 28.1% -> 26.5%
    2019: 28.2% -> 27.0%
    2020: 28.3% -> 27.7%
    2021: 28.4% -> 28.7%
    2022: 28.5% -> 28.3%
    2023: 28.5% -> 27.4%
- Codex independently verified that the prior 2013 value 26.2% was
  numerically impossible: the underlying JSON places 97 Firm A
  auditor-years in the 2013 top-50% bucket out of 324 auditor-years
  in total, and since the full-year baseline also counts Firm A's
  bottom-half auditor-years, it can be no lower than
  97/324 ≈ 29.9%.
- All other Table XV columns (N, Top-10% k, in top-10%, share) were
  already correct and unchanged.
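The lower-bound argument above can be replayed as a short sketch (counts and
percentages are the ones quoted in this commit message; the JSON fields that
produced them are not reproduced here):

```python
# Lower-bound check codex applied to the 2013 Firm A baseline share.
firm_a_in_top_half = 97    # Firm A auditor-years in the 2013 top-50% bucket
total_auditor_years = 324  # all 2013 auditor-years

# The full-year share counts Firm A's top- AND bottom-half auditor-years,
# so it can never fall below the top-half count alone over the total.
lower_bound = firm_a_in_top_half / total_auditor_years

old_value, new_value = 0.262, 0.324
assert old_value < lower_bound   # prior 26.2% is numerically impossible
assert new_value >= lower_bound  # corrected 32.4% clears the bound
print(f"bound = {lower_bound:.1%}")  # bound = 29.9%
```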

Broken cross-references from earlier renumbering repaired
- Methodology III-E: "ablation study (Section IV-F)" pointer
  corrected to "Section IV-J"; the ablation is at Section IV-J
  line 412 in the current Results, while IV-F is now "Calibration
  Validation with Firm A".
- Results Table XVIII note: "per-signature best-match values in
  Tables IV/VI (mean = 0.980)" is orphaned after earlier
  renumbering (Table IV is all-pairs distributional statistics;
  Table VI is accountant-level GMM model selection). Replaced with
  an explicit pointer to "Section IV-D and visualized in Table XIII
  (whole-sample Firm A best-match mean ~ 0.980)". Table XIII is
  the correct container of per-signature best-match mean statistics.

All other Section IV-X cross-references in methodology / results /
discussion were spot-checked and remain correct under the current
section numbering.
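The spot-check can be mechanized with a simple pattern scan; a minimal
sketch (the regex and the sample line are illustrative, real input would
be the manuscript markdown files):

```python
import re

# Pull every "Section IV-X" cross-reference so each pointer can be
# checked by hand against the current section headings.
sample = "validated by an ablation study (Section IV-J) comparing models."
refs = re.findall(r"Section IV-[A-Z]", sample)
print(refs)  # ['Section IV-J']
```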

With these two surgical fixes, codex's round-8 ranked items (1) and
(2) are cleared. Item (3) was the final DOCX packaging pass (author
metadata fill-in, figure rendering, reference formatting) which is
done manually at submission time and does not affect the markdown.

Deferred items remain deferred:
- Visual-inspection protocol details (codex round-5 item 4)
- General reproducibility appendix (codex round-5 item 6)
Both are defensible for a first IEEE Access submission per codex's
round-8 assessment, since the manuscript no longer leans on visual
inspection or BD/McCrary as decisive standalone evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:59:27 +08:00
parent fcce58aff0
commit 85cfefe49f
3 changed files with 3276 additions and 14 deletions
@@ -84,7 +84,7 @@ Preprocessing consisted of resizing to 224×224 pixels with aspect-ratio preserv
All feature vectors were L2-normalized, ensuring that cosine similarity equals the dot product.
The choice of ResNet-50 without fine-tuning was motivated by three considerations: (1) the task is similarity comparison rather than classification, making general-purpose discriminative features sufficient; (2) ImageNet features have been shown to transfer effectively to document analysis tasks [20], [21]; and (3) avoiding domain-specific fine-tuning reduces the risk of overfitting to dataset-specific artifacts, though we note that a fine-tuned model could potentially improve discriminative performance (see Section V-D).
This design choice is validated by an ablation study (Section IV-F) comparing ResNet-50 against VGG-16 and EfficientNet-B0.
This design choice is validated by an ablation study (Section IV-J) comparing ResNet-50 against VGG-16 and EfficientNet-B0.
## F. Dual-Method Similarity Descriptors
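The L2-normalization claim in the hunk above (cosine similarity equals the
dot product after normalization) can be sanity-checked with a short sketch;
the random vectors are stand-ins for real ResNet-50 features:

```python
import numpy as np

# Two random 2048-d vectors, L2-normalized to unit length.
rng = np.random.default_rng(0)
a, b = rng.normal(size=2048), rng.normal(size=2048)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

# Cosine similarity with explicit denominators vs. a plain dot product:
cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(cosine, a @ b)  # denominators are 1 after normalization
```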