Paper A v3.9: resolve codex round-8 regressions (Table XV baseline + cross-refs)

Codex round-8 (paper/codex_review_gpt54_v3_8.md) dissented from Gemini's Accept and gave Minor Revision because of two real numerical/consistency issues Gemini's round-7 review missed. This commit fixes both.

Table XV per-year Firm A baseline-share column corrected
- All 11 yearly values resynced to the authoritative reports/partner_ranking/partner_ranking_report.md (per-year Deloitte baseline share column):
    2013: 26.2% -> 32.4% (largest error; codex's test case)
    2014: 27.1% -> 27.8%
    2015: 27.2% -> 27.7%
    2016: 27.4% -> 26.2%
    2017: 27.9% -> 27.2%
    2018: 28.1% -> 26.5%
    2019: 28.2% -> 27.0%
    2020: 28.3% -> 27.7%
    2021: 28.4% -> 28.7%
    2022: 28.5% -> 28.3%
    2023: 28.5% -> 27.4%
- Codex independently verified that the prior 2013 value 26.2% was numerically impossible because the underlying JSON places 97 Firm A auditor-years in the 2013 top-50% bucket out of 324 total, so the full-year baseline must be at least 97/324 = 29.9%.
- All other Table XV columns (N, Top-10% k, in top-10%, share) were already correct and unchanged.

Broken cross-references from earlier renumbering repaired
- Methodology III-E: "ablation study (Section IV-F)" pointer corrected to "Section IV-J"; the ablation is at Section IV-J line 412 in the current Results, while IV-F is now "Calibration Validation with Firm A".
- Results Table XVIII note: "per-signature best-match values in Tables IV/VI (mean = 0.980)" is orphaned after earlier renumbering (Table IV is all-pairs distributional statistics; Table VI is accountant-level GMM model selection). Replaced with an explicit pointer to "Section IV-D and visualized in Table XIII (whole-sample Firm A best-match mean ~ 0.980)". Table XIII is the correct container of per-signature best-match mean statistics.

All other Section IV-X cross-references in methodology / results / discussion were spot-checked and remain correct under the current section numbering.

With these two surgical fixes, codex's round-8 ranked items (1) and (2) are cleared.
Item (3) was the final DOCX packaging pass (author metadata fill-in, figure rendering, reference formatting) which is done manually at submission time and does not affect the markdown.

Deferred items remain deferred:
- Visual-inspection protocol details (codex round-5 item 4)
- General reproducibility appendix (codex round-5 item 6)

Both are defensible for first IEEE Access submission per codex round-8 assessment, since the manuscript no longer leans on visual inspection or BD/McCrary as decisive standalone evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
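
The arithmetic behind the Table XV fix described in the commit message can be sanity-checked with a short script. This is a minimal sketch, not part of the repository: the function and variable names are hypothetical, the rows are copied from the corrected table in the diff, and the rule "Top-10% k = floor(N / 10)" is an assumption inferred from those rows rather than something the source states.

```python
# Rows copied from the corrected Table XV in this commit's diff:
# (year, n_auditor_years, top10_k, firm_a_in_top10, share_pct, baseline_pct)
TABLE_XV = [
    (2013, 324, 32, 32, 100.0, 32.4),
    (2014, 399, 39, 39, 100.0, 27.8),
    (2015, 394, 39, 38, 97.4, 27.7),
    (2016, 413, 41, 39, 95.1, 26.2),
    (2017, 415, 41, 41, 100.0, 27.2),
    (2018, 434, 43, 43, 100.0, 26.5),
    (2019, 429, 42, 42, 100.0, 27.0),
    (2020, 430, 43, 38, 88.4, 27.7),
    (2021, 450, 45, 44, 97.8, 28.7),
    (2022, 467, 46, 43, 93.5, 28.3),
    (2023, 474, 47, 46, 97.9, 27.4),
]


def baseline_lower_bound_pct(firm_a_in_bucket: int, n_total: int) -> float:
    """Firm A members counted in any sub-bucket of a year are also members of
    the full year, so their count over the year's N bounds the baseline below."""
    return 100.0 * firm_a_in_bucket / n_total


# Codex's test case: 97 Firm A auditor-years in the 2013 top-50% bucket, N = 324.
lb_2013 = baseline_lower_bound_pct(97, 324)
old_value_possible = 26.2 >= lb_2013   # prior Table XV entry for 2013
new_value_possible = 32.4 >= lb_2013   # corrected Table XV entry for 2013

# Recompute the columns the commit message reports as unchanged.
share_mismatches = [
    year
    for year, _n, k, in_top, share, _base in TABLE_XV
    if abs(100.0 * in_top / k - share) >= 0.05  # tolerance of one rounding step
]
# Assumption: k is floor(N / 10); this is inferred from the rows, not documented.
k_mismatches = [year for year, n, k, *_ in TABLE_XV if n // 10 != k]

if __name__ == "__main__":
    print(f"2013 lower bound: {lb_2013:.1f}%")
    print("prior 26.2% possible:", old_value_possible)
    print("corrected 32.4% possible:", new_value_possible)
    print("share mismatches:", share_mismatches)
    print("k mismatches:", k_mismatches)
```

The check only confirms internal consistency of the numbers shown in this commit; it does not replace resyncing against reports/partner_ranking/partner_ranking_report.md, which remains the authoritative source for the baseline column.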
2026-04-21 14:59:27 +08:00
parent fcce58aff0
commit 85cfefe49f
3 changed files with 3276 additions and 14 deletions
@@ -84,7 +84,7 @@ Preprocessing consisted of resizing to 224×224 pixels with aspect-ratio preserv
 All feature vectors were L2-normalized, ensuring that cosine similarity equals the dot product.

 The choice of ResNet-50 without fine-tuning was motivated by three considerations: (1) the task is similarity comparison rather than classification, making general-purpose discriminative features sufficient; (2) ImageNet features have been shown to transfer effectively to document analysis tasks [20], [21]; and (3) avoiding domain-specific fine-tuning reduces the risk of overfitting to dataset-specific artifacts, though we note that a fine-tuned model could potentially improve discriminative performance (see Section V-D).
-This design choice is validated by an ablation study (Section IV-F) comparing ResNet-50 against VGG-16 and EfficientNet-B0.
+This design choice is validated by an ablation study (Section IV-J) comparing ResNet-50 against VGG-16 and EfficientNet-B0.

 ## F. Dual-Method Similarity Descriptors

@@ -328,17 +328,17 @@ Year-by-year (Table XV), the top-10% Firm A share ranges from 88.4% (2020) to 10
 <!-- TABLE XV: Firm A Share of Top-10% Similarity by Year
 | Year | N auditor-years | Top-10% k | Firm A in top-10% | Firm A share | Firm A baseline |
 |------|-----------------|-----------|-------------------|--------------|-----------------|
-| 2013 | 324 | 32 | 32 | 100.0% | 26.2% |
-| 2014 | 399 | 39 | 39 | 100.0% | 27.1% |
-| 2015 | 394 | 39 | 38 | 97.4% | 27.2% |
-| 2016 | 413 | 41 | 39 | 95.1% | 27.4% |
-| 2017 | 415 | 41 | 41 | 100.0% | 27.9% |
-| 2018 | 434 | 43 | 43 | 100.0% | 28.1% |
-| 2019 | 429 | 42 | 42 | 100.0% | 28.2% |
-| 2020 | 430 | 43 | 38 | 88.4% | 28.3% |
-| 2021 | 450 | 45 | 44 | 97.8% | 28.4% |
-| 2022 | 467 | 46 | 43 | 93.5% | 28.5% |
-| 2023 | 474 | 47 | 46 | 97.9% | 28.5% |
+| 2013 | 324 | 32 | 32 | 100.0% | 32.4% |
+| 2014 | 399 | 39 | 39 | 100.0% | 27.8% |
+| 2015 | 394 | 39 | 38 | 97.4% | 27.7% |
+| 2016 | 413 | 41 | 39 | 95.1% | 26.2% |
+| 2017 | 415 | 41 | 41 | 100.0% | 27.2% |
+| 2018 | 434 | 43 | 43 | 100.0% | 26.5% |
+| 2019 | 429 | 42 | 42 | 100.0% | 27.0% |
+| 2020 | 430 | 43 | 38 | 88.4% | 27.7% |
+| 2021 | 450 | 45 | 44 | 97.8% | 28.7% |
+| 2022 | 467 | 46 | 43 | 93.5% | 28.3% |
+| 2023 | 474 | 47 | 46 | 97.9% | 27.4% |
 -->

 This over-representation is a direct consequence of firm-wide non-hand-signing practice and is not derived from any threshold we subsequently calibrate.
@@ -428,8 +428,9 @@ Table XVIII presents the comparison.

 Note: Firm A values in this table are computed over all intra-firm pairwise
 similarities (16.0M pairs) for cross-backbone comparability. These differ from
-the per-signature best-match values in Tables IV/VI (mean = 0.980), which reflect
-the classification-relevant statistic: the similarity of each signature to its
+the per-signature best-match statistic used in Section IV-D and visualized in
+Table XIII (whole-sample Firm A best-match mean $\approx 0.980$), which reflects
+the classification-relevant quantity: the similarity of each signature to its
 single closest match from the same CPA.
 -->