Paper A v3.18.1: address remaining partner red-pen prose clarity items
Three targeted fixes per partner's red-pen audit (residue from v3.18 cleanup):
1. III-D 92.6% match rate -- partner red-circled the bare figure ("不太懂改善線").
Add explicit explanation of the unmatched 7.4% (13,573 signatures): they
could not be matched to a registered CPA name (deviation from two-signature
layout, OCR-name mismatch) and are excluded from same-CPA pairwise analyses
for definitional reasons, not discarded as noise.
2. III-I.1 Hartigan dip-test wording -- partner wrote "?所以為何?" next to the
"rejecting unimodality is consistent with but does not directly establish
bimodality" sentence. Replace with a direct three-line explanation: the
test asks "is the distribution single-peaked?", a non-significant p means
we cannot reject single-peak, a significant p means more than one peak
(could be 2/3/...). Removes the partner's confusion without losing rigor.
3. IV-G validation lead-in -- partner wrote "不太懂為何陳述?" on the
tangled "consistency check / threshold-free / operational classifier"
triple. Rewrite as a three-bullet structure that names the *informative
quantity* in each subsection (temporal trend / concentration ratio /
cross-firm gap) and states explicitly why each is robust to cutoff choice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -232,11 +232,14 @@ The paper therefore retains cos $> 0.95$ as the primary operational cut for tran
|
||||
|
||||
## G. Additional Firm A Benchmark Validation
|
||||
|
||||
The capture rates of Section IV-E are a within-sample consistency check: they evaluate how well a threshold captures Firm A, but the thresholds themselves are anchored to Firm A's percentiles.
|
||||
This section reports three complementary analyses that go beyond the whole-sample capture rates.
|
||||
Subsection H.2 is fully threshold-independent (it uses only ordinal ranking).
|
||||
Subsection H.1 uses a fixed 0.95 cutoff but derives information from the longitudinal stability of rates rather than from the absolute rate at any single year.
|
||||
Subsection H.3 applies the calibrated classifier and is therefore a consistency check on the classifier's firm-level output rather than a threshold-free test; the informative quantity is the cross-firm *gap* rather than the absolute agreement rate at any one firm.
|
||||
The capture rates of Section IV-E are an *internal* consistency check: they ask "how much of Firm A does our threshold capture?", but the threshold was itself derived from Firm A's percentiles, so a high capture rate is not surprising.
|
||||
To go beyond this circular check, we report three further analyses, each chosen so that the *informative quantity* does not depend on the threshold's absolute value:
|
||||
|
||||
- **§IV-G.1 (year-by-year stability).** Holds the cosine cutoff fixed at 0.95 and asks whether the share of Firm A below the cutoff is *stable across years*. The information is in the temporal trend, not in the absolute rate; under a noise-only explanation of the left tail, the share should shrink as scan/PDF technology matured.
|
||||
- **§IV-G.2 (partner-level similarity ranking).** Uses *no threshold at all*: every auditor-year is ranked by mean similarity, and we measure Firm A's share of the top decile against its baseline share. The information is in the concentration ratio, which is invariant to the choice of cutoff.
|
||||
- **§IV-G.3 (intra-report agreement).** Applies the calibrated classifier and measures whether the *two co-signing CPAs on the same Firm A report* receive the same classifier label, then compares Firm A's intra-report agreement rate to the other firms'. The information is in the *cross-firm gap*; the absolute agreement rate at any one firm depends on the cutoff, but the gap is robust to moderate cutoff shifts as long as the same cutoff is applied uniformly across firms.
|
||||
|
||||
Together these three analyses provide threshold-free or threshold-robust evidence that complements the within-sample capture rates of Section IV-E.
|
||||
|
||||
### 1) Year-by-Year Stability of the Firm A Left Tail
|
||||
|
||||
|
||||
Reference in New Issue
Block a user