Paper A v3.18.1: address remaining partner red-pen prose clarity items

Three targeted fixes per partner's red-pen audit (residue from v3.18 cleanup):

1. III-D 92.6% match rate -- partner red-circled the bare figure
   ("不太懂改善線", roughly "don't quite understand; improve this line").
   Add explicit explanation of the unmatched 7.4% (13,573 signatures): they
   could not be matched to a registered CPA name (deviation from two-signature
   layout, OCR-name mismatch) and are excluded from same-CPA pairwise analyses
   for definitional reasons, not discarded as noise.

2. III-I.1 Hartigan dip-test wording -- partner wrote "?所以為何?" (roughly
   "so why?") next to the
   "rejecting unimodality is consistent with but does not directly establish
   bimodality" sentence. Replace with a direct three-line explanation: the
   test asks "is the distribution single-peaked?", a non-significant p means
   we cannot reject single-peak, a significant p means more than one peak
   (could be 2/3/...). Removes the partner's confusion without losing rigor.

3. IV-G validation lead-in -- partner wrote "不太懂為何陳述?" (roughly "not
   sure why this is stated?") on the
   tangled "consistency check / threshold-free / operational classifier"
   triple. Rewrite as a three-bullet structure that names the *informative
   quantity* in each subsection (temporal trend / concentration ratio /
   cross-firm gap) and states explicitly why each is robust to cutoff choice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:48:59 +08:00
parent 16e90bab20
commit cb77f481ec
3 changed files with 14 additions and 6 deletions
@@ -74,6 +74,7 @@ Batch inference on all 86,071 documents extracted 182,328 signature images at a
A red stamp removal step was applied to each cropped signature using HSV color-space filtering, replacing detected red regions with white pixels to isolate the handwritten content.
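For concreteness, a minimal sketch of an HSV-based red-stamp removal step of this kind, assuming OpenCV; the hue/saturation/value thresholds below are illustrative placeholders, not the paper's calibrated values:

```python
import cv2
import numpy as np

def remove_red_stamp(signature_bgr: np.ndarray) -> np.ndarray:
    """Replace detected red-stamp pixels with white, keeping handwritten ink."""
    hsv = cv2.cvtColor(signature_bgr, cv2.COLOR_BGR2HSV)
    # Red wraps around the hue axis, so combine a low-hue and a high-hue range.
    # Threshold values are assumptions for illustration only.
    mask_lo = cv2.inRange(hsv, np.array([0, 70, 50]), np.array([10, 255, 255]))
    mask_hi = cv2.inRange(hsv, np.array([170, 70, 50]), np.array([180, 255, 255]))
    red_mask = cv2.bitwise_or(mask_lo, mask_hi)
    cleaned = signature_bgr.copy()
    cleaned[red_mask > 0] = (255, 255, 255)  # white-out the stamp regions
    return cleaned
```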
Each signature was matched to its corresponding CPA using positional order (first or second signature on the page) against the official CPA registry, achieving a 92.6% match rate (168,755 of 182,328 signatures).
+The remaining 7.4% (13,573 signatures) could not be matched to a registered CPA name---typically because the auditor's report page format deviates from the standard two-signature layout, or because OCR of the printed CPA name on the page returns a name not present in the registry---and these signatures are excluded from all subsequent same-CPA pairwise analyses (a same-CPA best-match statistic is undefined when a signature has no assigned CPA). The 92.6% matched subset is the sample that flows into Sections IV-D through IV-H; the unmatched 7.4% are excluded for definitional reasons rather than discarded as noise.
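A toy illustration of the registry-matching filter and the resulting exclusion; column and variable names are hypothetical, not the pipeline's actual identifiers:

```python
import pandas as pd

# Hypothetical column/variable names for illustration; not the paper's code.
registry = {"Chen Wei-Ling", "Lin Chia-Hao"}            # registered CPA names
signatures = pd.DataFrame({
    "doc_id":       [1, 1, 2, 2],
    "position":     ["first", "second", "first", "second"],
    "ocr_cpa_name": ["Chen Wei-Ling", "Lin Chia-Hao", "Chen Wei-Ling", "???"],
})

is_matched = signatures["ocr_cpa_name"].isin(registry)
matched, unmatched = signatures[is_matched], signatures[~is_matched]

# Only `matched` feeds the same-CPA pairwise analyses (IV-D through IV-H);
# rows in `unmatched` have no assigned CPA, so a same-CPA best-match
# statistic is undefined for them rather than merely noisy.
print(f"match rate: {is_matched.mean():.1%}")  # 75.0% in this toy example
```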
## E. Feature Extraction
@@ -188,7 +189,11 @@ Because all three diagnostics are applied to the same sample rather than to inde
We use two closely related KDE-based threshold estimators and apply each where it is appropriate.
When two labeled populations are available (e.g., the all-pairs intra-class and inter-class similarity distributions of Section IV-C), the *KDE crossover* is the intersection point of the two kernel density estimates under Scott's rule for bandwidth selection [28]; under equal priors and symmetric misclassification costs it approximates the Bayes-optimal decision boundary between the two classes.
When a single distribution is analysed (e.g., the per-signature best-match cosine distribution of Section IV-D) the *KDE antimode* is the local density minimum between two modes of the fitted density; it serves the same decision-theoretic role when the distribution is multimodal but is undefined when the distribution is unimodal.
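Both estimators can be written in a few lines on top of a standard KDE; the sketch below assumes SciPy's `gaussian_kde` (whose default bandwidth rule is Scott's rule) and a simple grid search, and illustrates the two definitions rather than reproducing the paper's implementation:

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_crossover(intra: np.ndarray, inter: np.ndarray) -> float:
    """Intersection of the two Scott's-rule KDEs between the class means
    (approximate Bayes boundary under equal priors / symmetric costs)."""
    f_intra, f_inter = gaussian_kde(intra), gaussian_kde(inter)  # Scott's rule is the default
    grid = np.linspace(min(intra.min(), inter.min()),
                       max(intra.max(), inter.max()), 2000)
    diff = f_intra(grid) - f_inter(grid)
    lo, hi = sorted([intra.mean(), inter.mean()])
    inside = (grid >= lo) & (grid <= hi)
    sign_flips = np.where(np.diff(np.sign(diff[inside])) != 0)[0]
    return float(grid[inside][sign_flips[0]]) if sign_flips.size else float("nan")

def kde_antimode(x: np.ndarray) -> float:
    """Local density minimum between the two largest modes of a single KDE;
    returns NaN when the fitted density has only one peak (antimode undefined)."""
    dens = gaussian_kde(x)(grid := np.linspace(x.min(), x.max(), 2000))
    peaks = np.where((dens[1:-1] > dens[:-2]) & (dens[1:-1] > dens[2:]))[0] + 1
    if peaks.size < 2:
        return float("nan")
    p1, p2 = sorted(peaks[np.argsort(dens[peaks])[-2:]])
    return float(grid[p1 + np.argmin(dens[p1:p2 + 1])])
```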
-In either case we use the Hartigan & Hartigan dip test [37] as a formal test of unimodality (rejecting the null of unimodality is consistent with but does not directly establish bimodality specifically), and perform a sensitivity analysis varying the bandwidth over $\pm 50\%$ of the Scott's-rule value to verify threshold stability.
+In either case we use the Hartigan & Hartigan dip test [37] as a formal test of unimodality.
+The dip test asks one question: *is the distribution single-peaked?*
+A non-significant $p$-value means we cannot reject the single-peak null (the data are consistent with one peak); a significant $p$-value means the distribution has *more than one peak* (it could be two, three, or more---the test does not specify how many).
+We use the test to decide whether a KDE antimode is well-defined (which it is only when there is more than one peak), not to assert any particular number of components.
+We additionally perform a sensitivity analysis varying the bandwidth over $\pm 50\%$ of the Scott's-rule value to verify threshold stability.
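As an illustration of how the dip test and the bandwidth sweep can be run in practice; the sketch assumes the third-party `diptest` Python package for the Hartigan & Hartigan statistic and is not the paper's code:

```python
import numpy as np
import diptest                      # assumed third-party dip-test package
from scipy.stats import gaussian_kde

def has_more_than_one_peak(x: np.ndarray, alpha: float = 0.05) -> bool:
    """Dip test: True means the single-peak null is rejected
    (more than one peak; the number of peaks is unspecified)."""
    _dip_stat, pval = diptest.diptest(x)
    return pval < alpha

def antimode_sensitivity(x: np.ndarray,
                         scales=(0.5, 0.75, 1.0, 1.25, 1.5)) -> list[float]:
    """Re-estimate the KDE antimode with bandwidths at 50%-150% of the
    Scott's-rule value; a narrow spread indicates a stable threshold."""
    thresholds = []
    grid = np.linspace(x.min(), x.max(), 2000)
    for s in scales:
        kde = gaussian_kde(x, bw_method=lambda k, s=s: k.scotts_factor() * s)
        dens = kde(grid)
        peaks = np.where((dens[1:-1] > dens[:-2]) & (dens[1:-1] > dens[2:]))[0] + 1
        if peaks.size >= 2:                      # antimode defined only then
            p1, p2 = sorted(peaks[np.argsort(dens[peaks])[-2:]])
            thresholds.append(float(grid[p1 + np.argmin(dens[p1:p2 + 1])]))
    return thresholds
```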
### 2) Method 2: Finite Mixture Model via EM