§III-L.0: reword e-signature-adoption rationale as industry background (no interview citation)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 21:52:05 +08:00
parent 1eb323e959
commit 17156516a0
2 changed files with 1 additions and 1 deletions
+1 -1
@@ -458,7 +458,7 @@ The operational classifier defined in §III-H.1 is calibrated by characterising
**Choice of negative-anchor pool.** A negative anchor must approximate a population in which the rule should *not* fire — independent CPAs whose signatures coincide only by chance. §III-L.4 shows that under the deployed rule, $98.8\%$ of Firm A's inter-CPA collisions fall on other Firm-A CPAs, and byte-level evidence (§IV-H, supplementary materials) confirms image-level reuse across $\sim 50$ Firm-A partners. Including Firm A in the negative-anchor pool therefore loads the "coincidence" rate with structured within-firm collisions, not chance coincidence — a circularity, since that collision structure is the phenomenon the rule targets. We adopt **Firms B/C/D (BCD) as the normative negative-anchor baseline** and report the all-Big-4 (ABCD) pool only as a contamination-comparison scope; Firm A enters as an **out-of-sample target** (§III-L.4), not as a calibration input. A still-broader baseline adding the eligible non-Big-4 firms (BCD+non-Big-4) is reported as a robustness scope.
-We further restrict the calibration baseline temporally to **fiscal years 2013–2019**. Taiwan audit firms progressively adopted electronic-signature systems after 2020 (with firm-specific timing), so the pre-2020 BCD period is the construct-clean hand-signing baseline; the post-2020 period mixes genuine hand-signing with legitimate e-signing and is therefore not a clean negative anchor. The data corroborate this: the BCD per-comparison HC floor rises from $0.000010$ (2013–2019) to $0.000036$ (2020–2023), and the per-signature floor from $0.0059$ to $0.0105$; the gradual, non-stepped rise is consistent with staggered per-firm adoption. We therefore calibrate on BCD 2013–2019 and report BCD 2020–2023 only as a robustness scope (it documents the e-signing contamination rather than the clean floor). Firm A is scored across its full 2013–2023 record against this clean threshold.
+We further restrict the calibration baseline temporally to **fiscal years 2013–2019**. Following the post-2020 acceleration of digital document workflows, Taiwan audit firms increasingly adopted electronic-signature and stamping systems for report assembly, with firm-specific timing; the pre-2020 BCD period is therefore the construct-clean hand-signing baseline, while the post-2020 period mixes genuine hand-signing with legitimate e-signing and is not a clean negative anchor. The data corroborate this: the BCD per-comparison HC floor rises from $0.000010$ (2013–2019) to $0.000036$ (2020–2023), and the per-signature floor from $0.0059$ to $0.0105$; the gradual, non-stepped rise is consistent with staggered per-firm adoption. We therefore calibrate on BCD 2013–2019 and report BCD 2020–2023 only as a robustness scope (it documents the e-signing contamination rather than the clean floor). Firm A is scored across its full 2013–2023 record against this clean threshold.
**Calibration role of the present analysis.** The deployed thresholds of §III-H.1 preserve continuity with the existing literature and the supplementary calibration evidence. §III-I.4 establishes that a recalibration cannot be anchored on distributional antimodes (no within-population bimodality exists); §III-L.1 below characterises the cosine and structural ($\text{dHash} \leq 5$) thresholds' specificity-proxy behaviour at the inter-CPA pair level on the BCD baseline. The sub-band thresholds ($\text{dHash} = 15$, $\text{cos} = 0.837$) retain their supplementary calibration evidence; the present calibration does not provide independent rates for those sub-bands. The cosine LH/UN crossover $\text{cos} = 0.837$ is a corpus-wide descriptor-space landmark (intra- vs inter-CPA cosine KDE crossover, §IV-C) robust to baseline choice — it moves by at most $0.012$ across the corpus-wide, BCD, and BCD+non-Big-4 scopes ($0.8367$, $0.8489$, $0.8302$) — so we retain the corpus-wide value and do not re-anchor it on BCD.
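
The arithmetic behind the figures quoted in this hunk can be checked with a minimal sketch. It only re-derives ratios and differences from the numbers stated in the text; the variable and function names (`bcd_floor`, `crossover`, `fold_change`, `max_shift`) are illustrative assumptions and are not taken from the paper's code or supplementary materials.

```python
# Figures copied from the passage above; all names here are hypothetical.
bcd_floor = {
    # scope: (fiscal years 2013-2019, fiscal years 2020-2023)
    "per_comparison": (0.000010, 0.000036),
    "per_signature": (0.0059, 0.0105),
}

# Cosine LH/UN crossover under each baseline scope, as quoted in the text.
crossover = {
    "corpus_wide": 0.8367,
    "BCD": 0.8489,
    "BCD_plus_non_big4": 0.8302,
}


def fold_change(before: float, after: float) -> float:
    """Ratio of the later-period floor to the earlier-period floor."""
    return after / before


def max_shift(values: dict, reference: str) -> float:
    """Largest absolute deviation of any scope from the reference scope."""
    ref = values[reference]
    return max(abs(v - ref) for v in values.values())


floor_ratios = {name: fold_change(*pair) for name, pair in bcd_floor.items()}
shift = max_shift(crossover, "corpus_wide")

for name, ratio in floor_ratios.items():
    print(f"{name}: floor rises by a factor of {ratio:.2f}")
print(f"largest crossover shift vs corpus-wide: {shift:.4f}")
```

On the quoted figures, the per-comparison floor rises by a factor of about 3.6 and the per-signature floor by about 1.8, while the largest crossover deviation from the corpus-wide value is 0.0122 (the BCD scope), in line with the "at most $0.012$" statement after rounding.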