gbanyan e429e4eed1 Bootstrap .planning/ for Paper A v4.0 milestone
Hand-written minimal GSD scaffolding (PROJECT.md / REQUIREMENTS.md /
ROADMAP.md / STATE.md) without running /gsd-ingest-docs because:

  * 51 pre-existing markdown files exceed the v1 50-doc cap and most
    are stale (older review rounds, infrastructure notes) or already
    captured in auto-memory project_signature_research.md
  * Heavyweight ingest workflow not needed when project context is
    already comprehensive

PROJECT.md captures the Big-4 reframe key decision and the locked
v3.x history; REQUIREMENTS.md defines REQ-001..008 for v4.0;
ROADMAP.md lays out 7 phases (Foundation -> Methodology -> Results
-> Prose -> AI peer review -> Partner re-review -> Submission);
STATE.md anchors at Phase 1 entry on branch paper-a-v4-big4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 14:43:34 +08:00

# Requirements — Paper A v4.0 (Big-4 reframe)
Milestone: Paper A v4.0 IEEE Access submission with Big-4-only primary scope and full-dataset secondary robustness.
## REQ-001: Big-4-only primary scope (foundation)
**What**: All primary statistical analysis (KDE+dip, BD/McCrary, Beta mixture, 2D-GMM K=2/K=3, pixel-identity FAR, held-out 70/30 z-test, classifier sensitivity) is rerun on the 437-CPA Big-4 subset (Firm A + KPMG + PwC + EY, n_signatures ≥ 10).
**Acceptance**:
- Script 20 rerun on Big-4 subset, dip-test p < 0.05 on cos_mean and dh_mean
- Script 21 (held-out validation) rerun on Big-4 subset
- Script 24 (calibration vs held-out z-test, classifier sensitivity) rerun on Big-4 subset
- Script 19 (pixel-identity / FAR) rerun on Big-4 subset
- All rerun outputs land under `reports/v4_big4/`
- New operational threshold cos > 0.975 AND dh ≤ 3.76 (or refined K=2 posterior) documented with bootstrap 95% CI
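
The conjunctive rule and its bootstrap interval in the last acceptance item can be sketched as follows. This is a minimal illustration, not the project's Script 20/24 code: the function names, the percentile-bootstrap method, and the use of the mean as the resampled statistic are assumptions. In practice the resampled statistic would be whatever estimator produces the threshold (for example the K=2 component crossing).

```python
import random
import statistics

COS_MIN = 0.975  # from REQ-001: cos > 0.975
DH_MAX = 3.76    # from REQ-001: dh <= 3.76


def is_replicated(cos_mean: float, dh_mean: float) -> bool:
    """Apply the conjunctive rule: cos > COS_MIN AND dh <= DH_MAX."""
    return cos_mean > COS_MIN and dh_mean <= DH_MAX


def bootstrap_ci(values, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for `stat` (hypothetical helper).

    Resamples `values` with replacement n_boot times and returns the
    empirical alpha/2 and 1 - alpha/2 quantiles of the statistic.
    """
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(stat(rng.choices(values, k=n)) for _ in range(n_boot))
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

Note that the cosine bound is strict while the dHash bound is inclusive, so a CPA with `cos_mean` exactly 0.975 is not flagged.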
## REQ-002: Full-dataset robustness as secondary section
**What**: §IV-K (new) reports the full-dataset (686 CPA) version of the same analyses as a robustness check, demonstrating the pipeline runs at multiple scopes and explaining why the published v3.x 0.945 threshold drifted (mid/small-firm tail heterogeneity).
**Acceptance**:
- §IV-K table comparing Big-4-only vs full-dataset crossings, with mid/small-firm contribution analysis
- Explicit explanation of why Big-4 is the methodologically privileged primary scope
## REQ-003: Methodology rewrite (§III-G / I / J / L)
**What**: Sections III-G (unit hierarchy / scope), III-I (threshold estimators), III-J (accountant-level GMM), III-L (per-document classifier rule) rewritten to reflect dip-test confirmed bimodality and the new K=2-derived classifier rule.
**Acceptance**:
- §III-G justifies Big-4 as the methodological unit (sample size, homogeneity, dip-test evidence)
- §III-I anchored on bootstrap-stable bimodal evidence rather than three-method convergence on unimodal data
- §III-J reports K=2 as primary (interpretable: replicated vs hand-leaning) and K=3 as secondary, although BIC slightly prefers K=3 (-1112 vs -1108)
- §III-L derives operational rule from Big-4 K=2 components and bootstrap CI
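
For reference, the BIC comparison in the §III-J item can be written out as below. This assumes the usual lower-is-better convention and reads the two quoted values as K=3 first, then K=2. The symbols ($k$ for the number of free parameters, $n$ for the number of accountants, $\hat{L}$ for the maximized likelihood) are not defined in this document and are introduced here only for illustration.

```latex
\mathrm{BIC}_K = k_K \ln n - 2 \ln \hat{L}_K,
\qquad
\Delta\mathrm{BIC} = \mathrm{BIC}_{K=2} - \mathrm{BIC}_{K=3}
                   = -1108 - (-1112) = 4 .
```

Under that reading, K=3 is ahead by only 4 BIC units, which fits the decision to keep the more interpretable K=2 model as primary.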
## REQ-004: Results tables IV-XVIII regenerated
**What**: All results tables in §IV (currently Tables IV through XVIII at v3.20.0) regenerated on the Big-4 subset with consistent formatting and footnote citation to source script.
**Acceptance**:
- Each table cites the script + DB query that generated it
- Big-4 numbers replace full-dataset numbers as primary; full-dataset relegated to §IV-K
- Figures 1-4 regenerated; Fig 4 (yearly per-firm) likely reusable as-is
## REQ-005: Firm A reframed as templated case study
**What**: Throughout the manuscript, Firm A's role pivots from "calibration anchor (with minority hand-signers)" to "case study of the templated end of Big-4 (0% in K=3 hand-sign-leaning cluster, 82.5% in replicated cluster)". PwC's higher hand-sign tradition (24/102 = 23.5% in C1) noted as a Big-4 internal contrast.
**Acceptance**:
- Discussion (§V) explicitly states Firm A is the most digitally-replicated of Big-4
- Cross-tab table (firm × cluster) included in either §IV or §V
- Conclusion's contributions list updated accordingly
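
A firm × cluster cross-tab of the kind required above can be sketched with the standard library. The function name and the `(firm, cluster)` input shape are assumptions. The example reproduces only the PwC figure quoted in REQ-005 (24 of 102 in C1) and collapses the remaining clusters into a placeholder label `"other"`.

```python
from collections import Counter


def firm_cluster_crosstab(rows):
    """Build {firm: {cluster: (count, percent_of_firm)}} from (firm, cluster) pairs.

    `rows` holds one pair per CPA (hypothetical input shape).
    Percentages are rounded to one decimal place.
    """
    rows = list(rows)
    counts = Counter(rows)
    totals = Counter(firm for firm, _ in rows)
    table = {}
    for (firm, cluster), n in counts.items():
        pct = round(100 * n / totals[firm], 1)
        table.setdefault(firm, {})[cluster] = (n, pct)
    return table


# PwC: 24 of 102 CPAs in C1 (from REQ-005); "other" is a placeholder label.
pwc_rows = [("PwC", "C1")] * 24 + [("PwC", "other")] * 78
table = firm_cluster_crosstab(pwc_rows)
print(table["PwC"]["C1"])  # (24, 23.5)
```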
## REQ-006: AI peer review (≥3 rounds)
**What**: At least three cross-AI peer-review rounds on the v4.0 manuscript using codex (GPT-5.x), Gemini 3.x Pro, and Opus 4.7 max effort. Per `[[feedback-ai-review-provenance]]` memory: every reviewer-flagged empirical claim must be provenance-verified with a fresh sqlite query or grep against the named script.
**Acceptance**:
- Round 1 verdict obtained from each of the three reviewers
- All Major-class findings either RESOLVED in revision or explicitly disclaimed
- Final round produces an Accept or Minor verdict from at least 2 of 3 reviewers
## REQ-007: Partner Jimmy second review on v4.0
**What**: Jimmy (who proposed Big-4-only direction) reviews the v4.0 manuscript end-to-end before submission.
**Acceptance**:
- v4.0 DOCX shipped to ~/Downloads
- Jimmy's response captured in repo (paper/partner_jimmy_v4_review.md)
- Any must-fix items resolved in v4.0.x
## REQ-008: iThenticate + eCF + submission
**What**: iThenticate similarity check below 20%, IEEE eCF copyright form completed, manuscript uploaded via IEEE Access submission portal with cover letter.
**Acceptance**:
- iThenticate report saved under `paper/ithenticate_v4.pdf`
- eCF confirmation captured
- Submission portal confirmation number recorded in PROJECT.md "Validated" section
## Cross-cutting constraints
- **Reproducibility**: every script accepts a `--scope big4|full` flag (or new scripts under `signature_analysis/v4_*` if a flag refactor is too invasive)
- **Provenance**: every numeric claim in the paper traces to (script_id, DB query, output file) — see `[[feedback-provenance-fabrication]]`
- **No data re-ingest**: existing `/Volumes/NV2/PDF-Processing/signature-analysis/signature_analysis.db` is the frozen snapshot
- **Branch isolation**: all v4.0 work on `paper-a-v4-big4`; do NOT merge back to `yolo-signature-pipeline` until v4.0 is partner-approved
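
The `--scope big4|full` flag from the reproducibility constraint could be wired up roughly as below. This is a sketch under assumptions: `build_parser` and `output_dir` are hypothetical names, the flag is made required rather than given a default, and `reports/v4_full` is an invented directory name. Only `reports/v4_big4/` appears in this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser exposing the --scope flag shared by the analysis scripts."""
    parser = argparse.ArgumentParser(description="Signature analysis script")
    parser.add_argument(
        "--scope",
        choices=["big4", "full"],
        required=True,  # assumption: force an explicit scope on every run
        help="big4 = primary Big-4 subset, full = full-dataset robustness run",
    )
    return parser


def output_dir(scope: str) -> str:
    """Map a scope to its report directory.

    reports/v4_big4 comes from REQ-001; reports/v4_full is hypothetical.
    """
    return "reports/v4_big4" if scope == "big4" else "reports/v4_full"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(output_dir(args.scope))
```

Restricting the flag with `choices` makes argparse reject any other scope value before the script runs any analysis.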