71c4e8f736
docs(04-01): complete known gene compilation and weighted scoring plan
...
- Known genes: 38 (10 OMIM Usher + 28 SYSCILIA SCGS v2 core)
- ScoringWeights.validate_sum() enforcing weight sum = 1.0
- NULL-preserving weighted average (weighted_sum / available_weight)
- Quality flags based on evidence_count thresholds
- Per-layer contributions for explainability
- 2 tasks, 4 files, 4 min duration
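The NULL-preserving weighted average and the weight-sum check described above can be sketched roughly as follows. This is a minimal illustration, not the committed implementation: the layer names, the default weight values, the quality-flag labels and thresholds, and the `composite_score` and `quality_flag` helpers are all assumptions. Only `ScoringWeights.validate_sum()` and the `weighted_sum / available_weight` rule come from the commit message.

```python
import math
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class ScoringWeights:
    # Layer names and default values are illustrative assumptions.
    gnomad: float = 0.20
    expression: float = 0.20
    annotation: float = 0.15
    localization: float = 0.15
    animal_model: float = 0.15
    literature: float = 0.15

    def as_dict(self) -> Dict[str, float]:
        return {f.name: getattr(self, f.name) for f in fields(self)}

    def validate_sum(self, tol: float = 1e-6) -> None:
        # Reject weight sets that do not sum to 1.0 (within tolerance).
        total = sum(self.as_dict().values())
        if not math.isclose(total, 1.0, abs_tol=tol):
            raise ValueError(f"weights sum to {total:.6f}, expected 1.0")


def quality_flag(evidence_count: int) -> str:
    # Hypothetical thresholds and labels; the real ones are not in the log.
    if evidence_count >= 4:
        return "sufficient_evidence"
    if evidence_count >= 2:
        return "moderate_evidence"
    if evidence_count >= 1:
        return "sparse_evidence"
    return "no_evidence"


def composite_score(
    layer_scores: Dict[str, Optional[float]], weights: ScoringWeights
) -> dict:
    weights.validate_sum()
    weight_map = weights.as_dict()
    # Per-layer contributions; layers with a NULL (None) score are skipped,
    # not treated as zero.
    contributions = {
        layer: w * layer_scores[layer]
        for layer, w in weight_map.items()
        if layer_scores.get(layer) is not None
    }
    available_weight = sum(weight_map[layer] for layer in contributions)
    # Divide by the weight of the layers that actually have data.
    score = (
        sum(contributions.values()) / available_weight
        if available_weight > 0
        else None
    )
    return {
        "score": score,
        "evidence_count": len(contributions),
        "quality_flag": quality_flag(len(contributions)),
        "contributions": contributions,
    }
```

Dividing by `available_weight` rather than 1.0 means a gene with two strong layers and four missing ones is not penalised as if the missing layers had scored zero; the quality flag carries the information that the score rests on little evidence.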
2026-02-11 20:44:09 +08:00
a52724aff4
docs(04): create phase plan for scoring and integration
2026-02-11 20:31:55 +08:00
32988c631f
docs(04): research multi-evidence weighted scoring with NULL preservation
2026-02-11 20:24:42 +08:00
190bedaa80
docs(phase-03): complete phase execution
2026-02-11 19:18:12 +08:00
e72c516669
docs(03-06): complete literature evidence layer
...
- Created SUMMARY.md with full implementation details
- Updated STATE.md: progress 60%, 12/20 plans complete, Phase 3 complete
- Documented 4 key decisions (tier priority, bias mitigation, context weights, rate limiting)
- All verification criteria met: 17/17 tests pass, CLI functional, bias mitigation validated
- Self-check PASSED: all files and commits verified
Key accomplishments:
- PubMed evidence layer queries per gene across cilia/sensory/cytoskeleton/polarity contexts
- Quality tier classification: direct_experimental > hts_hit > functional_mention > incidental
- Bias mitigation via log2(total_pubmed_count) prevents well-studied gene dominance
- Novel genes with 10 total/5 cilia publications score higher than TP53-like genes with 100K total/5 cilia
- Biopython Entrez integration with rate limiting (3/sec default, 10/sec with API key)
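One way the log2-based bias mitigation could work is to divide the context-specific publication count by the log of the total count. The commit message does not give the exact formula, so the function name `literature_bias_score`, the `+ 2` offset, and the None handling below are assumptions made for illustration.

```python
import math
from typing import Optional


def literature_bias_score(
    context_count: Optional[int], total_count: Optional[int]
) -> Optional[float]:
    """Context-specific publication count, damped by overall study depth.

    Assumed formula: context_count / log2(total_count + 2).
    The + 2 offset keeps the denominator >= 1 when total_count is 0.
    """
    if context_count is None or total_count is None:
        return None  # preserve NULL: gene was not queried
    return context_count / math.log2(total_count + 2)


# The comparison from the commit message: same number of cilia papers,
# very different total publication counts.
novel = literature_bias_score(context_count=5, total_count=10)
well_studied = literature_bias_score(context_count=5, total_count=100_000)
print(novel > well_studied)
```

Under this assumed formula the novel gene scores roughly 1.4 and the heavily studied gene roughly 0.3, which matches the ordering the commit message describes.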
2026-02-11 19:13:26 +08:00
0e89bf0dd6
docs(03-02): complete expression evidence layer plan
...
- Create 03-02-SUMMARY.md with performance metrics, decisions, and deviations
- Update STATE.md: 5 of 6 plans complete in Phase 03 (03-06 remaining)
- Update progress: 55% complete (11/20 plans across all phases)
- Add key decisions: Tau calculation, expression scoring, CellxGene optional
- Record duration: 12 min for 2 tasks (9 files modified)
- Self-check passed: all files and commits verified
Expression layer provides:
- HPA/GTEx tissue expression with Tau specificity index
- Usher-tissue enrichment scoring (retina, inner ear, cilia)
- Optional CellxGene single-cell integration
- CLI command with checkpoint-restart
- 11 passing unit and integration tests
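For reference, the Tau specificity index is conventionally defined as the sum of `1 - x_i / x_max` over tissues, divided by `n - 1`. A minimal sketch follows; the function name and the choice to drop missing tissues and return None for unusable profiles are assumptions, not details taken from the commit.

```python
from typing import Optional, Sequence


def tau_specificity(expression: Sequence[Optional[float]]) -> Optional[float]:
    """Tau index: 0 = uniformly expressed, 1 = specific to a single tissue.

    Assumptions in this sketch: tissues with missing values (None) are
    dropped, and None is returned when fewer than two tissues remain or
    when the gene is not expressed anywhere.
    """
    values = [v for v in expression if v is not None]
    if len(values) < 2:
        return None
    x_max = max(values)
    if x_max <= 0:
        return None
    return sum(1 - v / x_max for v in values) / (len(values) - 1)
```

A gene with identical expression in every tissue gets Tau = 0, and a gene expressed in exactly one tissue gets Tau = 1. Expression values are often log-transformed before Tau is computed; whether this pipeline does so is not stated in the log.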
2026-02-11 19:12:18 +08:00
cfe4b830e6
docs(03-03): complete protein features plan with SUMMARY and STATE updates
2026-02-11 19:10:03 +08:00
053f0d926b
docs(03-05): complete animal model phenotype evidence layer plan
...
- SUMMARY.md: Ortholog-mapped animal evidence from MGI/ZFIN/IMPC
- Confidence-weighted scoring (mouse +0.4, zebrafish +0.3, IMPC +0.3)
- 14/14 tests passing: ortholog confidence, keyword filtering, NULL preservation
- Deviations: Schema mismatches, NULL handling, polars deprecations auto-fixed
- Duration: 10 minutes, 2 tasks, 8 files, 2 commits
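The additive source weights above could be combined with ortholog confidence along these lines. This is a sketch under assumptions: the input shape (one confidence value in [0, 1] per source, None when the source has no data), the `animal_model_score` name, and the multiplication by confidence are illustrative; only the 0.4/0.3/0.3 weights and the NULL-preservation requirement come from the commit message.

```python
from typing import Dict, Optional

# Source weights as listed in the commit message.
SOURCE_WEIGHTS = {"mouse": 0.4, "zebrafish": 0.3, "impc": 0.3}


def animal_model_score(
    confidence_by_source: Dict[str, Optional[float]]
) -> Optional[float]:
    """Sum weight * ortholog confidence over sources that have data.

    None for a source means "no ortholog or not assessed"; 0.0 means
    "assessed, no relevant phenotype". If every source is None the
    result stays None instead of collapsing to 0.0.
    """
    observed = {
        source: conf
        for source, conf in confidence_by_source.items()
        if source in SOURCE_WEIGHTS and conf is not None
    }
    if not observed:
        return None
    score = sum(SOURCE_WEIGHTS[s] * conf for s, conf in observed.items())
    return min(score, 1.0)
```

Keeping None distinct from 0.0 lets downstream scoring tell "no animal data exists" apart from "animal data exists and shows nothing relevant".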
2026-02-11 19:08:45 +08:00
d8009f1236
docs(03-04): complete subcellular localization evidence layer
...
- Created SUMMARY.md with full implementation details
- Updated STATE.md: progress 40%, 8/20 plans complete
- Documented 4 key decisions (evidence terminology, NULL semantics, embedded proteomics, evidence weighting)
- All verification criteria met: 17/17 tests pass, CLI functional, DuckDB integration complete
2026-02-11 19:08:01 +08:00
99bc975a2c
docs(03-01): complete annotation completeness plan
2026-02-11 19:05:56 +08:00
0d252da348
docs(03): create phase plan
2026-02-11 18:46:28 +08:00
3354cfe006
docs(phase-03): research core evidence layers domain
2026-02-11 18:37:14 +08:00
ffb4963d2b
docs(phase-02): complete phase execution
2026-02-11 18:28:13 +08:00
a0388cf4e1
docs(02-02): complete gnomAD evidence layer integration plan
...
- DuckDB persistence: gnomad_constraint table with CREATE OR REPLACE (idempotent)
- CLI evidence command: usher-pipeline evidence gnomad with checkpoint-restart
- Provenance tracking: records processing steps, saves sidecar JSON
- Query helpers: query_constrained_genes validates GCON-03 interpretation
- 12 integration tests: end-to-end pipeline, checkpoint, provenance, CLI
- Phase 2 complete: Evidence layer pattern established for future sources
- Duration: 4 min, 2 tasks, 5 files, 70 tests passing
Phase 2 (Prototype Evidence Layer) complete.
2026-02-11 18:23:32 +08:00
c6198122ac
docs(02-01): complete gnomAD constraint data pipeline plan
2026-02-11 18:16:35 +08:00
c7753e7b1c
docs(02): create phase plan
2026-02-11 17:47:23 +08:00
d328467737
docs(phase-02): research prototype evidence layer
2026-02-11 17:41:35 +08:00
34437fdf0a
docs(phase-01): complete phase execution
...
Phase 1 (Data Infrastructure) verified: 5/5 must-haves, 12/12 artifacts,
9/9 key links, 7/7 requirements satisfied. All 4 plans executed across
3 waves with 49 tests passing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 16:50:30 +08:00
102dcdbe84
docs(01-04): complete CLI integration and end-to-end testing plan
...
- CLI entry point with setup and info commands
- Full infrastructure integration verified
- 6 integration tests with mocked APIs
- Phase 01 Data Infrastructure complete
2026-02-11 16:45:12 +08:00
e29d39d1dc
docs(01-02): complete gene ID mapping and validation plan
...
- Gene universe definition with mygene protein-coding gene retrieval
- Batch Ensembl->HGNC+UniProt mapping with edge case handling
- Validation gates with configurable success rate thresholds
- 15 comprehensive tests with mocked API responses
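A validation gate with a configurable success-rate threshold might look like the sketch below. The `ValidationResult` type, the `validate_mapping` function, and the 0.90 default threshold are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationResult:
    passed: bool
    success_rate: float
    message: str


def validate_mapping(
    total_genes: int, mapped_genes: int, min_success_rate: float = 0.90
) -> ValidationResult:
    """Fail the gate when too few input IDs were mapped successfully."""
    if total_genes <= 0:
        return ValidationResult(False, 0.0, "no genes to validate")
    rate = mapped_genes / total_genes
    passed = rate >= min_success_rate
    status = "PASS" if passed else "FAIL"
    message = (
        f"{status}: mapped {mapped_genes}/{total_genes} "
        f"({rate:.1%}), threshold {min_success_rate:.1%}"
    )
    return ValidationResult(passed, rate, message)
```

A pipeline step would typically stop, or at least log loudly, when `passed` is False, so that a silent drop in mapping coverage does not propagate into later evidence layers.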
2026-02-11 16:35:57 +08:00
92322b1d7c
docs(01-03): complete DuckDB persistence and provenance tracking plan
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 16:34:00 +08:00
9ee3ec2e84
docs(01-01): complete project scaffold and config system plan
...
- Created comprehensive SUMMARY.md with all execution details
- Updated STATE.md: 1/4 plans in phase 1 complete, 16.7% overall progress
- Documented deviation (venv creation) and decisions
- Verified all files and commits exist (self-check passed)
2026-02-11 16:28:03 +08:00
cab2f5fc66
docs(01-data-infrastructure): create phase plan
2026-02-11 16:04:42 +08:00
982f7f5a9b
docs(01-data-infrastructure): research phase domain
2026-02-11 15:56:40 +08:00
f80f384a61
docs: create roadmap (6 phases)
2026-02-11 15:47:36 +08:00
0fb1a9581f
docs: define v1 requirements
2026-02-11 15:31:05 +08:00
bb7bfaedab
docs: complete project research
2026-02-11 14:52:06 +08:00
c0abe8bc6c
chore: add project config
2026-02-11 14:41:35 +08:00
e2c202d689
docs: initialize project
2026-02-11 14:40:36 +08:00