Commit Graph

2 Commits

Author SHA1 Message Date
ee27f3ad2f feat(02-02): add DuckDB loader and CLI evidence command for gnomAD
- load_to_duckdb: Saves constraint DataFrame to gnomad_constraint table with provenance tracking
- query_constrained_genes: Queries constrained genes by LOEUF threshold (validates GCON-03 interpretation)
- evidence_cmd.py: CLI command group with gnomad subcommand (fetch->transform->load orchestration)
- Checkpoint-restart: Skips processing if gnomad_constraint table exists (--force to override)
- Full CLI: usher-pipeline evidence gnomad [--force] [--url URL] [--min-depth N] [--min-cds-pct N]
2026-02-11 18:19:07 +08:00
f33b048635 feat(01-04): add CLI entry point with setup and info commands
- Create click-based CLI with command group (--config, --verbose options)
- Add 'info' command displaying pipeline version, config hash, data source versions
- Add 'setup' command orchestrating full infrastructure flow:
  - Load config -> create store/provenance
  - Fetch gene universe (with checkpoint-restart)
  - Map Ensembl IDs to HGNC + UniProt
  - Validate mapping quality gates
  - Save to DuckDB with provenance sidecar
- Update pyproject.toml entry point to usher_pipeline.cli.main:cli
- Add .gitignore for data/, *.duckdb, build artifacts, provenance files
2026-02-11 16:39:50 +08:00