- Create ReproducibilityReport dataclass with all metadata fields
- Implement generate_reproducibility_report function
- Extract parameters from PipelineConfig (scoring weights, data versions)
- Capture software environment (Python, polars, duckdb versions)
- Build filtering steps from ProvenanceTracker
- Compute tier statistics from tiered DataFrame
- Support optional validation metrics
- to_json: write indented JSON for machine-readable output
- to_markdown: write headers and tables for human-readable output
- 7 tests covering all report fields, formats, and edge cases
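
The report structure listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual schema: the field names (`run_id`, `parameters`, `software`, `tier_counts`, `validation_metrics`) and the `capture_software` helper are assumptions, and environment capture is limited to the Python version so the example stays standard-library only.

```python
import json
import platform
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ReproducibilityReport:
    # Hypothetical fields; the real dataclass may carry more metadata.
    run_id: str
    parameters: dict
    software: dict = field(default_factory=dict)
    tier_counts: dict = field(default_factory=dict)
    validation_metrics: Optional[dict] = None  # optional, may be absent

    def to_json(self) -> str:
        """Serialize the report as indented JSON (machine-readable)."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def to_markdown(self) -> str:
        """Render the report with headers and tables (human-readable)."""
        lines = [
            f"# Reproducibility report: {self.run_id}",
            "",
            "## Tier statistics",
            "",
            "| Tier | Genes |",
            "| --- | --- |",
        ]
        lines += [f"| {tier} | {n} |" for tier, n in self.tier_counts.items()]
        # Only emit the validation section when metrics were supplied.
        if self.validation_metrics is not None:
            lines += ["", "## Validation metrics", "", "| Metric | Value |", "| --- | --- |"]
            lines += [f"| {k} | {v} |" for k, v in self.validation_metrics.items()]
        return "\n".join(lines)


def capture_software() -> dict:
    """Hypothetical helper: record the interpreter version only."""
    return {"python": platform.python_version()}
```

Keeping `validation_metrics` as `None` by default lets both writers treat validation as optional without a separate report type.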
- Add matplotlib>=3.8.0 and seaborn>=0.13.0 to dependencies
- Create visualizations.py with 3 plot functions and orchestrator
- plot_score_distribution: histogram colored by confidence tier
- plot_layer_contributions: bar chart of evidence layer coverage
- plot_tier_breakdown: pie chart of tier distribution
- Use Agg backend for headless/CLI safety
- All plots saved at 300 DPI with proper figure cleanup
- 6 tests covering file creation, edge cases, and return values
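
The headless-safe plotting pattern described above (Agg backend, 300 DPI, figure cleanup) might look like this sketch. The function signature is an assumption for illustration: it takes a plain dict of tier counts, whereas the real `plot_tier_breakdown` may accept a DataFrame.

```python
from pathlib import Path

import matplotlib

# Select the non-interactive Agg backend before pyplot is imported,
# so plotting works without a display (CLI, CI, HPC nodes).
matplotlib.use("Agg")
import matplotlib.pyplot as plt  # noqa: E402


def plot_tier_breakdown(tier_counts: dict, output_path: Path) -> Path:
    """Save a pie chart of tier counts and return the output path.

    Hypothetical signature; shown only to illustrate the save/cleanup flow.
    """
    fig, ax = plt.subplots(figsize=(6, 6))
    try:
        ax.pie(
            list(tier_counts.values()),
            labels=list(tier_counts.keys()),
            autopct="%1.1f%%",
        )
        ax.set_title("Confidence tier distribution")
        fig.savefig(output_path, dpi=300, bbox_inches="tight")
    finally:
        # Always release the figure, even if saving fails, so repeated
        # calls do not accumulate open figures in memory.
        plt.close(fig)
    return output_path
```

Closing the figure in a `finally` block is one way to get reliable cleanup; the project may achieve the same result differently.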
- test_scoring.py: 7 unit tests for known genes, weight validation, NULL preservation
- test_scoring_integration.py: 3 integration tests for end-to-end pipeline with synthetic data
- Tests verify NULL handling (genes with no evidence get NULL composite score)
- Tests verify known genes rank highly when given high scores
- Tests verify QC detects missing data above thresholds
- All tests use synthetic data (no external API calls, fast, reproducible)
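
The NULL-preservation behaviour these tests check can be illustrated with a small pure-Python sketch. It assumes a weighted average renormalized over the layers that have evidence; that renormalization, the layer names, and the `composite_score` function itself are assumptions, not the pipeline's confirmed scoring rule.

```python
from typing import Mapping, Optional


def composite_score(
    layer_scores: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> Optional[float]:
    """Weighted average over layers that have evidence; None if none do."""
    # Weight validation: reject configurations that do not sum to 1.
    if abs(sum(weights.values()) - 1.0) > 1e-6:
        raise ValueError("scoring weights must sum to 1.0")

    # Missing evidence stays missing: None is never coerced to 0.0.
    available = {name: s for name, s in layer_scores.items() if s is not None}
    if not available:
        return None

    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        return None
    return sum(weights[name] * s for name, s in available.items()) / total_weight
```

Returning `None` rather than `0.0` keeps "no evidence" distinguishable from "evidence of a low score" in downstream ranking and QC.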
- Create protein features data model covering domains, coiled-coils, transmembrane (TM) regions, and cilia motifs
- Implement fetch.py with UniProt REST API and InterPro API queries
- Implement transform.py with feature extraction, motif detection, normalization
- Implement load.py with DuckDB persistence and provenance tracking
- Add CLI protein command following evidence layer pattern
- Add comprehensive unit and integration tests (all passing)
- Handle NULL preservation and List(Null) edge case
- Add get_steps() method to ProvenanceTracker for test compatibility
- Add localization subcommand to evidence command group
- Implement checkpoint-restart pattern for HPA download
- Display summary with evidence type distribution
- Create 17 unit and integration tests (all passing)
- Test HPA parsing, evidence classification, scoring, and DuckDB persistence
- Fix evidence type terminology (computational vs predicted) for consistency
- Mock HTTP calls in integration tests for reproducibility
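
A checkpoint-restart download with an injectable fetcher, which also makes HTTP easy to mock in tests, could be sketched as below. The function name, the `fetch` callable, and the `.part` temporary-file convention are assumptions for illustration, not the project's actual HPA download code.

```python
from pathlib import Path
from typing import Callable


def download_with_checkpoint(
    url: str,
    dest: Path,
    fetch: Callable[[str], bytes],
    force: bool = False,
) -> Path:
    """Download url to dest unless a completed file already exists."""
    # Checkpoint hit: a previous run finished this step, so skip the network.
    if dest.exists() and not force:
        return dest

    # Write to a temporary sibling first, then rename, so an interrupted
    # run never leaves a truncated file that looks like a valid checkpoint.
    tmp = dest.with_name(dest.name + ".part")
    tmp.write_bytes(fetch(url))
    tmp.replace(dest)
    return dest
```

In a test, `fetch` can be a plain function returning canned bytes, so no real HTTP request is made and the run is reproducible.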