---
phase: 05-output-cli
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
  - src/usher_pipeline/output/visualizations.py
  - src/usher_pipeline/output/reproducibility.py
  - pyproject.toml
  - tests/test_visualizations.py
  - tests/test_reproducibility.py
autonomous: true
must_haves:
  truths:
    - Pipeline generates score distribution histogram with tier color coding as PNG
    - Pipeline generates evidence layer contribution bar chart as PNG
    - Pipeline generates tier breakdown pie chart as PNG
    - Reproducibility report documents scoring parameters, data versions, gene counts per filtering step, and validation metrics
    - Reproducibility report is generated in both JSON (machine-readable) and Markdown (human-readable) formats
  artifacts:
    - path: src/usher_pipeline/output/visualizations.py
      provides: matplotlib/seaborn visualization functions
      exports: [plot_score_distribution, plot_layer_contributions, plot_tier_breakdown, generate_all_plots]
    - path: src/usher_pipeline/output/reproducibility.py
      provides: Reproducibility report generation
      exports: [generate_reproducibility_report, ReproducibilityReport]
    - path: tests/test_visualizations.py
      provides: Tests for visualization file creation
    - path: tests/test_reproducibility.py
      provides: Tests for report content and formatting
  key_links:
    - from: src/usher_pipeline/output/visualizations.py
      to: matplotlib/seaborn
      via: to_pandas() conversion for seaborn compatibility
      pattern: to_pandas.*sns.
    - from: src/usher_pipeline/output/reproducibility.py
      to: provenance tracker and config
      via: reads ProvenanceTracker metadata and PipelineConfig
      pattern: provenance.*create_metadata|config.*model_dump
---
Create visualization and reproducibility report modules: score distribution plots, evidence layer contribution charts, tier breakdowns, and comprehensive reproducibility documentation in JSON+Markdown formats.

Purpose: Provides the visual and textual reporting layer that makes pipeline results interpretable for researchers and satisfies reproducibility requirements for scientific pipelines. Output: src/usher_pipeline/output/visualizations.py, src/usher_pipeline/output/reproducibility.py, and associated tests.

<execution_context> @/Users/gbanyan/.claude/get-shit-done/workflows/execute-plan.md @/Users/gbanyan/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/05-output-cli/05-RESEARCH.md @src/usher_pipeline/config/schema.py @src/usher_pipeline/persistence/provenance.py @src/usher_pipeline/scoring/quality_control.py @src/usher_pipeline/scoring/validation.py

Task 1: Visualization module with matplotlib/seaborn plots

Files: src/usher_pipeline/output/visualizations.py, pyproject.toml, tests/test_visualizations.py

**pyproject.toml**: Add matplotlib and seaborn to the dependencies list:

  - "matplotlib>=3.8.0"
  - "seaborn>=0.13.0"

**visualizations.py**: Create visualization module with 3 plot functions and 1 orchestrator; a consolidated sketch follows the function list below.

Use matplotlib backend "Agg" (non-interactive, safe for headless/CLI use): call matplotlib.use("Agg") before importing pyplot.
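A minimal sketch of that preamble (nothing project-specific is assumed here; the ordering is the only point):

```python
import matplotlib

matplotlib.use("Agg")  # select the non-interactive backend before pyplot loads

import matplotlib.pyplot as plt  # safe to import now; no display is required
import seaborn as sns
```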

  1. plot_score_distribution(df: pl.DataFrame, output_path: Path) -> Path:

    • Converts to pandas via df.to_pandas() (small result set, acceptable overhead per research)
    • Sets seaborn theme: sns.set_theme(style="whitegrid", context="paper")
    • Creates histogram of composite_score colored by confidence_tier
    • Uses sns.histplot(data=pdf, x="composite_score", hue="confidence_tier", hue_order=["HIGH", "MEDIUM", "LOW"], palette={"HIGH": "#2ecc71", "MEDIUM": "#f39c12", "LOW": "#e74c3c"}, bins=30, multiple="stack")
    • Labels: x="Composite Score", y="Candidate Count", title="Score Distribution by Confidence Tier"
    • Saves as PNG at 300 DPI with bbox_inches='tight'
    • CRITICAL: Always call plt.close(fig) after savefig (memory leak pitfall from research)
    • Returns output_path
  2. plot_layer_contributions(df: pl.DataFrame, output_path: Path) -> Path:

    • Counts non-null values per layer score column: gnomad_score, expression_score, annotation_score, localization_score, animal_model_score, literature_score
    • Creates bar chart using seaborn barplot with viridis palette
    • X-axis labels cleaned (remove "_score" suffix), rotated 45 degrees
    • Labels: x="Evidence Layer", y="Candidates with Evidence", title="Evidence Layer Coverage"
    • Saves PNG at 300 DPI, closes figure
    • Returns output_path
  3. plot_tier_breakdown(df: pl.DataFrame, output_path: Path) -> Path:

    • Counts genes per confidence_tier
    • Creates pie chart with percentage labels (autopct='%1.1f%%')
    • Colors match score_distribution palette (green/orange/red for HIGH/MEDIUM/LOW)
    • Title: "Candidate Tier Breakdown"
    • Saves PNG at 300 DPI, closes figure
    • Returns output_path
  4. generate_all_plots(df: pl.DataFrame, output_dir: Path) -> dict[str, Path]:

    • Creates output_dir if it does not exist
    • Calls all 3 plot functions with standard filenames: score_distribution.png, layer_contributions.png, tier_breakdown.png
    • Returns dict mapping plot name to file path
    • Wraps each plot in try/except so one failure doesn't block others (log warning on failure)
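A consolidated sketch of the module under the spec above, not a drop-in implementation; figure sizes, the grey fallback color for unexpected tiers, and the logger usage are assumptions:

```python
# src/usher_pipeline/output/visualizations.py -- sketch
import logging
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # must run before pyplot is imported

import matplotlib.pyplot as plt
import polars as pl
import seaborn as sns

logger = logging.getLogger(__name__)

TIER_PALETTE = {"HIGH": "#2ecc71", "MEDIUM": "#f39c12", "LOW": "#e74c3c"}
LAYER_COLUMNS = [
    "gnomad_score", "expression_score", "annotation_score",
    "localization_score", "animal_model_score", "literature_score",
]


def plot_score_distribution(df: pl.DataFrame, output_path: Path) -> Path:
    """Stacked histogram of composite scores colored by confidence tier."""
    pdf = df.to_pandas()  # small result set; conversion overhead is acceptable
    sns.set_theme(style="whitegrid", context="paper")
    fig, ax = plt.subplots()
    sns.histplot(data=pdf, x="composite_score", hue="confidence_tier",
                 hue_order=["HIGH", "MEDIUM", "LOW"], palette=TIER_PALETTE,
                 bins=30, multiple="stack", ax=ax)
    ax.set(xlabel="Composite Score", ylabel="Candidate Count",
           title="Score Distribution by Confidence Tier")
    fig.savefig(output_path, dpi=300, bbox_inches="tight")
    plt.close(fig)  # prevents figure accumulation across repeated calls
    return output_path


def plot_layer_contributions(df: pl.DataFrame, output_path: Path) -> Path:
    """Bar chart of candidates with non-null evidence per layer."""
    # polars Series.count() counts non-null values only.
    counts = {c.removesuffix("_score"): df[c].count() for c in LAYER_COLUMNS}
    fig, ax = plt.subplots()
    sns.barplot(x=list(counts), y=list(counts.values()),
                hue=list(counts), palette="viridis", legend=False, ax=ax)
    ax.set(xlabel="Evidence Layer", ylabel="Candidates with Evidence",
           title="Evidence Layer Coverage")
    ax.tick_params(axis="x", rotation=45)
    fig.savefig(output_path, dpi=300, bbox_inches="tight")
    plt.close(fig)
    return output_path


def plot_tier_breakdown(df: pl.DataFrame, output_path: Path) -> Path:
    """Pie chart of candidate counts per confidence tier."""
    tiers = df["confidence_tier"].value_counts()  # columns: confidence_tier, count (recent polars)
    labels = tiers["confidence_tier"].to_list()
    colors = [TIER_PALETTE.get(t, "#95a5a6") for t in labels]  # grey fallback: assumption
    fig, ax = plt.subplots()
    ax.pie(tiers["count"].to_list(), labels=labels, colors=colors, autopct="%1.1f%%")
    ax.set_title("Candidate Tier Breakdown")
    fig.savefig(output_path, dpi=300, bbox_inches="tight")
    plt.close(fig)
    return output_path


def generate_all_plots(df: pl.DataFrame, output_dir: Path) -> dict[str, Path]:
    """Render all standard plots; one failure does not block the others."""
    output_dir.mkdir(parents=True, exist_ok=True)
    funcs = {"score_distribution": plot_score_distribution,
             "layer_contributions": plot_layer_contributions,
             "tier_breakdown": plot_tier_breakdown}
    results: dict[str, Path] = {}
    for name, func in funcs.items():
        try:
            results[name] = func(df, output_dir / f"{name}.png")
        except Exception:  # intentional: isolate per-plot failures
            logger.warning("Plot %r failed", name, exc_info=True)
    return results
```

Selecting the backend at module level keeps every entry point headless-safe, and routing all four functions through the same save-and-close tail avoids the figure leak flagged above.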

**tests/test_visualizations.py**: Test file creation.

Create a synthetic DataFrame fixture with ~30 rows, including confidence_tier and all 6 layer score columns (some NULL); one way to build it is sketched below.
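A fixture sketch; the fixture name, the gene_symbol column, the seed, and the one-third NULL rate are all assumptions:

```python
import random

import polars as pl
import pytest

LAYER_COLUMNS = [
    "gnomad_score", "expression_score", "annotation_score",
    "localization_score", "animal_model_score", "literature_score",
]


@pytest.fixture
def tiered_df() -> pl.DataFrame:
    """~30 synthetic candidates with tiers and sparsely populated layer scores."""
    rng = random.Random(42)
    n = 30

    def sparse_column() -> list[float | None]:
        # Roughly a third of each layer column is NULL.
        return [rng.random() if rng.random() > 0.33 else None for _ in range(n)]

    return pl.DataFrame({
        "gene_symbol": [f"GENE{i}" for i in range(n)],
        "composite_score": [rng.random() for _ in range(n)],
        "confidence_tier": [("HIGH", "MEDIUM", "LOW")[i % 3] for i in range(n)],
        **{col: sparse_column() for col in LAYER_COLUMNS},
    })
```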

Tests (a worked example follows the list):

  1. test_plot_score_distribution_creates_file: Verify PNG file created and size > 0
  2. test_plot_layer_contributions_creates_file: Verify PNG file created
  3. test_plot_tier_breakdown_creates_file: Verify PNG file created
  4. test_generate_all_plots_creates_all_files: Verify all 3 PNG files exist in output_dir
  5. test_generate_all_plots_returns_paths: Verify returned dict has 3 entries
  6. test_plots_handle_empty_dataframe: Empty DataFrame produces plots without crashing (edge case)

Run: `cd /Users/gbanyan/Project/usher-exploring && python -m pytest tests/test_visualizations.py -v`

Done when: All 6 visualization tests pass. PNG files are created at 300 DPI. Plots handle edge cases (empty data, all-NULL columns) without crashing. matplotlib figures are properly closed after saving.
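As a shape reference for the tests above, the orchestrator check might look like this (assumes the tiered_df fixture sketched earlier):

```python
from usher_pipeline.output.visualizations import generate_all_plots


def test_generate_all_plots_creates_all_files(tiered_df, tmp_path):
    paths = generate_all_plots(tiered_df, tmp_path)
    assert set(paths) == {"score_distribution", "layer_contributions", "tier_breakdown"}
    for path in paths.values():
        assert path.exists() and path.stat().st_size > 0
```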
Task 2: Reproducibility report module with JSON and Markdown output

Files: src/usher_pipeline/output/reproducibility.py, src/usher_pipeline/output/__init__.py, tests/test_reproducibility.py

**reproducibility.py**: Create reproducibility report generation module.

Define FilteringStep dataclass:

  • step_name: str
  • input_count: int
  • output_count: int
  • criteria: str

Define ReproducibilityReport dataclass:

  • run_id: str (UUID4)
  • timestamp: str (ISO format)
  • pipeline_version: str
  • parameters: dict (scoring weights, thresholds, etc.)
  • data_versions: dict (ensembl_release, gnomad_version, gtex_version, hpa_version)
  • software_environment: dict (python version, polars version, duckdb version, etc.)
  • filtering_steps: list[FilteringStep]
  • validation_metrics: dict (from validation.py output if available)
  • tier_statistics: dict (total, high, medium, low counts)

Methods on ReproducibilityReport (see the combined sketch after this list):

  • to_json(path: Path) -> Path: Write as indented JSON file
  • to_markdown(path: Path) -> Path: Write as human-readable Markdown with tables for filtering steps, parameters section, software versions, tier statistics, validation metrics
  • to_dict() -> dict: Return as plain dict for programmatic access
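A condensed sketch of both dataclasses; the real to_markdown would also emit the Parameters, Data Versions, Tier Statistics, and Validation Metrics sections that the tests below expect, following the same pattern as the filtering-steps table:

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class FilteringStep:
    step_name: str
    input_count: int
    output_count: int
    criteria: str


@dataclass
class ReproducibilityReport:
    run_id: str
    timestamp: str
    pipeline_version: str
    parameters: dict
    data_versions: dict
    software_environment: dict
    filtering_steps: list[FilteringStep] = field(default_factory=list)
    validation_metrics: dict = field(default_factory=dict)
    tier_statistics: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)  # asdict recurses into the nested FilteringStep entries

    def to_json(self, path: Path) -> Path:
        path.write_text(json.dumps(self.to_dict(), indent=2))
        return path

    def to_markdown(self, path: Path) -> Path:
        lines = [
            "# Pipeline Reproducibility Report",
            "",
            f"- Run ID: {self.run_id}",
            f"- Timestamp: {self.timestamp}",
            f"- Pipeline version: {self.pipeline_version}",
            "",
            "## Filtering Steps",
            "",
            "| Step | Input | Output | Criteria |",
            "| --- | ---: | ---: | --- |",
        ]
        lines += [
            f"| {s.step_name} | {s.input_count} | {s.output_count} | {s.criteria} |"
            for s in self.filtering_steps
        ]
        # ...remaining sections (## Parameters, ## Data Versions, ...) follow the same pattern
        path.write_text("\n".join(lines) + "\n")
        return path
```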

Implement generate_reproducibility_report(config: PipelineConfig, tiered_df: pl.DataFrame, provenance: ProvenanceTracker, validation_result: dict | None = None) -> ReproducibilityReport, sketched after the list below:

  • Extracts parameters from config (scoring weights via config.scoring.model_dump(), data_versions via config.versions.model_dump())
  • Computes tier_statistics from tiered_df confidence_tier column
  • Builds filtering_steps from provenance.get_steps() -- each recorded step with gene counts
    • Captures software versions: sys.version, polars.__version__, duckdb.__version__
  • Generates UUID4 run_id
  • If validation_result provided, includes median_percentile, top_quartile_fraction, validation_passed
  • Returns ReproducibilityReport instance
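A sketch under two loud assumptions: provenance.get_steps() yields dicts whose keys match FilteringStep, and the installed distribution is named usher-pipeline (FilteringStep and ReproducibilityReport as sketched above):

```python
import sys
import uuid
from datetime import datetime, timezone
from importlib.metadata import version

import duckdb
import polars as pl


def generate_reproducibility_report(
    config,                      # PipelineConfig
    tiered_df: pl.DataFrame,
    provenance,                  # ProvenanceTracker
    validation_result: dict | None = None,
) -> ReproducibilityReport:
    tier_counts = dict(tiered_df["confidence_tier"].value_counts().iter_rows())
    return ReproducibilityReport(
        run_id=str(uuid.uuid4()),
        timestamp=datetime.now(timezone.utc).isoformat(),
        pipeline_version=version("usher-pipeline"),  # distribution name: assumption
        parameters=config.scoring.model_dump(),
        data_versions=config.versions.model_dump(),
        software_environment={
            "python": sys.version,
            "polars": pl.__version__,
            "duckdb": duckdb.__version__,
        },
        filtering_steps=[FilteringStep(**step) for step in provenance.get_steps()],
        validation_metrics=validation_result or {},
        tier_statistics={
            "total": tiered_df.height,
            "high": tier_counts.get("HIGH", 0),
            "medium": tier_counts.get("MEDIUM", 0),
            "low": tier_counts.get("LOW", 0),
        },
    )
```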

Update `src/usher_pipeline/output/__init__.py`: add generate_reproducibility_report, ReproducibilityReport, generate_all_plots, and the individual plot functions to the package exports (a sketch follows).
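A minimal version of that export surface:

```python
# src/usher_pipeline/output/__init__.py
from usher_pipeline.output.reproducibility import (
    ReproducibilityReport,
    generate_reproducibility_report,
)
from usher_pipeline.output.visualizations import (
    generate_all_plots,
    plot_layer_contributions,
    plot_score_distribution,
    plot_tier_breakdown,
)

__all__ = [
    "ReproducibilityReport",
    "generate_all_plots",
    "generate_reproducibility_report",
    "plot_layer_contributions",
    "plot_score_distribution",
    "plot_tier_breakdown",
]
```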

**tests/test_reproducibility.py**: Test report content.

Create a mock config, a mock provenance tracker, and a synthetic tiered DataFrame (sketched below).
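A sketch of those mocks; every concrete field name and value below is a placeholder, not the real PipelineConfig or ProvenanceTracker shape:

```python
from types import SimpleNamespace
from unittest.mock import MagicMock

import pytest


@pytest.fixture
def mock_config():
    # Only the attributes the report reads are mocked; all values are placeholders.
    scoring = MagicMock()
    scoring.model_dump.return_value = {"gnomad_weight": 0.3, "expression_weight": 0.2}
    versions = MagicMock()
    versions.model_dump.return_value = {"ensembl_release": "110", "gnomad_version": "4.0"}
    return SimpleNamespace(scoring=scoring, versions=versions)


@pytest.fixture
def mock_provenance():
    tracker = MagicMock()
    tracker.get_steps.return_value = [
        {"step_name": "example_filter", "input_count": 20000,
         "output_count": 5000, "criteria": "placeholder criteria"},
    ]
    return tracker
```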

Tests (a worked example follows the list):

  1. test_generate_report_has_all_fields: Report contains run_id, timestamp, pipeline_version, parameters, data_versions, software_environment, tier_statistics
  2. test_report_to_json_parseable: Write JSON, read back with json.load, verify it's valid JSON with expected keys
  3. test_report_to_markdown_has_headers: Markdown output contains "# Pipeline Reproducibility Report", "## Parameters", "## Data Versions", "## Filtering Steps", "## Tier Statistics"
  4. test_report_tier_statistics_match: tier_statistics.total == tiered_df.height, high + medium + low == total
  5. test_report_includes_validation_when_provided: When validation_result dict is passed, report contains validation_metrics section
  6. test_report_without_validation: When validation_result is None, report still generates without error
  7. test_report_software_versions: software_environment contains python, polars, duckdb keys

Run: `cd /Users/gbanyan/Project/usher-exploring && python -m pytest tests/test_reproducibility.py -v`

Done when: All 7 reproducibility tests pass. Report generates in both JSON and Markdown formats. JSON is valid and parseable. Markdown contains all required sections with proper formatting. Tier statistics are accurate. Validation metrics are optional and handled gracefully.
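For example, the tier-statistics check, assuming a report fixture assembled from the mocks and the synthetic DataFrame above:

```python
def test_report_tier_statistics_match(report, tiered_df):
    stats = report.tier_statistics
    assert stats["total"] == tiered_df.height
    assert stats["high"] + stats["medium"] + stats["low"] == stats["total"]
```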
Verification:

  - `python -c "from usher_pipeline.output.visualizations import generate_all_plots; print('viz OK')"` succeeds
  - `python -c "from usher_pipeline.output.reproducibility import generate_reproducibility_report, ReproducibilityReport; print('report OK')"` succeeds
  - `python -m pytest tests/test_visualizations.py tests/test_reproducibility.py -v` -- all tests pass
  - matplotlib Agg backend used (no display required)

<success_criteria>

  • Visualization module produces 3 PNG plots (score distribution, layer contributions, tier breakdown) at 300 DPI
  • Reproducibility report module generates both JSON and Markdown formats with parameters, data versions, filtering steps, tier statistics, and optional validation metrics
  • All tests pass
  • No matplotlib display window opened (Agg backend)

</success_criteria>
After completion, create `.planning/phases/05-output-cli/05-02-SUMMARY.md`