feat(01-02): create mapping validation gates with tests

- Add MappingValidator with configurable success rate thresholds (min_success_rate, warn_threshold)
- Add validate_gene_universe for gene count, format, and duplicate checks
- Add save_unmapped_report for manual review output
- Implement 15 comprehensive tests with mocked mygene responses (no real API calls)
- Tests cover: successful mapping, notfound handling, uniprot list parsing, batching, validation gates, universe validation
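
A minimal sketch of the two-threshold gate described above, for reviewers who want the intended behaviour at a glance. Only the parameter names `min_success_rate` and `warn_threshold` come from this commit; `GateResult`, `check_mapping_rate`, and the default values 0.90 and 0.95 are hypothetical and may differ from the actual `MappingValidator`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GateResult:
    """Hypothetical result type; not the ValidationResult from this commit."""
    passed: bool
    success_rate: float
    messages: List[str] = field(default_factory=list)


def check_mapping_rate(total, mapped, min_success_rate=0.90, warn_threshold=0.95):
    """Fail below min_success_rate, warn below warn_threshold, else pass cleanly.

    The default thresholds are assumptions chosen for illustration only.
    """
    if total <= 0:
        return GateResult(False, 0.0, ["no genes were submitted for mapping"])

    rate = mapped / total
    if rate < min_success_rate:
        msg = f"success rate {rate:.1%} is below the minimum {min_success_rate:.1%}"
        return GateResult(False, rate, [msg])
    if rate < warn_threshold:
        msg = f"warning: success rate {rate:.1%} is below {warn_threshold:.1%}"
        return GateResult(True, rate, [msg])
    return GateResult(True, rate)
```

With these assumed defaults, `check_mapping_rate(100, 85)` fails the gate, `check_mapping_rate(100, 92)` passes with a warning, and `check_mapping_rate(100, 99)` passes with no messages.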
2026-02-11 16:33:36 +08:00
parent 98a1a750dd
commit 0200395d9e
3 changed files with 560 additions and 0 deletions


@@ -13,6 +13,11 @@ from usher_pipeline.gene_mapping.universe import (
     fetch_protein_coding_genes,
     GeneUniverse,
 )
+from usher_pipeline.gene_mapping.validator import (
+    MappingValidator,
+    ValidationResult,
+    validate_gene_universe,
+)
 
 __all__ = [
     "GeneMapper",
@@ -20,4 +25,7 @@ __all__ = [
     "MappingReport",
     "fetch_protein_coding_genes",
     "GeneUniverse",
+    "MappingValidator",
+    "ValidationResult",
+    "validate_gene_universe",
 ]
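
The gene count, format, and duplicate checks that the commit message attributes to `validate_gene_universe` could look roughly like the sketch below. It is not the function exported above: `check_universe`, the Ensembl gene ID pattern, and the 19000 to 22000 count bounds are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: IDs are Ensembl gene IDs ("ENSG" followed by 11 digits).
ENSG_PATTERN = re.compile(r"ENSG\d{11}")


def check_universe(gene_ids, min_count=19000, max_count=22000):
    """Return a list of problem descriptions; an empty list means all checks passed.

    The count bounds are hypothetical defaults, not values from the commit.
    """
    problems = []

    # Gene count check: the universe size should fall in the expected range.
    n = len(gene_ids)
    if not min_count <= n <= max_count:
        problems.append(f"gene count {n} is outside [{min_count}, {max_count}]")

    # Format check: every ID must match the assumed pattern in full.
    bad = [g for g in gene_ids if not ENSG_PATTERN.fullmatch(g)]
    if bad:
        problems.append(f"{len(bad)} IDs have an unexpected format, e.g. {bad[0]!r}")

    # Duplicate check: each ID should appear exactly once.
    dupes = sorted(g for g, c in Counter(gene_ids).items() if c > 1)
    if dupes:
        problems.append(f"{len(dupes)} duplicated IDs, e.g. {dupes[0]!r}")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every failed check in a single pass.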