- Add trio_analysis.py for trio-based variant analysis with de novo detection
- Add clinvar_acmg_annotate.py for ClinVar/ACMG annotation
- Add gwas_comprehensive.py with 201 SNPs across 18 categories
- Add pharmgkb_full_analysis.py for pharmacogenomics analysis
- Add gwas_trait_lookup.py for basic GWAS trait lookup
- Add pharmacogenomics.py for basic PGx analysis
- Remove unused scaffolding code (src/, configs/, docs/, tests/)
- Update README.md with new documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Genomic Consultant
A practical toolkit for analyzing trio WES (whole-exome sequencing) data, covering ClinVar/ACMG annotation, GWAS trait analysis, and pharmacogenomics.
Analysis Scripts
1. Trio Analysis (trio_analysis.py)
Comprehensive trio-based variant analysis with de novo detection, compound-heterozygosity detection, and inheritance-pattern annotation.
python trio_analysis.py <vcf_path> <output_dir>
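To illustrate what de novo detection means at the genotype level, here is a minimal sketch. It is not taken from trio_analysis.py: the function names are hypothetical, and it assumes plain VCF `GT` strings with no depth or genotype-quality filtering, which a real pipeline would almost certainly add.

```python
def parse_gt(gt):
    """Split a VCF GT string such as '0/1' or '0|1' into allele indices.

    Returns None if any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return None
    return tuple(int(a) for a in alleles)


def is_candidate_de_novo(mother_gt, father_gt, proband_gt):
    """Return True if the proband carries a non-reference allele that
    neither parent carries. Sites with a missing genotype are skipped.

    Hypothetical helper: quality filters are deliberately left out.
    """
    mother, father, proband = (parse_gt(g) for g in (mother_gt, father_gt, proband_gt))
    if mother is None or father is None or proband is None:
        return False
    parental_alleles = set(mother) | set(father)
    return any(a != 0 and a not in parental_alleles for a in proband)
```

A site with parents `0/0` and `0/0` and proband `0/1` would be flagged; a site where one parent is `0/1` would not.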
2. ClinVar/ACMG Annotation (clinvar_acmg_annotate.py)
Annotates variants with ClinVar clinical significance and generates ACMG-style evidence tags.
python clinvar_acmg_annotate.py <vcf_path> <output_path> [sample_idx]
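A ClinVar lookup of this kind can be sketched as an index keyed on chromosome, position, and alleles. The example below is an assumption about the general approach, not the script's actual code: the function names are hypothetical, and the only ClinVar-specific detail it relies on is the `CLNSIG` key in the INFO column of the ClinVar VCF. The ACMG-style tagging logic is not reproduced here because the source does not describe it.

```python
def parse_info(info):
    """Parse a VCF INFO string ('K1=V1;K2=V2;FLAG') into a dict.

    Flag-style keys without '=' are stored with the value True.
    """
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def build_clinvar_index(vcf_lines):
    """Map (chrom, pos, ref, alt) to the CLNSIG value of each ClinVar record.

    Hypothetical helper: header lines and records without CLNSIG are skipped.
    """
    index = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt, _qual, _filter, info = line.rstrip("\n").split("\t")[:8]
        clnsig = parse_info(info).get("CLNSIG")
        if clnsig is None:
            continue
        for allele in alt.split(","):
            index[(chrom, int(pos), ref, allele)] = clnsig
    return index
```

Each variant in the trio VCF can then be looked up with `index.get((chrom, pos, ref, alt))`. Chromosome naming (`1` versus `chr1`) has to match between the two files for the lookup to succeed.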
3. GWAS Comprehensive Analysis (gwas_comprehensive.py)
Comprehensive GWAS trait analysis with 201 curated SNPs across 18 categories:
- Gout / Uric acid metabolism
- Kidney disease
- Hearing loss
- Autoimmune diseases
- Cancer risk
- Blood clotting / Thrombosis
- Thyroid disorders
- Bone health / Osteoporosis
- Liver disease (NAFLD)
- Migraine
- Longevity / Aging
- Sleep
- Skin conditions
- Cardiovascular disease
- Metabolic disorders
- Eye conditions
- Neuropsychiatric
- Other traits
python gwas_comprehensive.py <vcf_path> <output_path> [sample_idx]
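For a curated SNP, the core per-sample computation is typically a risk-allele count (0, 1, or 2). The sketch below shows one way to derive it from a VCF genotype. The function name and signature are hypothetical, and it assumes the risk allele is reported on the same strand as the VCF REF/ALT alleles, which should be verified for each SNP.

```python
def count_risk_alleles(gt, ref, alts, risk_allele):
    """Count copies of risk_allele in a VCF genotype.

    gt          -- GT string such as '0/1' or '1|1'
    ref         -- REF allele from the VCF record
    alts        -- list of ALT alleles from the VCF record
    risk_allele -- allele reported as the risk allele for this SNP

    Returns None when the genotype is missing.
    """
    alleles = [ref] + list(alts)
    count = 0
    for idx in gt.replace("|", "/").split("/"):
        if idx == ".":
            return None
        if alleles[int(idx)] == risk_allele:
            count += 1
    return count
```

Running this once per family member at the same site gives the numbers needed for a mother/father/proband comparison.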
4. PharmGKB Full Analysis (pharmgkb_full_analysis.py)
Comprehensive pharmacogenomics analysis using the PharmGKB clinical annotations database.
python pharmgkb_full_analysis.py <vcf_path> <output_path> [sample_idx]
5. GWAS Trait Lookup (gwas_trait_lookup.py)
Original curated GWAS trait lookup (smaller SNP set).
python gwas_trait_lookup.py <vcf_path> <output_path> [sample_idx]
6. Basic Pharmacogenomics (pharmacogenomics.py)
Basic pharmacogenomics analysis with common drug-gene interactions.
Prerequisites
- Python 3.8+
- conda environment with bioinformatics tools:
conda create -n genomics python=3.10
conda activate genomics
conda install -c bioconda bcftools snpeff gatk4
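After activating the environment, a quick check that the tools are on `PATH` can save a failed run later. This is a small sketch using only the standard library. The executable names `snpEff` and `gatk` are assumptions about what the bioconda packages install, so adjust them if your environment differs.

```python
import shutil

# Executable names are assumptions; the bioconda packages are
# bcftools, snpeff, and gatk4.
REQUIRED_TOOLS = ("bcftools", "snpEff", "gatk")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing tools:", ", ".join(missing) if missing else "none")
```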
Reference Databases Required
- ClinVar: VCF from NCBI
- PharmGKB: Clinical annotations TSV
- dbSNP: For rsID annotation
- GRCh37/hg19 reference genome
Data Directory Structure
/Volumes/NV2/
├── genomics_analysis/
│   └── vcf/
│       ├── trio_joint.vcf.gz           # Joint-called VCF
│       ├── trio_joint.rsid.vcf.gz      # With rsID annotations
│       └── trio_joint.snpeff.vcf       # With SnpEff annotations
└── genomics_reference/
    ├── clinvar/
    ├── pharmgkb/
    ├── dbsnp/
    └── gwas_catalog/
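Because the scripts depend on this layout, it can help to verify it before running anything. The following sketch checks for the files and directories listed above under a given root. The function name is hypothetical, and the root is passed in as an argument rather than hard-coded.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED = (
    "genomics_analysis/vcf/trio_joint.vcf.gz",
    "genomics_analysis/vcf/trio_joint.rsid.vcf.gz",
    "genomics_analysis/vcf/trio_joint.snpeff.vcf",
    "genomics_reference/clinvar",
    "genomics_reference/pharmgkb",
    "genomics_reference/dbsnp",
    "genomics_reference/gwas_catalog",
)


def missing_paths(root):
    """Return the expected relative paths that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]
```

Calling `missing_paths("/Volumes/NV2")` returns an empty list when everything is in place.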
Sample Index Mapping
For trio VCF files:
- Index 0: Mother
- Index 1: Father
- Index 2: Proband
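The index refers to the order of the sample columns in the VCF `#CHROM` header line (the columns after `FORMAT`). A minimal sketch for confirming which sample name sits at which index, assuming the mother/father/proband order listed above; the function name is hypothetical:

```python
ROLES = ("mother", "father", "proband")  # order from the index mapping above


def map_trio_samples(header_line):
    """Map trio roles to sample names using the VCF #CHROM header line.

    Sample columns start at position 9, after the FORMAT column.
    """
    columns = header_line.rstrip("\n").split("\t")
    if columns[0] != "#CHROM":
        raise ValueError("not a VCF #CHROM header line")
    samples = columns[9:]
    if len(samples) != len(ROLES):
        raise ValueError(f"expected 3 samples, found {len(samples)}")
    return dict(zip(ROLES, samples))
```

If the joint-called VCF lists samples in a different order, the `sample_idx` argument passed to the scripts has to be adjusted accordingly.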
Output Reports
Each script generates detailed reports including:
- Summary statistics
- Risk variant identification
- Family comparison (for trio data)
- Clinical annotations and recommendations
License
Private use only.