tooluniverse-infectious-disease

This skill enables rapid identification of drug repurposing candidates and essential therapeutic targets for emerging pathogens by integrating pathogen genomics data from NCBI and BVBRC, host immune response information from IEDB, drug-target interactions from ChEMBL and DGIdb, and published literature from PubMed. Use it when responding to disease outbreaks, characterizing novel pathogens, or identifying antiviral candidates for emerging infectious threats.

Repository: ToolUniverse

Install in Claude Code

git clone --depth 1 https://github.com/mims-harvard/ToolUniverse /tmp/tooluniverse-infectious-disease && cp -r /tmp/tooluniverse-infectious-disease/plugin/skills/tooluniverse-infectious-disease ~/.claude/skills/tooluniverse-infectious-disease

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

## COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
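
As an illustration of computing rather than describing, here is a minimal sketch of an enrichment test written with the standard library only. The function name and its arguments are hypothetical; in practice `scipy.stats.hypergeom` covers the same calculation.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """One-sided enrichment p-value: P(X >= observed).

    population: total items (e.g. all annotated genes)
    successes:  items in the category of interest
    draws:      size of the hit list
    observed:   hits that fall in the category
    """
    upper = min(draws, successes)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return tail / total
```

Report the number the function returns, not a description of the test.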

# Infectious Disease Outbreak Intelligence

Rapid response system for emerging pathogens using taxonomy analysis, target identification, structure prediction, and computational drug repurposing.

**KEY PRINCIPLES**:
1. **Speed is critical** - Optimize for rapid actionable intelligence
2. **Target essential proteins** - Focus on conserved, essential viral/bacterial proteins
3. **Leverage existing drugs** - Prioritize FDA-approved compounds for repurposing
4. **Structure-guided** - Use NvidiaNIM for rapid structure prediction and docking
5. **Evidence-graded** - Grade repurposing candidates by evidence strength
6. **Actionable output** - Prioritized drug candidates with rationale
7. **English-first queries** - Always use English terms in tool calls; respond in user's language

**REASONING STRATEGY — Start Here**:
Start with pathogen identification: what type of organism is it (virus, bacterium, fungus, or parasite)? Then ask:
- What are the essential proteins? (required for replication or viability — cannot be mutated away)
- Which are surface-exposed? (accessible to drugs and antibodies)
- Which are conserved across strains? (targeting conserved regions prevents resistance escape)

These three questions define your drug targets and vaccine candidates. Organisms in the same genus share targets — look up drug precedent for related pathogens before predicting from scratch.
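
The three questions can be sketched as a simple ranking step. The `CandidateProtein` record, its field names, and the 0.9 conservation cutoff are illustrative assumptions rather than part of the skill; every field should be filled from tool lookups.

```python
from dataclasses import dataclass


@dataclass
class CandidateProtein:
    """Hypothetical record; populate from tool results, never from memory."""
    name: str
    essential: bool         # required for replication or viability
    surface_exposed: bool   # accessible to drugs and antibodies
    conservation: float     # fraction of strains sharing the region, 0.0 to 1.0


def rank_targets(proteins, min_conservation=0.9):
    """Order candidates by how many of the three questions they satisfy.

    The 0.9 cutoff is an assumption made for this sketch. Ties are broken
    by the raw conservation value.
    """
    def criteria_met(p):
        return (
            int(p.essential)
            + int(p.surface_exposed)
            + int(p.conservation >= min_conservation)
        )

    return sorted(
        proteins,
        key=lambda p: (criteria_met(p), p.conservation),
        reverse=True,
    )
```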

**LOOK UP DON'T GUESS**: Never assume a pathogen's taxonomy, genome size, or protein function. Always call `BVBRC_search_taxonomy` or `UniProt_search` first. Even well-known pathogens have strains with different drug susceptibility profiles — look up the specific strain when known.

---

## When to Use

Apply when user asks:
- "New pathogen detected - what drugs might work?"
- "Emerging virus [X] - therapeutic options?"
- "Drug repurposing candidates for [pathogen]"
- "What do we know about [novel coronavirus/bacteria]?"
- "Essential targets in [pathogen] for drug development"
- "Can we repurpose [drug] against [pathogen]?"

---

## Critical Workflow Requirements

### 1. Report-First Approach (MANDATORY)

1. Create `[PATHOGEN]_outbreak_intelligence.md` FIRST with section headers
2. Progressively update as data is gathered
3. Output separate files: `[PATHOGEN]_drug_candidates.csv`, `[PATHOGEN]_target_proteins.csv`
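
A minimal sketch of the report-first step, assuming a Python helper is used to create the files. The section titles and CSV column names are illustrative assumptions modeled on the workflow phases; only the three file names come from the list above.

```python
import csv
from pathlib import Path

# Assumed section headers, one per workflow output.
SECTIONS = [
    "Pathogen Profile",
    "Target Proteins",
    "Target Structures",
    "Drug Repurposing Candidates",
    "Pathway Analysis",
    "Literature Intelligence",
    "Recommended Immediate Actions",
]

# Assumed column layouts for the two companion CSV files.
CSV_COLUMNS = {
    "drug_candidates": ["drug", "target", "evidence_grade", "rationale", "source"],
    "target_proteins": ["target", "uniprot_id", "essentiality", "conservation", "source"],
}


def init_report(pathogen, out_dir="."):
    """Create the report skeleton and empty CSV files before gathering any data."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    lines = [f"# {pathogen} Outbreak Intelligence", ""]
    for section in SECTIONS:
        lines += [f"## {section}", "", "_Pending_", ""]
    report = out / f"{pathogen}_outbreak_intelligence.md"
    report.write_text("\n".join(lines), encoding="utf-8")

    for stem, columns in CSV_COLUMNS.items():
        path = out / f"{pathogen}_{stem}.csv"
        with open(path, "w", newline="", encoding="utf-8") as fh:
            csv.writer(fh).writerow(columns)
    return report
```

Each `_Pending_` placeholder is then replaced as the corresponding phase completes.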

### 2. Citation Requirements (MANDATORY)

Every finding must have inline source attribution:
```markdown
### Target: RNA-dependent RNA polymerase (RdRp)
- **UniProt**: P0DTD1 (NSP12)
- **Essentiality**: Required for replication
*Source: UniProt via `UniProt_search`, literature review*
```
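
One way to enforce the attribution rule is to build every finding through a helper that refuses to emit a block without a source. This is a sketch; `format_finding` is a hypothetical name, not a ToolUniverse function.

```python
def format_finding(title, facts, sources):
    """Render a finding as markdown with a mandatory inline source line.

    facts:   mapping of label -> value, rendered as bold-labelled bullets
    sources: list of source strings; an empty list is rejected
    """
    if not sources:
        raise ValueError("every finding needs at least one source")
    lines = [f"### {title}"]
    lines += [f"- **{label}**: {value}" for label, value in facts.items()]
    lines.append(f"*Source: {', '.join(sources)}*")
    return "\n".join(lines)
```

Called with the RdRp facts from the example above, it reproduces that block line for line.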

---

## Phase 0: Tool Verification

### Known Parameter Corrections

| Tool | WRONG Parameter | CORRECT Parameter |
|------|-----------------|-------------------|
| `NCBIDatasets_get_taxonomy` | `name` | `tax_id` (integer) or use `BVBRC_search_taxonomy` for keyword search |
| `UniProt_search` | `name` | `query` |
| `ChEMBL_search_targets` | `query`, `target` | `pref_name__contains` (substring match) |
| `get_diffdock_info` | `protein_file` | `protein` (content) |
| `drugbank_full_search` | _(may fail)_ | Use `drugbank_vocab_search` as primary DrugBank lookup |

> **PubMed tip**: Use `sort="relevance"` (default) not `sort="pub_date"` — date-sorted queries can return empty for narrow topics. Tool name: `PubMed_search_articles`.
>
> **FDA labels**: Use `FDA_get_drug_label_info_by_field_value` with targeted `return_fields` to avoid oversized responses from `OpenFDA_search_drug_labels`.
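
The corrections table can be applied mechanically before a call is issued. The sketch below assumes arguments are held in a plain dict; `correct_arguments` is a hypothetical helper, and only the tool and parameter names come from the table. Corrections that need more than a rename are reported instead of guessed.

```python
# Straight renames taken from the corrections table.
PARAM_RENAMES = {
    "UniProt_search": {"name": "query"},
    "ChEMBL_search_targets": {
        "query": "pref_name__contains",
        "target": "pref_name__contains",
    },
}

# Corrections that change the value, not only the key.
MANUAL_FIXES = {
    ("NCBIDatasets_get_taxonomy", "name"):
        "pass an integer tax_id, or use BVBRC_search_taxonomy for keyword search",
    ("get_diffdock_info", "protein_file"):
        "pass the file content as protein",
}


def correct_arguments(tool, arguments):
    """Return a copy of `arguments` with known-wrong parameter names fixed."""
    renames = PARAM_RENAMES.get(tool, {})
    fixed = {}
    for key, value in arguments.items():
        if (tool, key) in MANUAL_FIXES:
            hint = MANUAL_FIXES[(tool, key)]
            raise ValueError(f"{tool}: '{key}' is not accepted; {hint}")
        fixed[renames.get(key, key)] = value
    return fixed
```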

---

## Workflow Overview

```
Phase 1: Pathogen Identification
├── Taxonomic classification (NCBI Taxonomy)
├── Closest relatives (for knowledge transfer)
├── Genome/proteome availability
└── OUTPUT: Pathogen profile
    |
Phase 2: Target Identification
├── Essential genes/proteins (UniProt)
├── Conservation across strains
├── Druggability assessment (ChEMBL)
└── OUTPUT: Prioritized target list (scored by essentiality/conservation/druggability/precedent)
    |
Phase 3: Structure Prediction (NvidiaNIM)
├── AlphaFold2/ESMFold for targets
├── Binding site identification
├── Quality assessment (pLDDT)
└── OUTPUT: Target structures (docking-ready if pLDDT > 70)
    |
Phase 4: Drug Repurposing Screen
├── Approved drugs for related pathogens (ChEMBL)
├── Broad-spectrum antivirals/antibiotics
├── Docking screen (get_diffdock_info)
└── OUTPUT: Ranked candidate drugs
    |
Phase 4.5: Pathway Analysis
├── KEGG: Pathogen metabolism pathways
├── Essential metabolic targets
├── Host-pathogen interaction pathways
└── OUTPUT: Pathway-based drug targets
    |
Phase 5: Literature Intelligence
├── PubMed: Published outbreak reports
├── BioRxiv/MedRxiv: Recent preprints (CRITICAL for outbreaks)
├── ArXiv: Computational/ML preprints
├── OpenAlex: Citation tracking
├── ClinicalTrials.gov: Active trials
└── OUTPUT: Evidence synthesis
    |
Phase 6: Report Synthesis
├── Top drug candidates with evidence grades
├── Clinical trial opportunities
├── Recommended immediate actions
└── OUTPUT: Final report
```
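
The Phase 3 quality gate (docking-ready if pLDDT > 70) can be sketched as follows. It assumes the predicted structure arrives as PDB text with pLDDT stored in the B-factor column on a 0 to 100 scale, as AlphaFold-style outputs do, and that the mean over C-alpha atoms is the value being compared. The workflow does not state either detail, so treat both as assumptions.

```python
def mean_plddt(pdb_text):
    """Average the B-factor column over C-alpha ATOM records.

    Assumes pLDDT is stored in PDB columns 61-66 on a 0-100 scale.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name in fixed-width PDB format.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return sum(scores) / len(scores)


def docking_ready(pdb_text, threshold=70.0):
    """Apply the pLDDT > 70 gate from the workflow overview."""
    return mean_plddt(pdb_text) > threshold
```

A per-residue check around the binding site would be stricter than the global mean; which one the skill intends is not specified here.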

---

## Phase Summaries

### Phase 1: Pathogen Identification
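Classify the pathogen via NCBI Taxonomy: `NCBIDatasets_get_taxonomy` takes an integer `tax_id`, so use `BVBRC_search_taxonomy` for keyword search. Identify related pathogens with existing drugs for knowledge transfer. Determine genome/proteome availability.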

**Genome assembly availability and QC**: After classifying the pathogen, use `NCBIDatasets_list_genomes_by_taxon` (params `taxon` as tax_id, `limit`, `reference_only`) to find the reference genome, `NCBIDatasets_get_genome_assembly` (param `accession`, e.g. "GCF_000005845.2") for assembly metrics (length, N50, GC%, contig/chromosome counts), and `NCBIDatasets_get_sequence_reports` (param `accession`) to map replicons (chromosomes/plasmids wit
