tooluniverse-multiomic-disease-characterization
This skill performs comprehensive disease analysis by integrating genomic, transcriptomic, proteomic, and pathway data to reveal molecular mechanisms at the systems level. Use it when characterizing disease biology across multiple molecular layers, identifying therapeutic targets and biomarker candidates, or generating mechanistic hypotheses that require evidence graded from human studies to computational predictions with multi-omics confidence scoring.
git clone --depth 1 https://github.com/mims-harvard/ToolUniverse /tmp/tooluniverse-multiomic-disease-characterization && cp -r /tmp/tooluniverse-multiomic-disease-characterization/plugin/skills/tooluniverse-multiomic-disease-characterization ~/.claude/skills/tooluniverse-multiomic-disease-characterization
SKILL.md
# Multi-Omics Disease Characterization Pipeline

Characterize diseases across multiple molecular layers (genomics, transcriptomics, proteomics, pathways) to provide systems-level understanding of disease mechanisms, identify therapeutic opportunities, and discover biomarker candidates.

**KEY PRINCIPLES**:

1. **Report-first approach** - Create report file FIRST, then populate progressively
2. **Disease disambiguation FIRST** - Resolve all identifiers before omics analysis
3. **Layer-by-layer analysis** - Systematically cover all omics layers
4. **Cross-layer integration** - Identify genes/targets appearing in multiple layers
5. **Evidence grading** - Grade all evidence as T1 (human/clinical) to T4 (computational)
6. **Tissue context** - Emphasize disease-relevant tissues/organs
7. **Quantitative scoring** - Multi-Omics Confidence Score (0-100)
8. **Druggable focus** - Prioritize targets with therapeutic potential
9. **Biomarker identification** - Highlight diagnostic/prognostic markers
10. **Mechanistic synthesis** - Generate testable hypotheses
11. **Source references** - Every statement must cite tool/database
12. **Completeness checklist** - Mandatory section showing analysis coverage
13. **English-first queries** - Always use English terms in tool calls. Respond in user's language

Multi-omics disease characterization asks: what molecular layers are dysregulated? Genomic mutations → transcriptomic changes → proteomic effects → metabolomic consequences. Concordance across layers strengthens the finding. Discordance reveals regulatory complexity.

## LOOK UP, DON'T GUESS

When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.

---

## COMPUTE, DON'T DESCRIBE

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do; execute it and report actual results.
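
As one instance of computing rather than describing, the cross-layer integration principle (genes appearing in multiple layers) can be sketched as a simple tally. This is a minimal sketch under stated assumptions: the function name `cross_layer_hits` is hypothetical, the gene symbols are placeholders rather than real results, and the layer count is only an illustration of concordance, not the skill's Multi-Omics Confidence Score, whose formula is not given here.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def cross_layer_hits(layers: Dict[str, Set[str]]) -> List[Tuple[str, int, List[str]]]:
    """Rank genes by how many omics layers they appear in.

    `layers` maps a layer name to the set of gene symbols found in that layer.
    Returns (gene, layer_count, supporting_layers) tuples, most concordant first.
    """
    # Count each gene once per layer.
    counts = Counter(gene for genes in layers.values() for gene in set(genes))
    ranked = []
    for gene, n_layers in counts.items():
        supporting = sorted(name for name, genes in layers.items() if gene in genes)
        ranked.append((gene, n_layers, supporting))
    # Sort by descending layer count, then alphabetically for stable output.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked


# Placeholder symbols only (hypothetical input, not retrieved data).
example_layers = {
    "genomics": {"GENE_A", "GENE_B", "GENE_C"},
    "transcriptomics": {"GENE_A", "GENE_C", "GENE_D"},
    "proteomics": {"GENE_A", "GENE_E"},
}

for gene, n_layers, supporting in cross_layer_hits(example_layers):
    print(gene, n_layers, supporting)
```

In practice the sets would be filled from the tool outputs of each phase, and genes supported by only one layer would be reported as such rather than dropped.
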
Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.

## When to Use This Skill

Apply when users:

- Ask about disease mechanisms across omics layers
- Need multi-omics characterization of a disease
- Want to understand disease at the systems biology level
- Ask "What pathways/genes/proteins are involved in [disease]?"
- Need biomarker discovery for a disease
- Want to identify druggable targets from disease profiling
- Ask for integrated genomics + transcriptomics + proteomics analysis
- Need cross-layer concordance analysis
- Ask about disease network biology / hub genes

**NOT for** (use other skills instead):

- Single gene/target validation -> Use `tooluniverse-drug-target-validation`
- Drug safety profiling -> Use `tooluniverse-adverse-event-detection`
- General disease overview -> Use `tooluniverse-disease-research`
- Variant interpretation -> Use `tooluniverse-variant-interpretation`
- GWAS-specific analysis -> Use `tooluniverse-gwas-*` skills
- Pathway-only analysis -> Use `tooluniverse-systems-biology`

---

## Input Parameters

| Parameter | Required | Description | Example |
|-----------|----------|-------------|---------|
| **disease** | Yes | Disease name, OMIM ID, EFO ID, or MONDO ID | `Alzheimer disease`, `MONDO_0004975` |
| **tissue** | No | Tissue/organ of interest | `brain`, `liver`, `blood` |
| **focus_layers** | No | Specific omics layers to emphasize | `genomics`, `transcriptomics`, `pathways` |

---

## Pipeline Overview

The pipeline runs 9 phases sequentially. Each phase uses specific tools documented in detail in `tool-reference.md`.

### Phase 0: Disease Disambiguation (ALWAYS FIRST)

Resolve disease to standard identifiers (MONDO/EFO) for all downstream queries.
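
Because the `disease` parameter may arrive either as a free-text name or as an identifier, a small pre-check before resolution can help keep downstream queries consistent with the underscore ID form (`MONDO_0004975`) this pipeline uses. The following is a minimal sketch, assuming only MONDO and EFO prefixes need handling; the function name `normalize_disease_id` is hypothetical and not part of ToolUniverse.

```python
import re
from typing import Optional

# Accept MONDO/EFO identifiers written with either a colon or an underscore.
_DISEASE_ID = re.compile(r"^(MONDO|EFO)[:_](\d+)$", re.IGNORECASE)


def normalize_disease_id(value: str) -> Optional[str]:
    """Return the underscore-form ID, or None if `value` is not an ID.

    A None result means the input is probably a disease name and should be
    resolved by a name-based lookup instead.
    """
    match = _DISEASE_ID.match(value.strip())
    if match is None:
        return None
    prefix, digits = match.groups()
    return f"{prefix.upper()}_{digits}"


print(normalize_disease_id("MONDO:0004975"))
print(normalize_disease_id("Alzheimer disease"))
```

Inputs that return `None` are treated as names and passed to the name-based disambiguation step rather than being used as IDs.
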

- Primary tool: `OpenTargets_get_disease_id_description_by_name`
- Get description, synonyms, therapeutic areas, disease hierarchy, cross-references
- **CRITICAL**: Disease IDs use underscore format (e.g., `MONDO_0004975`), NOT colon
- If ambiguous, present top 3-5 options and ask user to select

### Phase 1: Genomics Layer

Identify genetic variants, GWAS associations, and genetically implicated genes.

- Tools: `gwas_search_associations` (use `efo_id` for precision, not free-text `disease_trait`), `gwas_get_snps_for_gene`, ClinVar, OpenTargets associated targets
- `gnomad_get_gene_constraints`: gene constraint metrics (pLI, oe_lof) to interpret whether LoF variants are tolerated vs. haploinsufficient
- Get top 10-15 genes with genetic evidence scores; track Ensembl IDs for downstream phases

### Phase 2: Transcriptomics Layer

Identify differentially expressed genes, tissue-specific expression, and expression-based biomarkers.

- `GTEx_get_expression_summary`: baseline expression across 54 tissues (accepts `gene_symbol` directly)
- Tools: Expression Atlas, HPA (tissue expression), EuropePMC scores
- Check expression in disease-relevant tissues for top genes from Phase 1

### Phase 3: Proteomics & Interaction Layer

Map protein-protein interactions, identify hub genes, and characterize interaction networks.

- `UniProt_get_function_by_accession`: protein function narrative (essential for mechanistic context)
- Tools: `STRING_get_network` (param: `identifiers`, `species`=9606), `intact_get_interactions`, HumanBase
- Build PPI network from top 15-20 genes; identify hub genes by degree centrality

### Phase 4: Pathway & Network Layer

Identify enriched biological pathways and cross-pathway connections.

- `ReactomeAnalysis_pathway_enrichment`: identifiers are **newline-separated** (`\n`), NOT space-separated
- `enrichr_gene_enrichment_analysis`: param: `gene_list` (array), `libs` (array). NOTE: `data` field is a JSON string that needs parsing
- `kegg_search_pathway`: pathway keyword search

### Phase 5: Gene Ontology & Functional Annotation

Characterize biological processes, molecular functions, and cellular components.

- Tools: Enrichr (GO libraries), QuickGO, GO annotations, OpenTargets GO
- Run
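
The two Phase 4 formatting notes (newline-separated identifiers for `ReactomeAnalysis_pathway_enrichment`, and the Enrichr `data` field arriving as a JSON string) can be sketched as small helpers. This is a minimal sketch under stated assumptions: both helper names are hypothetical, the tool response is assumed to be a dict, and the shape of the parsed payload is not specified here, so the mock response below is purely illustrative.

```python
import json
from typing import Any, Dict, Iterable


def to_reactome_identifiers(genes: Iterable[str]) -> str:
    """Join gene symbols with newlines, skipping blank entries."""
    cleaned = [g.strip() for g in genes if g and g.strip()]
    return "\n".join(cleaned)


def parse_enrichr_data(response: Dict[str, Any]) -> Any:
    """Return the decoded `data` field of an Enrichr tool response.

    If `data` is a JSON string it is parsed; otherwise it is returned as-is.
    """
    data = response.get("data")
    if isinstance(data, str):
        return json.loads(data)
    return data


# Placeholder symbols and a mock response (assumed shape, not real output).
identifiers = to_reactome_identifiers(["GENE_A", " GENE_B ", "", "GENE_C"])
print(repr(identifiers))

mock_response = {"data": json.dumps({"example_library": []})}
print(parse_enrichr_data(mock_response))
```

Parsing the `data` field once, immediately after the call, lets the rest of the analysis work with ordinary Python objects.
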
Install and configure ToolUniverse for any use case — MCP server (chat-based), CLI (command line with 9 subcommands), or Python SDK (Coding API with 3 calling patterns). Covers uv/uvx setup, MCP configuration for 12+ AI clients (Cursor, Claude Desktop, Windsurf, VS Code, Codex, Gemini CLI, Trae, Cline, etc.), full CLI reference (tu list/grep/find/info/run/test/status/build/serve), Coding API quickstart, agentic tools, code executor, API key walkthrough, skill installation, and upgrading. Use when user asks how to set up ToolUniverse, which access mode to use (MCP vs CLI vs SDK), configuring MCP servers, using the CLI, troubleshooting installation, upgrading, or mentions installing ToolUniverse or setting up scientific tools. Also triggers for "how do I use ToolUniverse", "what's the best way to access tools", "command line", "tu command", "coding API", "tu build".
Systematic ACMG/AMP germline variant classification with all 28 criteria (PVS1, PS1-4, PM1-6, PP1-5, BA1, BS1-4, BP1-7) for clinical significance. Produces 5-tier verdict (Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign) with cited evidence per criterion. Use for variant interpretation, VUS resolution, and pathogenicity assessment. Combines ClinVar, gnomAD, computational predictors, and gene-mechanism context.
Comprehensive ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) profiling for drug candidates. Integrates ADMET-AI predictions, SwissADME drug-likeness, PubChemTox experimental toxicity, ChEMBL clinical data, Lipinski rule-of-five, and CYP interaction data. Use for drug-likeness assessment, BBB penetration, bioavailability, hepatotoxicity prediction, ADME/PK profiling, or screening compound libraries before lab testing.
Detect and analyze adverse drug event signals using FDA FAERS reports, drug labels, and disproportionality statistics (PRR, ROR, IC). Generates quantitative safety signal scores (0-100) with evidence grading. Use for post-market surveillance, pharmacovigilance, drug safety assessment, regulatory submissions, and detecting rare AE signals not visible in clinical trials.
Map environmental and industrial chemicals to adverse outcome pathways (AOPs) — molecular initiating event to organ-level toxicity. Uses AOPWiki, GHS classification, IARC carcinogen status, and LD50 data. Use for environmental/industrial chemical risk assessment, regulatory-grade hazard characterization, and AOP stressor mapping. Distinct from drug-safety analysis (use tooluniverse-pharmacovigilance for drugs).
Aging biology, cellular senescence, and longevity research. Covers senescence markers (p16/CDKN2A, SASP, SA-beta-gal), aging hallmarks, senolytic drug discovery (dasatinib+quercetin, fisetin, navitoclax), epigenetic clocks, telomere biology, and longevity GWAS. Use for senescence-pathway analysis, age-related disease genetics, senolytic-target discovery, and centenarian-genetics queries. Distinguishes correlative vs causal evidence (knockout, intervention).
Therapeutic antibody engineering and optimization, lead-to-clinical-candidate. Covers sequence humanization (germline alignment, framework retention), affinity maturation, developability (aggregation, stability, PTMs), structure modeling (AlphaFold/PDB CDR analysis), immunogenicity prediction, and manufacturing feasibility. Use for biologic-drug optimization, mAb design review, biosimilar engineering, and clinical-precedent comparison.
Discover novel small-molecule binders for protein targets using structure-based and ligand-based screening. Covers druggability assessment, known-ligand mining (ChEMBL, BindingDB), similarity expansion, ADMET filtering, and synthesis feasibility. Use for hit identification, virtual screening, target-to-compounds workflows, and lead-finding before commit-to-medchem.