
tooluniverse-metabolomics-analysis

This skill performs end-to-end metabolomics analysis on mass spectrometry and NMR data, covering metabolite identification against reference databases, quantification with normalization options (TIC or internal standard), batch effect correction, and statistical differential analysis. Use it when analyzing LC-MS or GC-MS metabolomics datasets to identify dysregulated metabolites, perform pathway enrichment, discover biomarkers, or integrate metabolomics with other omics layers such as transcriptomics.

Install in Claude Code
git clone --depth 1 https://github.com/mims-harvard/ToolUniverse /tmp/tooluniverse-metabolomics-analysis && cp -r /tmp/tooluniverse-metabolomics-analysis/plugin/skills/tooluniverse-metabolomics-analysis ~/.claude/skills/tooluniverse-metabolomics-analysis
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Metabolomics Analysis

Comprehensive analysis of metabolomics data from metabolite identification through quantification, statistical analysis, pathway interpretation, and integration with other omics layers.

## Domain Reasoning

Metabolomics quantification depends critically on normalization. Total ion current (TIC) normalization corrects for sample-loading variation and works well when total metabolite abundance is comparable across samples; internal standard normalization is more accurate for targeted analysis where specific metabolite concentrations matter. Missing values in a peak table may reflect signal below the detection limit rather than true absence, and should be imputed or handled explicitly rather than treated as zero. Failing to account for batch effects across instrument runs is a frequent source of spurious differential metabolites.

## LOOK UP DON'T GUESS

- Metabolite identities: use `Metabolite_search` and `Metabolite_get_info` to confirm names, CIDs, and HMDB IDs; never assume identity from m/z alone.
- Pathway memberships: query KEGG, MetaCyc, or Reactome tools; do not list pathways from memory.
- Disease associations: retrieve from CTD via `Metabolite_get_diseases`; do not infer clinical relevance without database evidence.
- CV thresholds and QC criteria: apply the values defined in this workflow (CV < 30%, blank ratio > 3x); do not override with guesses.

---

## When to Use This Skill

**Triggers**:
- User has metabolomics data (LC-MS, GC-MS, NMR)
- Questions about metabolite abundance or concentrations
- Differential metabolite analysis requests
- Metabolic pathway analysis
- Multi-omics integration with metabolomics
- Metabolic biomarker discovery
- Flux balance analysis or metabolic modeling
- Metabolite-enzyme correlation

**Example Questions**:
1. "Analyze this LC-MS metabolomics data for differential metabolites"
2. "Which metabolic pathways are dysregulated between conditions?"
3. "Identify metabolite biomarkers for disease classification"
4. "Correlate metabolite levels with enzyme expression"
5. "Perform pathway enrichment for differential metabolites"
6. "Integrate metabolomics with transcriptomics data"

---

## Core Capabilities

| Capability | Description |
|-----------|-------------|
| **Data Import** | LC-MS, GC-MS, NMR, targeted/untargeted platforms |
| **Metabolite Identification** | Match to HMDB, KEGG, PubChem, spectral libraries |
| **Quality Control** | Peak quality, blank subtraction, internal standard normalization |
| **Normalization** | Probabilistic quotient, total ion current, internal standards |
| **Statistical Analysis** | Univariate and multivariate (PCA, PLS-DA, OPLS-DA) |
| **Differential Analysis** | Identify significant metabolite changes |
| **Pathway Enrichment** | KEGG, Reactome, BioCyc metabolic pathway analysis |
| **Metabolite-Enzyme Integration** | Correlate with expression data |
| **Flux Analysis** | Metabolic flux balance analysis (FBA) |
| **Biomarker Discovery** | Multi-metabolite signatures |

---

## Workflow Overview

```
Input: Metabolomics Data (Peak Table or Spectra)
    |
    v
Phase 1: Data Import & Metabolite Identification
    |-- Load peak table or process raw spectra
    |-- Match features to HMDB, KEGG (accurate mass +/- 5 ppm)
    |-- Confidence scoring (Level 1-4)
    |
    v
Phase 2: Quality Control & Filtering
    |-- CV in QC samples (<30%)
    |-- Blank subtraction (sample/blank > 3)
    |-- Remove features with >50% missing
    |
    v
Phase 3: Normalization
    |-- Sample-wise: TIC, PQN, or internal standards
    |-- Transformation: log2, Pareto, or auto-scaling
    |-- Batch effect correction (if multi-batch)
    |
    v
Phase 4: Exploratory Analysis
    |-- PCA for sample clustering
    |-- PLS-DA for supervised separation
    |-- Outlier detection
    |
    v
Phase 5: Differential Analysis
    |-- t-test / ANOVA / Wilcoxon
    |-- Fold change + FDR correction
    |-- Volcano plots, heatmaps
    |
    v
Phase 6: Pathway Analysis
    |-- Metabolite set enrichment (MSEA)
    |-- KEGG/Reactome pathway mapping
    |-- Pathway topology (hub/bottleneck metabolites)
    |
    v
Phase 7: Multi-Omics Integration
    |-- Metabolite-enzyme Spearman correlation
    |-- Pathway-level concordance scoring
    |-- Metabolic flux inference
    |
    v
Phase 8: Generate Report
    |-- Summary statistics, differential metabolites
    |-- Pathway diagrams, biomarker panel
```

---

## Phase Summaries

### Phase 1: Data Import & Identification
Load peak tables (CSV/TSV) or process raw spectra (mzML). Match features to HMDB by accurate mass (+/- 5 ppm). Assign a confidence level to each identification: Level 1 (match to a reference standard), Level 2 (MS/MS match), Level 3 (accurate mass only), Level 4 (unknown).
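
The +/- 5 ppm window can be sketched as follows. This is a minimal illustration, not the skill's implementation: the function names and the reference table are hypothetical, and the m/z values are made up for the example. A hit found this way is a mass-only match, which corresponds to Level 3 in the scheme above.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_by_mass(observed_mz, candidates, tol_ppm=5.0):
    """Return (name, ppm_error) pairs within +/- tol_ppm, closest match first."""
    hits = []
    for name, theoretical_mz in candidates.items():
        err = ppm_error(observed_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((name, err))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Hypothetical reference m/z values, for illustration only.
reference = {
    "candidate_A": 181.0707,
    "candidate_B": 181.0720,
    "candidate_C": 205.0972,
}

hits = match_by_mass(181.0709, reference)
for name, err in hits:
    print(f"{name}: {err:+.2f} ppm")
```

Only `candidate_A` falls inside the window here; `candidate_B` is roughly 6 ppm away and is rejected. Several candidates can share a window in real data, which is why identity should not be assumed from m/z alone.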

### Phase 2: Quality Control
Assess CV in QC samples (reject >30%), compute blank ratios (keep >3x blank), filter features with >50% missing values. Check internal standard recovery (95-105% acceptable).
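
The three feature filters above can be expressed as a single per-feature check. The sketch below assumes a hypothetical layout (plain lists of intensities, with `None` marking a missing value) and hypothetical function names; the thresholds are the ones stated in this workflow.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    return stdev(values) / mean(values) * 100.0


def passes_qc(qc, samples, blanks, max_cv=30.0, min_blank_ratio=3.0, max_missing=0.5):
    """Apply the missing-value, QC-CV, and blank-ratio filters to one feature.

    qc, blanks: intensities in pooled-QC and blank injections.
    samples: intensities in study samples, None where the value is missing.
    """
    present = [v for v in samples if v is not None]
    if 1.0 - len(present) / len(samples) > max_missing:
        return False  # more than 50% missing
    if cv_percent(qc) > max_cv:
        return False  # unstable across QC injections
    blank_mean = mean(blanks)
    # A feature absent from the blanks cannot fail the blank filter.
    if blank_mean > 0 and mean(present) / blank_mean <= min_blank_ratio:
        return False  # not clearly above background
    return True


# Made-up intensities for illustration only.
ok = passes_qc([1000, 1050, 980], [900, 1100, None, 1200], [50, 60])
noisy = passes_qc([1000, 400, 1600], [900, 1100, 1000, 1200], [50, 60])
background = passes_qc([1000, 1020, 990], [900, 1000, 1100, 1000], [500, 500])
sparse = passes_qc([1000, 1050, 980], [None, None, None, 900], [50, 60])
print(ok, noisy, background, sparse)
```

The missing-value check runs first so that the blank ratio is never computed on an empty list of sample intensities.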

### Phase 3: Normalization
Three methods are available: TIC (simple, but assumes similar total abundance across samples), probabilistic quotient normalization (PQN; robust to large changes in a subset of metabolites, recommended), and internal standard (most accurate when spiked standards are present). Follow with a log2 transform or Pareto scaling.
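
To make the difference between TIC and PQN concrete, here is a minimal sketch with hypothetical function names and made-up intensities. Rows are samples and columns are features. It uses the feature-wise median across samples as the PQN reference, which is one common choice; PQN is also often applied after a total-intensity normalization, a step omitted here for brevity.

```python
from statistics import median


def tic_normalize(samples):
    """Scale each sample so its intensities sum to 1."""
    return [[v / sum(row) for v in row] for row in samples]


def pqn_normalize(samples):
    """Divide each sample by its median quotient against a median reference."""
    n_features = len(samples[0])
    reference = [median(row[j] for row in samples) for j in range(n_features)]
    normalized = []
    for row in samples:
        quotients = [v / ref for v, ref in zip(row, reference) if ref > 0]
        dilution = median(quotients)  # estimated per-sample dilution factor
        normalized.append([v / dilution for v in row])
    return normalized


# Made-up data: B is A at twice the loading; C equals A except the last
# feature, which is genuinely 3x higher.
samples = [
    [100, 200, 300, 400],   # A
    [200, 400, 600, 800],   # B
    [100, 200, 300, 1200],  # C
]

tic = tic_normalize(samples)
pqn = pqn_normalize(samples)
print(pqn[1], pqn[2])
```

In this example PQN maps B back onto A and leaves the unchanged features of C untouched, whereas TIC shrinks every unchanged feature in C because one large feature inflates the total.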

### Phase 4: Exploratory Analysis
PCA reveals sample grouping and batch effects. PLS-DA provides supervised separation (report R2 and Q2 for model quality). Flag and investigate outliers.

### Phase 5: Differential Analysis
Welch's t-test (two groups) or ANOVA (multiple groups) with Benjamini-Hochberg FDR correction. Significance thresholds: adj. p < 0.05 and |log2FC| > 1.0.
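
The Benjamini-Hochberg adjustment and the two thresholds can be sketched in plain Python. The function names and input values below are hypothetical; raw p-values are assumed to have been computed already, for example with `scipy.stats.ttest_ind(a, b, equal_var=False)`, which performs Welch's t-test.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(p_values, log2_fold_changes, alpha=0.05, min_abs_log2fc=1.0):
    """Flag features with adj. p < alpha and |log2FC| > min_abs_log2fc."""
    adj = benjamini_hochberg(p_values)
    return [a < alpha and abs(fc) > min_abs_log2fc
            for a, fc in zip(adj, log2_fold_changes)]


# Made-up per-metabolite statistics for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
log2_fc = [2.1, -0.4, 1.5, -1.8, 0.2]

adj = benjamini_hochberg(p_values)
flags = significant(p_values, log2_fc)
print(flags)
```

Note that the third and fourth metabolites have raw p < 0.05 and large fold changes, yet neither survives FDR correction, while the second passes on adjusted p-value but fails the fold-change threshold.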

### Phase 6: Pathway Analysis
Map differential metabolites to KEGG compound IDs. Perform MSEA for pathway enrichment. Consider topology: metabolites at pathway hubs (high degree/betweenness centrality) have greater impact.

### Phase 7: Multi-Omics Integration
Correlate metabolite levels with enzyme expression (Spearman). Expected: substrate-enzyme negative correlation (consumption), product-enzyme positive correlation (production). Score pathway dysregulation using combined metabolite + gene ev