ClaudeWave
Skill · 1.4k repo stars · updated today

tooluniverse-literature-deep-research

This Claude Code skill conducts systematic literature reviews across PubMed, EuropePMC, and bioRxiv by disambiguating research questions, executing collision-aware database searches, grading evidence on a T1-T4 scale, and generating structured reports with source attribution. Use it when conducting meta-analyses, synthesizing evidence for systematic reviews, or producing detailed answers that require comprehensive citation networks and quality-ranked sources.

Install in Claude Code
git clone --depth 1 https://github.com/mims-harvard/ToolUniverse /tmp/tooluniverse-literature-deep-research && mkdir -p ~/.claude/skills && cp -r /tmp/tooluniverse-literature-deep-research/plugin/skills/tooluniverse-literature-deep-research ~/.claude/skills/tooluniverse-literature-deep-research
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Literature Deep Research

Systematic literature research: disambiguate, search with collision-aware queries, grade evidence, produce structured reports.

**KEY PRINCIPLES**: (1) Disambiguate first (2) Right-size deliverable (3) Grade every claim T1-T4 (4) All sections mandatory even if "limited evidence" (5) Source attribution for every claim (6) English-first queries, respond in user's language (7) Report = deliverable, not search log

---

## LOOK UP, DON'T GUESS

Search PubMed/EuropePMC FIRST before reasoning. A published paper beats memory.

**Factoid search strategy:**
1. Extract KEY TERMS (most specific nouns/verbs)
2. `EuropePMC_search_articles(query="term1 term2 term3", limit=5)`
3. No results -> BROADEN (remove most restrictive term)
4. Too many -> NARROW (add specific terms)
5. Answer usually in abstract of top results
6. Failed query -> try DIFFERENT TERMS/synonyms, don't repeat
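
The broaden/narrow loop above can be sketched in Python. This is a minimal sketch under stated assumptions: `search_fn` is a hypothetical stand-in for whatever callable wraps `EuropePMC_search_articles` and is assumed to return a list of hit records; the `max_hits` threshold, the `extra_terms` list, and the convention that the most restrictive term comes last are illustrative choices, not part of the skill.

```python
from typing import Callable, List, Sequence, Tuple


def refine_query(
    terms: Sequence[str],
    extra_terms: Sequence[str],
    search_fn: Callable[[str], List[dict]],
    max_hits: int = 50,
    max_rounds: int = 5,
) -> Tuple[str, List[dict]]:
    """Broaden on zero hits, narrow on too many, stop when results are usable.

    Assumptions (hypothetical): `terms` lists the most restrictive term last,
    `extra_terms` holds optional narrowing terms, and `search_fn(query)`
    returns a list of hit records.
    """
    active = list(terms)
    spare = list(extra_terms)
    query, hits = "", []
    for _ in range(max_rounds):
        query = " ".join(active)
        hits = search_fn(query)
        if not hits and len(active) > 1:
            active.pop()                 # BROADEN: drop the most restrictive term
        elif len(hits) > max_hits and spare:
            active.append(spare.pop(0))  # NARROW: add a more specific term
        else:
            break                        # usable result set, or nothing left to try
    return query, hits
```

The function returns the last query it actually ran together with its hits, so the caller can read the abstracts of the top results or switch to synonyms if the list is still empty.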

---

## COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
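
As an illustration of computing rather than describing, the sketch below calculates a one-sided enrichment p-value from four counts using only the standard library. The function name and its arguments are hypothetical; in a real run the counts would come from data retrieved with ToolUniverse tools, and `scipy.stats.hypergeom.sf(overlap - 1, population, annotated, sample)` should give the same tail probability.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, overlap: int) -> float:
    """One-sided hypergeometric tail P(X >= overlap).

    population: total number of genes considered
    annotated:  genes in the population that carry the annotation
    sample:     size of the gene list being tested
    overlap:    annotated genes observed in the sample
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy counts chosen only to exercise the function, not real data.
    print(enrichment_p_value(population=10, annotated=5, sample=5, overlap=5))
```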

## Workflow

```
Phase 0: Clarify + Mode Select → Phase 1: Disambiguate + Profile → Phase 2: Literature Search → Phase 3: Report
```

---

## Phase 0: Mode Selection

| Mode | When | Deliverable |
|------|------|-------------|
| **Factoid** | Single concrete question | 1-page fact-check report + bibliography |
| **Mini-review** | Narrow topic | 1-3 page narrative |
| **Full Deep-Research** | Comprehensive overview | 15-section report + bibliography |

### Factoid Mode (Fast Path)
```markdown
# [TOPIC]: Fact-check Report
## Question / ## Answer (with evidence rating) / ## Source(s) / ## Verification Notes / ## Limitations
```

### Domain Detection

| Pattern | Domain | Action |
|---------|--------|--------|
| Gene/protein symbol | Biological target | Full bio disambiguation |
| Drug name | Drug | Drug disambiguation (1.5) |
| Disease name | Disease | Disease disambiguation (1.6) |
| CS/ML topic | General academic | Skip bio tools, literature-only |
| Cross-domain | Interdisciplinary | Resolve each entity in its domain |

### Cross-Skill Delegation
- Gene/protein deep-dive: `tooluniverse-target-research`
- Drug profile: `tooluniverse-drug-research`
- Disease profile: `tooluniverse-disease-research`

Use this skill for **literature synthesis**. Use specialized skills for **entity profiling**. For max depth, run both.

---

## Phase 1: Subject Disambiguation + Profile

### 1.1 Biological Target Resolution
```
UniProt_search → UniProt_get_entry_by_accession → UniProt_id_mapping
ensembl_lookup_gene → MyGene_get_gene_annotation
```

### 1.2 Naming Collision Detection
Check first 20 results. If >20% off-topic, build negative filter: `NOT [collision1] NOT [collision2]`.
Gene family: `"ADAR" NOT "ADAR2" NOT "ADARB1"`. Cross-domain: add context terms.
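
The 20-result check can be sketched as follows. This is a rough sketch, assuming that keyword matching on titles is an acceptable proxy for "on-topic"; in practice that judgment is usually made by reading the results. The function names and the `on_topic_terms` argument are hypothetical.

```python
from typing import Iterable, List


def off_topic_fraction(titles: List[str], on_topic_terms: Iterable[str], window: int = 20) -> float:
    """Share of the first `window` titles that mention none of the on-topic terms."""
    sample = titles[:window]
    if not sample:
        return 0.0
    needles = [t.lower() for t in on_topic_terms]
    misses = sum(1 for title in sample if not any(n in title.lower() for n in needles))
    return misses / len(sample)


def needs_negative_filter(titles: List[str], on_topic_terms: Iterable[str], threshold: float = 0.20) -> bool:
    """True when more than `threshold` of the sampled titles look off-topic."""
    return off_topic_fraction(titles, on_topic_terms) > threshold


def negative_filter(collisions: Iterable[str]) -> str:
    """Render collision terms as NOT clauses, e.g. NOT "ADAR2" NOT "ADARB1"."""
    return " ".join(f'NOT "{c}"' for c in collisions)
```

If `needs_negative_filter(...)` is true, append `negative_filter([...])` to the base query and re-run the search.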

### 1.3 Baseline Profile (Bio Targets)
```
InterPro_get_protein_domains, UniProt_get_ptm_processing_by_accession, HPA_get_subcellular_location,
GTEx_get_median_gene_expression, GO_get_annotations_for_gene, Reactome_map_uniprot_to_pathways,
STRING_get_protein_interactions, intact_get_interactions, OpenTargets_get_target_tractability_by_ensemblID
```
GPCR targets: delegate to `tooluniverse-target-research`.

### 1.5 Drug Disambiguation
**Identity**: `OpenTargets_get_drug_chembId_by_generic_name`, `ChEMBL_get_drug`, `PubChem_get_CID_by_compound_name`, `drugbank_get_drug_basic_info_by_drug_name_or_id`
**Targets**: `ChEMBL_get_drug_mechanisms`, `OpenTargets_get_associated_targets_by_drug_chemblId`, `DGIdb_get_drug_gene_interactions`
**Safety**: `OpenTargets_get_drug_adverse_events_by_chemblId`, `OpenTargets_get_drug_indications_by_chemblId`, `search_clinical_trials`

### 1.6 Disease Disambiguation
```
OpenTargets disease search → EFO/MONDO IDs
DisGeNET_get_disease_genes, DisGeNET_search_disease
CTD_get_disease_chemicals
```

### 1.7 Compound Queries (e.g., "metformin in breast cancer")
Resolve both entities, then cross-reference via CTD_get_chemical_gene_interactions, CTD_get_chemical_diseases, OpenTargets drug-target/drug-disease tools. Intersect shared targets/pathways.
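
The intersection step can be sketched with plain sets. This is a minimal sketch, assuming the drug-side and disease-side tool results have already been reduced to lists of gene symbols; the function name and the placeholder symbols are hypothetical, not real findings.

```python
from typing import Iterable, List


def shared_targets(drug_targets: Iterable[str], disease_genes: Iterable[str]) -> List[str]:
    """Gene symbols present in both lists, compared case-insensitively."""
    drug = {g.strip().upper() for g in drug_targets}
    disease = {g.strip().upper() for g in disease_genes}
    return sorted(drug & disease)


if __name__ == "__main__":
    # Placeholder symbols only; real inputs come from the tool calls above.
    print(shared_targets(["GENE_A", "gene_b", "GENE_C"], ["GENE_B", "GENE_D"]))
```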

### 1.8 General Academic / 1.9 Interdisciplinary
Non-bio: skip bio tools, use ArXiv/DBLP/OSF. Cross-domain: resolve bio entities with 1.1-1.3, search CS/general in parallel, merge and cross-reference.

---

## Phase 2: Literature Search

**Methodology stays internal. Report shows findings, not process.**

### 2.1 Query Strategy
**Step 1: Seeds** (15-30 core papers): domain-specific title searches with date/sort filters.
**Step 2: Citation expansion**: `PubMed_get_cited_by`, `EuropePMC_get_citations/references`, `PubMed_get_related`, `SemanticScholar_get_recommendations`, `OpenCitations_get_citations`
**Step 3: Collision-filtered broader queries**: `"[TERM]" AND ([context]) NOT [collision]`
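
The Step 3 template can be filled in programmatically. The helper below is a hypothetical sketch: joining context terms with `OR` and quoting multi-word phrases are assumptions about the target index's query syntax, which should be checked for each database.

```python
from typing import Iterable


def _quote(phrase: str) -> str:
    """Quote multi-word phrases so they are searched as a unit."""
    return f'"{phrase}"' if " " in phrase else phrase


def build_query(term: str, context: Iterable[str], collisions: Iterable[str]) -> str:
    """Fill the template "[TERM]" AND ([context]) NOT [collision]."""
    parts = [f'"{term}"']
    context = list(context)
    if context:
        parts.append("AND (" + " OR ".join(_quote(c) for c in context) + ")")
    parts.extend(f"NOT {_quote(c)}" for c in collisions)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query("ADAR", ["RNA editing", "deaminase"], ["ADAR2", "ADARB1"]))
```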

### 2.2 Literature Tools — core set + adaptive by domain

Run the **core multi-field set on every review** (catches what any single index misses), then add the domain rows that match the subject. Don't fire every source blindly — 6–10 well-chosen indexes beat 20 noisy ones.

**ALWAYS run (core, all disciplines)**: `PubMed_search_articles`, `EuropePMC_search_articles`, `SemanticScholar_search_papers`, and one OpenAlex tool: either `openalex_search_works` (query param `search`/`query`) **or** `openalex_literature_search` (query param `search_keywords`). Pick one OpenAlex tool and use its own parameter name; mixing them silently returns off-topic results.
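
One way to avoid the parameter mix-up is to keep the query parameter name next to each tool name. The mapping below is a sketch that contains only the parameter names stated in this document; `build_args` is a hypothetical helper, and any tool not listed should have its schema checked rather than guessed.

```python
# Query parameter per tool, limited to what this document states.
QUERY_PARAM = {
    "EuropePMC_search_articles": "query",             # as used in the factoid example
    "openalex_search_works": "search",                # document lists `search`/`query`
    "openalex_literature_search": "search_keywords",
}


def build_args(tool: str, text: str, **extra) -> dict:
    """Return keyword arguments for `tool`, using that tool's own query parameter."""
    if tool not in QUERY_PARAM:
        raise KeyError(f"no known query parameter for {tool!r}; check its schema")
    return {QUERY_PARAM[tool]: text, **extra}
```

Failing loudly on an unknown tool is a deliberate choice here: a wrong parameter name is described above as returning off-topic results silently, so an explicit error is easier to notice.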

**Then add by domain:**

| Domain | Add these | Notes |
|--------|-----------|-------|
| Biomedical / clinical | `PMC_search_papers` (full text), `PubTator3_LiteratureSearch` (entity & `relations:` queries), `PubMed_Guidelines_Search` (clinical guidelines) | PubTator normalizes gene/drug/disease entities |
| Biology (ecology/evolution/plant) | **EuropePMC as PRIMARY** + OpenAlex

setup-tooluniverse (Skill)

Install and configure ToolUniverse for any use case — MCP server (chat-based), CLI (command line with 9 subcommands), or Python SDK (Coding API with 3 calling patterns). Covers uv/uvx setup, MCP configuration for 12+ AI clients (Cursor, Claude Desktop, Windsurf, VS Code, Codex, Gemini CLI, Trae, Cline, etc.), full CLI reference (tu list/grep/find/info/run/test/status/build/serve), Coding API quickstart, agentic tools, code executor, API key walkthrough, skill installation, and upgrading. Use when user asks how to set up ToolUniverse, which access mode to use (MCP vs CLI vs SDK), configuring MCP servers, using the CLI, troubleshooting installation, upgrading, or mentions installing ToolUniverse or setting up scientific tools. Also triggers for "how do I use ToolUniverse", "what's the best way to access tools", "command line", "tu command", "coding API", "tu build".

tooluniverse-acmg-variant-classification (Skill)

Systematic ACMG/AMP germline variant classification with all 28 criteria (PVS1, PS1-4, PM1-6, PP1-5, BA1, BS1-4, BP1-7) for clinical significance. Produces 5-tier verdict (Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign) with cited evidence per criterion. Use for variant interpretation, VUS resolution, and pathogenicity assessment. Combines ClinVar, gnomAD, computational predictors, and gene-mechanism context.

tooluniverse-admet-prediction (Skill)

Comprehensive ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) profiling for drug candidates. Integrates ADMET-AI predictions, SwissADME drug-likeness, PubChemTox experimental toxicity, ChEMBL clinical data, Lipinski rule-of-five, and CYP interaction data. Use for drug-likeness assessment, BBB penetration, bioavailability, hepatotoxicity prediction, ADME/PK profiling, or screening compound libraries before lab testing.

tooluniverse-adverse-event-detection (Skill)

Detect and analyze adverse drug event signals using FDA FAERS reports, drug labels, and disproportionality statistics (PRR, ROR, IC). Generates quantitative safety signal scores (0-100) with evidence grading. Use for post-market surveillance, pharmacovigilance, drug safety assessment, regulatory submissions, and detecting rare AE signals not visible in clinical trials.

tooluniverse-adverse-outcome-pathway (Skill)

Map environmental and industrial chemicals to adverse outcome pathways (AOPs) — molecular initiating event to organ-level toxicity. Uses AOPWiki, GHS classification, IARC carcinogen status, and LD50 data. Use for environmental/industrial chemical risk assessment, regulatory-grade hazard characterization, and AOP stressor mapping. Distinct from drug-safety analysis (use tooluniverse-pharmacovigilance for drugs).

tooluniverse-aging-senescence (Skill)

Aging biology, cellular senescence, and longevity research. Covers senescence markers (p16/CDKN2A, SASP, SA-beta-gal), aging hallmarks, senolytic drug discovery (dasatinib+quercetin, fisetin, navitoclax), epigenetic clocks, telomere biology, and longevity GWAS. Use for senescence-pathway analysis, age-related disease genetics, senolytic-target discovery, and centenarian-genetics queries. Distinguishes correlative vs causal evidence (knockout, intervention).

tooluniverse-antibody-engineering (Skill)

Therapeutic antibody engineering and optimization, lead-to-clinical-candidate. Covers sequence humanization (germline alignment, framework retention), affinity maturation, developability (aggregation, stability, PTMs), structure modeling (AlphaFold/PDB CDR analysis), immunogenicity prediction, and manufacturing feasibility. Use for biologic-drug optimization, mAb design review, biosimilar engineering, and clinical-precedent comparison.

tooluniverse-binder-discovery (Skill)

Discover novel small-molecule binders for protein targets using structure-based and ligand-based screening. Covers druggability assessment, known-ligand mining (ChEMBL, BindingDB), similarity expansion, ADMET filtering, and synthesis feasibility. Use for hit identification, virtual screening, target-to-compounds workflows, and lead-finding before commit-to-medchem.