tooluniverse-clinical-guidelines
This skill searches and retrieves clinical practice guidelines from over 12 authoritative sources including NICE, WHO, NCCN, AHA, ADA, SIGN, USPSTF, and IDSA to provide evidence-graded treatment recommendations, dosing protocols, and screening guidance. Use it when you need to find current, authoritative clinical guidance prioritized by source credibility, verify recommendation strength, or confirm whether newer guideline updates exist for a specific condition or treatment decision.
git clone --depth 1 https://github.com/mims-harvard/ToolUniverse /tmp/tooluniverse-clinical-guidelines && cp -r /tmp/tooluniverse-clinical-guidelines/plugin/skills/tooluniverse-clinical-guidelines ~/.claude/skills/tooluniverse-clinical-guidelines
SKILL.md
# Clinical Guidelines Search & Retrieval

## Guideline Hierarchy

Not all guidelines carry equal weight. Evaluate sources in this order:

1. **NICE and WHO**: Evidence-graded, regularly updated, rigorous systematic review process. NICE guidelines include explicit recommendation strength (e.g., "offer" vs "consider").
2. **Society guidelines (AHA, ADA, NCCN, SIGN)**: Expert-consensus panels within a specialty. May lag behind the latest evidence by 1-3 years. Strong within their domain but narrower scope.
3. **Aggregator databases (GIN, TRIP, OpenAlex)**: Index guidelines from multiple societies. Good for breadth and discovery, but you must verify the original source.
4. **Literature databases (PubMed, EuropePMC)**: Return guideline-related publications, not curated guideline text. Useful as a fallback, not a primary source.

**Always check publication date.** A 2015 guideline may be superseded by a 2024 update. When presenting results, include the year prominently and note if newer guidance may exist.

---

## COMPUTE, DON'T DESCRIBE

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do; execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.

## Search Strategy

### Step 1: Start Narrow, Then Broaden

1. Search the **condition name + "guideline"** in NICE, TRIP, and GIN simultaneously (parallel calls).
2. If the question targets a specialty, add the society tool: AHA for cardiology, ADA for diabetes, NCCN for oncology, CPIC for pharmacogenomics.
3. If initial searches return nothing, broaden to the disease category (e.g., "heart failure" instead of "HFpEF with SGLT2 inhibitors").
4. If society-specific tools fail, fall back to PubMed/EuropePMC with `[condition] guideline [year]`.
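The narrow-then-broaden loop with parallel calls can be sketched roughly as follows. This is an illustration only: the search functions are hypothetical stand-ins for whichever guideline tools are in use, and their names, signature (query string in, list of result dicts out), and the stub data are assumptions, not the ToolUniverse API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical signature: a search function takes a query string and
# returns a list of result dicts. Real tool calls would replace these.
SearchFn = Callable[[str], List[dict]]


def search_parallel(sources: Dict[str, SearchFn], query: str) -> Dict[str, List[dict]]:
    """Run one query against every source concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
        return {name: fut.result() for name, fut in futures.items()}


def narrow_then_broaden(
    sources: Dict[str, SearchFn], queries: List[str]
) -> Tuple[Optional[str], Dict[str, List[dict]]]:
    """Try queries from most to least specific; stop at the first with hits."""
    for query in queries:
        results = search_parallel(sources, query)
        hits = {name: found for name, found in results.items() if found}
        if hits:
            return query, hits
    return None, {}


def stub_source(known_terms: List[str]) -> SearchFn:
    """Build a fake source that only 'knows' the given terms (demo data)."""
    def search(query: str) -> List[dict]:
        if any(term in query for term in known_terms):
            return [{"title": query, "year": 2024}]
        return []
    return search


if __name__ == "__main__":
    demo_sources = {
        "source_a": stub_source(["heart failure"]),
        "source_b": stub_source([]),
    }
    used_query, hits = narrow_then_broaden(
        demo_sources,
        ["HFpEF with SGLT2 inhibitors guideline", "heart failure guideline"],
    )
    print(used_query, sorted(hits))
```

Returning the query that finally produced hits makes it easy to report that the search was broadened, so the reader knows the results cover the disease category rather than the original narrow question.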
### Step 2: Search at Least 3 Sources

Always query a minimum of 3 databases to catch guidelines that one source may miss. Prioritize: **NICE > GIN > TRIP > Society-specific > Literature databases**.

### Step 3: Retrieve Full Text When Available

After identifying relevant guidelines from search results, use full-text tools to get recommendation details before synthesizing.

---

## Diagnostic Test Selection Reasoning

When a clinical question asks "which test should be ordered?" or "what is the most appropriate next diagnostic step?", apply this reasoning framework BEFORE searching guidelines.

### Step 1: What Is the Clinical Question Actually Asking?

Diagnostic tests serve different purposes. Identify which one the question demands:

- **Screening**: Detect disease in an asymptomatic population. Prioritize SENSITIVITY (minimize false negatives). Example: ANA for SLE screening.
- **Confirmation**: Confirm a suspected diagnosis. Prioritize SPECIFICITY (minimize false positives). Example: anti-dsDNA or anti-Smith for SLE confirmation.
- **Differentiation**: Distinguish between two diagnoses that look similar. Choose the test that is POSITIVE in one and NEGATIVE in the other. Example: ASO titers to distinguish PSGN from SLE nephritis (both have low complement and hematuria, but only PSGN has elevated ASO).
- **Staging/Prognosis**: Determine disease severity after diagnosis is established. Example: renal biopsy ISN/RPS class for lupus nephritis.
- **Monitoring**: Track response to treatment. Example: anti-dsDNA titers and complement levels in SLE.

### Step 2: Match the Test to the Diagnostic Gap

Ask: "What piece of information am I MISSING that would change management?"
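Whether a result would change management can be made concrete by computing how far the test moves the probability of disease. The sketch below applies Bayes' theorem to get positive and negative predictive values; the function name and the example numbers (95% sensitivity and specificity, 5% pre-test probability) are assumptions chosen for illustration, not values from any guideline.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, pretest: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given pre-test probability.

    All arguments are proportions between 0 and 1. No guard is included
    for degenerate inputs that make a denominator zero.
    """
    true_pos = sensitivity * pretest
    false_neg = (1 - sensitivity) * pretest
    true_neg = specificity * (1 - pretest)
    false_pos = (1 - specificity) * (1 - pretest)
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive result)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative result)
    return ppv, npv


if __name__ == "__main__":
    # Illustrative numbers only.
    ppv, npv = predictive_values(sensitivity=0.95, specificity=0.95, pretest=0.05)
    print(f"PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these inputs the positive predictive value is about 0.50 while the negative predictive value exceeds 0.99, which is why the same test can be a good rule-out tool and a poor rule-in tool at low pre-test probability.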
### Step 3: Sensitivity vs Specificity Decision Matrix

| Scenario | Prioritize | Reasoning |
|----------|-----------|-----------|
| Ruling OUT a dangerous condition | High sensitivity | A negative result reliably excludes the disease |
| Confirming before invasive treatment | High specificity | A positive result reliably confirms the disease |
| Differentiating two similar conditions | Test unique to one | Choose marker present in condition A but absent in condition B |
| Emergency with life-threatening DDx | Fastest available test | Speed trumps perfect accuracy in acute settings |

### Step 4: Common Test Selection Pitfalls

1. **Ordering a test that is positive in BOTH conditions on the differential**: C3 and C4 are low in both SLE and PSGN, so they do not differentiate. Always ask: "Would this test result change my differential?"
2. **Ordering a screening test when a confirmatory test is needed**: ANA is sensitive but not specific for SLE. If you already suspect SLE, order anti-dsDNA or anti-Smith (specific).
3. **Skipping the simple test for the exotic one**: ASO titers are cheap and fast. Do not jump to renal biopsy before checking whether streptococcal infection explains the presentation.
4. **Forgetting temporal context**: PSGN complement normalizes in 6-8 weeks; SLE complement stays persistently low. A single complement level is less useful than a trend.
5. **Ignoring pre-test probability**: Even with 95% sensitivity and 95% specificity, about half of all positive results are false positives when the pre-test probability is only 5%. Consider the clinical picture first.

---

## Lab Test Interpretation Strategy

- Always consider **pre-test probability** before interpreting any result. In a low-prevalence population, a large share of positive results are false positives, even for an accurate test.
- **SnNOut**: A highly **Se**nsitive test, when **N**egative, rules **Out** the disease. Use sensitive tests for screening.
- **SpPIn**: A highly **Sp**ecific test, when **P**ositive, rules **In** the disease. Use specific tests for confirmation.
- For **conflicting results** (e.g., one test positive, another negative): repeat the discordant test, order a different confirmatory test, or re-evaluate the clinical picture and pre-test probability.
- **Likelihood ratios** are more informative than sensitivity or specificity alone. LR+ >10 or LR- <0.1 meaningfully shifts post-test probability.

---

## Surgical Decision Mak
Install and configure ToolUniverse for any use case — MCP server (chat-based), CLI (command line with 9 subcommands), or Python SDK (Coding API with 3 calling patterns). Covers uv/uvx setup, MCP configuration for 12+ AI clients (Cursor, Claude Desktop, Windsurf, VS Code, Codex, Gemini CLI, Trae, Cline, etc.), full CLI reference (tu list/grep/find/info/run/test/status/build/serve), Coding API quickstart, agentic tools, code executor, API key walkthrough, skill installation, and upgrading. Use when user asks how to set up ToolUniverse, which access mode to use (MCP vs CLI vs SDK), configuring MCP servers, using the CLI, troubleshooting installation, upgrading, or mentions installing ToolUniverse or setting up scientific tools. Also triggers for "how do I use ToolUniverse", "what's the best way to access tools", "command line", "tu command", "coding API", "tu build".
Systematic ACMG/AMP germline variant classification with all 28 criteria (PVS1, PS1-4, PM1-6, PP1-5, BA1, BS1-4, BP1-7) for clinical significance. Produces 5-tier verdict (Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign) with cited evidence per criterion. Use for variant interpretation, VUS resolution, and pathogenicity assessment. Combines ClinVar, gnomAD, computational predictors, and gene-mechanism context.
Comprehensive ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) profiling for drug candidates. Integrates ADMET-AI predictions, SwissADME drug-likeness, PubChemTox experimental toxicity, ChEMBL clinical data, Lipinski rule-of-five, and CYP interaction data. Use for drug-likeness assessment, BBB penetration, bioavailability, hepatotoxicity prediction, ADME/PK profiling, or screening compound libraries before lab testing.
Detect and analyze adverse drug event signals using FDA FAERS reports, drug labels, and disproportionality statistics (PRR, ROR, IC). Generates quantitative safety signal scores (0-100) with evidence grading. Use for post-market surveillance, pharmacovigilance, drug safety assessment, regulatory submissions, and detecting rare AE signals not visible in clinical trials.
Map environmental and industrial chemicals to adverse outcome pathways (AOPs) — molecular initiating event to organ-level toxicity. Uses AOPWiki, GHS classification, IARC carcinogen status, and LD50 data. Use for environmental/industrial chemical risk assessment, regulatory-grade hazard characterization, and AOP stressor mapping. Distinct from drug-safety analysis (use tooluniverse-pharmacovigilance for drugs).
Aging biology, cellular senescence, and longevity research. Covers senescence markers (p16/CDKN2A, SASP, SA-beta-gal), aging hallmarks, senolytic drug discovery (dasatinib+quercetin, fisetin, navitoclax), epigenetic clocks, telomere biology, and longevity GWAS. Use for senescence-pathway analysis, age-related disease genetics, senolytic-target discovery, and centenarian-genetics queries. Distinguishes correlative vs causal evidence (knockout, intervention).
Therapeutic antibody engineering and optimization, lead-to-clinical-candidate. Covers sequence humanization (germline alignment, framework retention), affinity maturation, developability (aggregation, stability, PTMs), structure modeling (AlphaFold/PDB CDR analysis), immunogenicity prediction, and manufacturing feasibility. Use for biologic-drug optimization, mAb design review, biosimilar engineering, and clinical-precedent comparison.
Discover novel small-molecule binders for protein targets using structure-based and ligand-based screening. Covers druggability assessment, known-ligand mining (ChEMBL, BindingDB), similarity expansion, ADMET filtering, and synthesis feasibility. Use for hit identification, virtual screening, target-to-compounds workflows, and lead-finding before commit-to-medchem.