ClaudeWave
Skill · 1.4k repo stars · updated today

tooluniverse-chemical-sourcing

This skill resolves chemical compound identity through PubChem and ChEMBL, then searches commercial vendor databases including ZINC, Enamine, eMolecules, and Mcule to identify sources, compare pricing and availability, and locate purchasable analogs when exact compounds are unavailable. Use it for chemical procurement, virtual library curation, and identifying where to purchase specific compounds for synthesis planning.

Install in Claude Code
git clone --depth 1 https://github.com/mims-harvard/ToolUniverse /tmp/tooluniverse-chemical-sourcing && cp -r /tmp/tooluniverse-chemical-sourcing/plugin/skills/tooluniverse-chemical-sourcing ~/.claude/skills/tooluniverse-chemical-sourcing
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# Chemical Compound Sourcing & Procurement

Pipeline for identifying, sourcing, and purchasing chemical compounds from commercial vendors. Resolves compound identity through PubChem/ChEMBL, searches multiple vendor databases (ZINC, Enamine, eMolecules, Mcule), compares pricing and availability, and identifies purchasable analogs when exact compounds are unavailable.

**Guiding principles**:
1. **Identity first** -- confirm the compound's structure (SMILES, InChI) before searching vendors; names can be ambiguous
2. **Multi-vendor comparison** -- always check multiple sources; pricing and stock vary significantly
3. **Analog fallback** -- if the exact compound is unavailable, search for close analogs
4. **Purity and quantity awareness** -- note catalog purity grades and minimum order quantities
5. **Structure over name** -- vendor searches by SMILES/InChI are more reliable than name searches
6. **English-first queries** -- use English compound names in tool calls

## LOOK UP, DON'T GUESS
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.

---

## COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.

## When to Use

Typical triggers:
- "Where can I buy [compound]?"
- "Find commercial sources for [SMILES]"
- "Compare prices for [compound] across vendors"
- "Is [compound] commercially available?"
- "Find purchasable analogs of [compound]"
- "I need [quantity] of [compound] -- who sells it?"
- "Search ZINC/Enamine for [compound]"

**Not this skill**: For ADMET/toxicity assessment, use `tooluniverse-admet-prediction`. For drug-target interaction analysis, use `tooluniverse-drug-target-validation`.

---

## Core Databases

| Database | Scope | Best For |
|----------|-------|----------|
| **ZINC** | 230M+ purchasable compounds; aggregates vendors | Broadest coverage; substructure/similarity search; free |
| **Enamine** | ~4M in-stock, 30B+ REAL (make-on-demand) | Large in-stock library; fast delivery; building blocks |
| **eMolecules** | Multi-vendor aggregator; 8M+ compounds | Cross-vendor comparison; pricing transparency |
| **Mcule** | 40M+ compounds; one-stop purchasing | Integrated ordering; quote generation |
| **PubChem** | 110M+ compounds; identity resolution | Authoritative compound identification; CID lookup |
| **ChEMBL** | 2.4M+ bioactive molecules | Bioactivity context for sourced compounds |

---

## Workflow Overview

```
Phase 0: Compound Identity Resolution
  Name/SMILES/CAS -> PubChem CID -> canonical SMILES
    |
Phase 1: Vendor Search
  Query ZINC, Enamine, eMolecules, Mcule
    |
Phase 2: Price & Availability Comparison
  Catalog numbers, pricing, stock status, purity
    |
Phase 3: Analog Search (if needed)
  Similarity search for purchasable alternatives
    |
Phase 4: Bioactivity Context (optional)
  ChEMBL activity data for sourced compounds
    |
Phase 5: Order Summary
  Consolidated vendor comparison table
```
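
Phases 2, 3, and 5 come down to normalizing vendor offers, ranking them, and falling back to analogs when nothing qualifies. The following is a minimal sketch of that decision logic, not the skill's actual implementation. The `Offer` record, its field names, the 95% purity floor, and the helper names `rank_offers` and `needs_analog_search` are all assumptions made for illustration; real vendor tool outputs will use different keys and must be mapped first.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Hypothetical normalized vendor offer (field names are assumptions)."""
    vendor: str
    catalog_number: str
    price_usd: float
    pack_mg: float
    purity_pct: float
    in_stock: bool


def rank_offers(offers, min_purity_pct=95.0):
    """Keep in-stock offers that meet the purity floor, cheapest per mg first."""
    eligible = [
        o for o in offers
        if o.in_stock and o.purity_pct >= min_purity_pct
    ]
    return sorted(eligible, key=lambda o: o.price_usd / o.pack_mg)


def needs_analog_search(offers, min_purity_pct=95.0):
    """Phase 3 trigger: True when no offer survives the filters."""
    return not rank_offers(offers, min_purity_pct)


if __name__ == "__main__":
    # Placeholder values for illustration only, not real vendor quotes.
    demo = [
        Offer("Vendor A", "A-1", 120.0, 100.0, 98.0, True),
        Offer("Vendor B", "B-7", 40.0, 25.0, 95.0, True),
        Offer("Vendor C", "C-3", 10.0, 50.0, 90.0, True),
    ]
    for offer in rank_offers(demo):
        print(offer.vendor, round(offer.price_usd / offer.pack_mg, 2), "USD/mg")
```

Ranking by price per milligram rather than list price keeps pack sizes comparable. Minimum order quantities would still need a separate check.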

---

## Phase Details

### Phase 0: Compound Identity Resolution

**Objective**: Establish unambiguous compound identity before vendor searches.

**Tools**:
- `PubChem_get_CID_by_compound_name` -- resolve name to CID
  - Input: `name` (compound name)
  - Output: `{IdentifierList: {CID: [...]}}`
- `PubChem_get_compound_properties_by_CID` -- get SMILES, MW, formula
  - Input: `cid` (PubChem CID), `properties` (comma-separated list)
  - Output: `{CID, MolecularWeight, ConnectivitySMILES, IUPACName}`
- `ChEMBL_get_molecule` -- get ChEMBL compound details
  - Input: `molecule_chembl_id` (ChEMBL ID) or search by name
  - Output: SMILES, molecular properties, synonyms

**Workflow**:
1. If user provides a name: resolve to PubChem CID, then get SMILES
2. If user provides SMILES: use directly (optionally verify via PubChem)
3. If user provides CAS number: search PubChem by name (CAS numbers work as search terms)
4. Record: canonical SMILES, molecular weight, molecular formula, IUPAC name

**Important**: PubChem `ConnectivitySMILES` (not `CanonicalSMILES`) is the correct property name. Always confirm the SMILES matches the intended compound before proceeding.
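
The input-routing step in the workflow above can be sketched as follows. This is an illustrative sketch, not part of the skill: `is_valid_cas` and `first_cid` are hypothetical helper names. The CAS check uses the standard CAS Registry check-digit rule. The response shape passed to `first_cid` is the `{IdentifierList: {CID: [...]}}` output listed for `PubChem_get_CID_by_compound_name`.

```python
import re

# CAS Registry Numbers look like 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(text):
    """Return True if text is a well-formed CAS number with a valid check digit."""
    match = CAS_PATTERN.match(text.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight digits 1, 2, 3, ... from right to left, then take the sum mod 10.
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check_digit


def first_cid(response):
    """Extract the first CID from a name-lookup response, or None if empty."""
    cids = response.get("IdentifierList", {}).get("CID", [])
    return cids[0] if cids else None


if __name__ == "__main__":
    print(is_valid_cas("50-78-2"))   # well-formed CAS number with matching check digit
    print(is_valid_cas("aspirin"))   # not a CAS number; treat as a name
    print(first_cid({"IdentifierList": {"CID": [2244]}}))
```

A string that passes `is_valid_cas` can be sent to the name lookup as a search term, as step 3 describes. A `None` from `first_cid` means the name did not resolve, so ask the user for a structure rather than guessing.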

### Phase 1: Vendor Search

**Objective**: Search all available vendor databases for the target compound.

**Tools**:
- `ZINC_search_compounds` -- search ZINC by name or SMILES
  - Input: `query` (name or SMILES), optional `catalog`, `limit`
  - Output: ZINC IDs, vendor info, purchasability status
- `ZINC_get_compound` -- get detailed compound info from ZINC
  - Input: `zinc_id` (ZINC identifier)
  - Output: vendors, catalogs, pricing, SMILES
- `Enamine_search_catalog` -- search Enamine catalog
  - Input: `query` (name or SMILES), optional `catalog_type`, `limit`
  - Output: catalog numbers, availability, pricing
- `Enamine_get_compound` -- get Enamine compound details
  - Input: `compound_id` (Enamine catalog number)
  - Output: structure, pricing, stock status, delivery time
- `eMolecules_search` -- search across multiple vendors
  - Input: `query` (name or SMILES), optional `limit`
  - Output: vendor list, catalog numbers, pricing
- `eMolecules_get_compound` -- get eMolecules compound details
  - Input: `compound_id` (eMolecules ID)
  - Output: vendors, pricing tiers, purity
- `Mcule_get_compound` -- search Mcule database
  - Input: `query` (name or SMILES), optional `limit`
  - Output: Mcule IDs, availability, pricing
- `Mcule_get_compound` -- get Mcule compound details
  - Input: `compound_id` (Mcule ID)
  - Output: pricing, delivery, purity, catalog number

**Workflow**:
1. Search all four vendor databases in parallel using SMILES (preferred) or name
2. For each hit, retrieve detailed compound info (pricing, stock, purity)
3. De
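
Steps 1 and 2 above, querying every vendor in parallel and collecting the hits per vendor, can be sketched with the standard library. This is a sketch under stated assumptions, not the skill's implementation. `search_all_vendors` is a hypothetical helper, and each entry in `searchers` is assumed to be a callable that wraps one of the vendor search tools listed above and returns a list of hits. The stub functions stand in for real tool calls.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all_vendors(smiles, searchers):
    """Run every vendor search concurrently.

    Returns {vendor_name: {"hits": list, "error": str or None}} so that one
    failing vendor does not hide results from the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=len(searchers) or 1) as pool:
        futures = {name: pool.submit(fn, smiles)
                   for name, fn in searchers.items()}
        for name, future in futures.items():
            try:
                results[name] = {"hits": future.result(), "error": None}
            except Exception as exc:  # keep going if a single vendor fails
                results[name] = {"hits": [], "error": str(exc)}
    return results


if __name__ == "__main__":
    # Stubs standing in for real vendor tool wrappers (hypothetical).
    def stub_hit(smiles):
        return [{"id": "EXAMPLE-1", "smiles": smiles}]

    def stub_fail(smiles):
        raise TimeoutError("vendor API timed out")

    out = search_all_vendors("CCO", {"zinc": stub_hit, "mcule": stub_fail})
    for vendor, result in out.items():
        print(vendor, len(result["hits"]), result["error"])
```

Recording errors per vendor, rather than raising, keeps a partial comparison usable when one database is unreachable.
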
setup-tooluniverse · Skill

Install and configure ToolUniverse for any use case — MCP server (chat-based), CLI (command line with 9 subcommands), or Python SDK (Coding API with 3 calling patterns). Covers uv/uvx setup, MCP configuration for 12+ AI clients (Cursor, Claude Desktop, Windsurf, VS Code, Codex, Gemini CLI, Trae, Cline, etc.), full CLI reference (tu list/grep/find/info/run/test/status/build/serve), Coding API quickstart, agentic tools, code executor, API key walkthrough, skill installation, and upgrading. Use when user asks how to set up ToolUniverse, which access mode to use (MCP vs CLI vs SDK), configuring MCP servers, using the CLI, troubleshooting installation, upgrading, or mentions installing ToolUniverse or setting up scientific tools. Also triggers for "how do I use ToolUniverse", "what's the best way to access tools", "command line", "tu command", "coding API", "tu build".

tooluniverse-acmg-variant-classification · Skill

Systematic ACMG/AMP germline variant classification with all 28 criteria (PVS1, PS1-4, PM1-6, PP1-5, BA1, BS1-4, BP1-7) for clinical significance. Produces 5-tier verdict (Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign) with cited evidence per criterion. Use for variant interpretation, VUS resolution, and pathogenicity assessment. Combines ClinVar, gnomAD, computational predictors, and gene-mechanism context.

tooluniverse-admet-prediction · Skill

Comprehensive ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) profiling for drug candidates. Integrates ADMET-AI predictions, SwissADME drug-likeness, PubChemTox experimental toxicity, ChEMBL clinical data, Lipinski rule-of-five, and CYP interaction data. Use for drug-likeness assessment, BBB penetration, bioavailability, hepatotoxicity prediction, ADME/PK profiling, or screening compound libraries before lab testing.

tooluniverse-adverse-event-detection · Skill

Detect and analyze adverse drug event signals using FDA FAERS reports, drug labels, and disproportionality statistics (PRR, ROR, IC). Generates quantitative safety signal scores (0-100) with evidence grading. Use for post-market surveillance, pharmacovigilance, drug safety assessment, regulatory submissions, and detecting rare AE signals not visible in clinical trials.

tooluniverse-adverse-outcome-pathway · Skill

Map environmental and industrial chemicals to adverse outcome pathways (AOPs) — molecular initiating event to organ-level toxicity. Uses AOPWiki, GHS classification, IARC carcinogen status, and LD50 data. Use for environmental/industrial chemical risk assessment, regulatory-grade hazard characterization, and AOP stressor mapping. Distinct from drug-safety analysis (use tooluniverse-pharmacovigilance for drugs).

tooluniverse-aging-senescence · Skill

Aging biology, cellular senescence, and longevity research. Covers senescence markers (p16/CDKN2A, SASP, SA-beta-gal), aging hallmarks, senolytic drug discovery (dasatinib+quercetin, fisetin, navitoclax), epigenetic clocks, telomere biology, and longevity GWAS. Use for senescence-pathway analysis, age-related disease genetics, senolytic-target discovery, and centenarian-genetics queries. Distinguishes correlative vs causal evidence (knockout, intervention).

tooluniverse-antibody-engineering · Skill

Therapeutic antibody engineering and optimization, lead-to-clinical-candidate. Covers sequence humanization (germline alignment, framework retention), affinity maturation, developability (aggregation, stability, PTMs), structure modeling (AlphaFold/PDB CDR analysis), immunogenicity prediction, and manufacturing feasibility. Use for biologic-drug optimization, mAb design review, biosimilar engineering, and clinical-precedent comparison.

tooluniverse-binder-discovery · Skill

Discover novel small-molecule binders for protein targets using structure-based and ligand-based screening. Covers druggability assessment, known-ligand mining (ChEMBL, BindingDB), similarity expansion, ADMET filtering, and synthesis feasibility. Use for hit identification, virtual screening, target-to-compounds workflows, and lead-finding before commit-to-medchem.