ClaudeWave
Skill · 1.4k repo stars · updated today

tooluniverse-epigenomics-chromatin

This skill analyzes histone modifications, chromatin accessibility, and transcription factor binding using public databases including ENCODE, Roadmap Epigenomics, and ChIP-Atlas. Use it to identify regulatory elements at specific genomic regions, determine which transcription factors bind particular loci, assess how variants affect regulatory landscapes, or map element-to-gene relationships through eQTL data. It provides tools for chromatin-state classification, cCRE annotation, and multi-layer evidence synthesis for interpreting regulatory variant effects.

Install in Claude Code
git clone --depth 1 https://github.com/mims-harvard/ToolUniverse /tmp/tooluniverse-epigenomics-chromatin && cp -r /tmp/tooluniverse-epigenomics-chromatin/plugin/skills/tooluniverse-epigenomics-chromatin ~/.claude/skills/tooluniverse-epigenomics-chromatin
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# Epigenomics and Chromatin Accessibility Research

## NOT for (use other skills instead)

- Methylation array data processing (CpG beta values, differential methylation) -> Use `tooluniverse-epigenomics`
- RNA-seq differential expression -> Use `tooluniverse-rnaseq-deseq2`
- GWAS variant interpretation -> Use `tooluniverse-gwas-snp-interpretation`
- Variant functional annotation from VCF -> Use `tooluniverse-variant-analysis`

---

## Reasoning: Classify the Question First

Before calling any tool, identify which question type you're answering. Each maps to a different tool set.

**(a) Which regulatory elements exist at a locus?**
Use UCSC_get_encode_cCREs (region-based) or SCREEN_get_regulatory_elements (gene-based). Then check ENCODE_get_chromatin_state for ChromHMM annotation and ENCODE_search_chromatin_accessibility for ATAC-seq evidence.

**(b) Which TFs bind there?**
Use ReMap_get_transcription_factor_binding for ChIP-seq experiments. Use jaspar_search_matrices to retrieve binding motifs and check whether a sequence variant at the locus disrupts a known motif.

**(c) How does a variant affect regulation?**
Use RegulomeDB_query_variant for a scored summary. Then build multi-layer evidence: UCSC_get_encode_cCREs (is the variant in a cCRE?), GTEx_get_single_tissue_eqtls (is it an eQTL?), jaspar_search_matrices (does it disrupt a TF motif?). No single layer is sufficient — see the variant reasoning section below.

**(d) What genes are regulated by an element?**
Use GTEx_get_single_tissue_eqtls or GTEx_query_eqtl to find genes whose expression is associated with variants in the element. Use SCREEN_get_regulatory_elements with element_type="PLS"/"pELS"/"dELS" to classify element-to-promoter relationships.
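The four question types above can be sketched as a small routing table. This is only an illustration: the tool names are copied from the text, while the dictionary keys and the `tools_for` helper are hypothetical and not part of ToolUniverse.

```python
# Hypothetical routing table. Tool names come from the section above;
# the question-type keys and the helper function are illustrative only.
TOOLS_BY_QUESTION = {
    "elements_at_locus": [
        "UCSC_get_encode_cCREs",
        "SCREEN_get_regulatory_elements",
        "ENCODE_get_chromatin_state",
        "ENCODE_search_chromatin_accessibility",
    ],
    "tf_binding": [
        "ReMap_get_transcription_factor_binding",
        "jaspar_search_matrices",
    ],
    "variant_effect": [
        "RegulomeDB_query_variant",
        "UCSC_get_encode_cCREs",
        "GTEx_get_single_tissue_eqtls",
        "jaspar_search_matrices",
    ],
    "element_target_genes": [
        "GTEx_get_single_tissue_eqtls",
        "GTEx_query_eqtl",
        "SCREEN_get_regulatory_elements",
    ],
}


def tools_for(question_type: str) -> list[str]:
    """Return the ordered tool list for a question type, or raise ValueError."""
    try:
        return TOOLS_BY_QUESTION[question_type]
    except KeyError:
        known = ", ".join(sorted(TOOLS_BY_QUESTION))
        raise ValueError(f"unknown question type {question_type!r}; expected one of: {known}") from None
```

Classifying first and then looking up the tool list keeps the decision explicit, so an unrecognized question fails loudly instead of triggering an arbitrary tool call.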

---

## Reasoning: Histone Marks

Use histone mark identity to guide tool queries and interpret results before fetching data.

- **H3K4me3** = active promoter. If present without H3K27ac, promoter may be active but not hyperacetylated.
- **H3K27ac** = active enhancer or promoter. Strong signal = regulatory element is on.
- **H3K4me1** = poised or active enhancer. Needs H3K27ac to confirm activity; H3K4me1 alone = poised.
- **H3K27me3** = Polycomb repression. Gene is silenced by PRC2.
- **H3K9me3** = constitutive heterochromatin. Region is structurally silenced.
- **H3K36me3** = transcribed gene body. Confirms active elongation.

**Bivalent promoter logic**: If you observe H3K4me3 + H3K27me3 together at the same locus, the promoter is bivalent — poised but not active. This is common in stem cells and developmentally regulated genes. Do not report such genes as "actively transcribed." Use GTEx_get_expression_summary to check if the gene is actually expressed in the tissue of interest.

**Inference rule**: If a user asks about a mark you haven't queried yet, ask: does the mark you *have* found already answer the question? H3K4me3 without H3K27me3 predicts an active promoter; you may not need to also query H3K36me3 unless confirming elongation specifically.
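The mark-interpretation rules in this section can be written down as a minimal lookup sketch. The function name, the return strings, and the order in which the rules are checked are assumptions made for illustration; only the mark meanings come from the list above.

```python
def interpret_marks(marks: set[str]) -> str:
    """Map a set of observed histone marks to a coarse chromatin interpretation.

    The precedence of the checks is an assumption: combinations are tested
    before single marks so that bivalency is never reported as "active".
    """
    marks = set(marks)
    if {"H3K4me3", "H3K27me3"} <= marks:
        return "bivalent promoter (poised, not active)"
    if "H3K27me3" in marks:
        return "Polycomb-repressed"
    if "H3K9me3" in marks:
        return "constitutive heterochromatin"
    if "H3K4me3" in marks:
        return "active promoter"
    if "H3K4me1" in marks:
        # H3K4me1 needs H3K27ac to confirm enhancer activity.
        return "active enhancer" if "H3K27ac" in marks else "poised enhancer"
    if "H3K27ac" in marks:
        return "active enhancer or promoter"
    if "H3K36me3" in marks:
        return "transcribed gene body"
    return "no informative mark"
```

For example, `interpret_marks({"H3K4me1"})` gives `"poised enhancer"`, while adding `"H3K27ac"` to the set gives `"active enhancer"`. A bivalent result is a cue to check expression before describing the gene as transcribed.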

---

## Reasoning: eQTL Interpretation

An eQTL means variant X is statistically associated with expression of gene Y in tissue T. Before reporting eQTL results, apply this chain of reasoning:

1. **Association ≠ causation.** The variant may be in LD with the causal variant. Report effect size (NES) and p-value, not causality.
2. **Check tissue specificity.** Use GTEx_get_multi_tissue_eqtls to see whether the effect is shared across tissues (m-value near 1.0 in many tissues) or tissue-specific (m-value near 1.0 in only one tissue). Tissue-specific eQTLs are stronger candidates for cell-type-specific regulation.
3. **Cross-reference with chromatin.** Is the eQTL variant inside a cCRE? Use UCSC_get_encode_cCREs on the variant's coordinates. If yes, the variant likely acts through a regulatory element.
4. **Check TF motif disruption.** Use jaspar_search_matrices to find motifs overlapping the eQTL locus. If the variant alleles differ in motif score, it is a candidate causal variant.
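5. **Effect direction matters.** GTEx reports NES for the alternative allele relative to the reference allele. Positive NES = alternative allele is associated with higher expression. Negative NES = alternative allele is associated with lower expression.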
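The tissue-specificity check in step 2 can be sketched as a small classifier over per-tissue m-values. The function name, the 0.9 cutoff for "near 1.0", the 50% cutoff for "many tissues", and the example values are all assumptions for illustration, not GTEx or ToolUniverse defaults.

```python
def classify_tissue_sharing(
    m_values: dict[str, float],
    threshold: float = 0.9,          # assumed cutoff for "m-value near 1.0"
    shared_min_fraction: float = 0.5,  # assumed cutoff for "many tissues"
) -> tuple[str, list[str]]:
    """Label an eQTL as tissue-specific or shared from per-tissue m-values."""
    active = sorted(t for t, m in m_values.items() if m >= threshold)
    if not active:
        return "no confident effect", active
    if len(active) == 1:
        return "tissue-specific", active
    if len(active) / len(m_values) >= shared_min_fraction:
        return "shared", active
    return "intermediate", active


# Made-up m-values for illustration only.
example = {"Liver": 0.98, "Lung": 0.05, "Whole_Blood": 0.10}
label, tissues = classify_tissue_sharing(example)
```

Here `label` is `"tissue-specific"` and `tissues` is `["Liver"]`, which by the reasoning above makes the variant a stronger candidate for cell-type-specific regulation. The label describes the association pattern only; it says nothing about causality.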

---

## Reasoning: Variant Regulatory Impact

To assess a non-coding variant's regulatory impact, build evidence from multiple independent layers. No single layer is sufficient.

**Layer 1 — RegulomeDB score**: A top rank (1a–2b) means convergent evidence from TF binding + DNase, with eQTL support as well for ranks 1a–1f. Ranks 4–7 mean weak support. Use as a triage filter.

**Layer 2 — Regulatory element overlap**: Query UCSC_get_encode_cCREs at the variant's coordinates. If the variant falls in a cCRE (especially PLS or pELS), it is in a functional context.

**Layer 3 — eQTL evidence**: Query GTEx_get_single_tissue_eqtls for nearby genes. If the variant is a significant eQTL, the association supports regulatory function.

**Layer 4 — TFBS disruption**: Query jaspar_search_matrices for TFs with motifs at the locus. If the variant changes a high-information-content position in a motif, it is a strong functional candidate.

**Synthesis rule**: Report each layer separately. Convergence across 3+ layers = high-confidence regulatory variant. A single layer (e.g., eQTL alone) warrants caution.
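The synthesis rule can be sketched as a layer count. The `VariantEvidence` container, its field names, and the "moderate" label for exactly two layers are assumptions for illustration; the text only specifies that 3+ layers means high confidence and that a single layer warrants caution.

```python
from dataclasses import dataclass
from typing import Optional

# RegulomeDB ranks treated as supportive, following the 1a-2b range in Layer 1.
STRONG_RANKS = {"1a", "1b", "1c", "1d", "1e", "1f", "2a", "2b"}


@dataclass
class VariantEvidence:
    """Hypothetical container for the four evidence layers of one variant."""
    regulomedb_rank: Optional[str]  # e.g. "1a", "2b", "5"; None if not queried
    in_ccre: bool                   # Layer 2: overlaps an ENCODE cCRE
    significant_eqtl: bool          # Layer 3: significant GTEx eQTL
    disrupts_motif: bool            # Layer 4: changes a TF motif position


def supporting_layers(ev: VariantEvidence) -> list[str]:
    """List the layers that support regulatory function, reported separately."""
    layers = []
    if ev.regulomedb_rank in STRONG_RANKS:
        layers.append("RegulomeDB")
    if ev.in_ccre:
        layers.append("cCRE overlap")
    if ev.significant_eqtl:
        layers.append("eQTL")
    if ev.disrupts_motif:
        layers.append("TFBS disruption")
    return layers


def confidence(ev: VariantEvidence) -> str:
    """Apply the synthesis rule: 3+ layers = high, a single layer = caution."""
    n = len(supporting_layers(ev))
    if n >= 3:
        return "high"
    if n == 2:
        return "moderate"  # assumed label; the text does not name this case
    if n == 1:
        return "caution"
    return "no support"
```

Returning the list of layers alongside the confidence label keeps each line of evidence visible in the report, as the rule requires, rather than collapsing it into a single score.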

---

## Phase 0: Disambiguation

**MyGene_query_genes**: `query` (string). Converts gene symbols to Ensembl IDs and coordinates. Filter results by `symbol == '<GENE>'` — first hit may not match.

**ensembl_lookup_gene**: `gene_id` (Ensembl ID), `species` (REQUIRED, "homo_sapiens"). Returns chr/start/end.
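The symbol filter recommended for MyGene_query_genes can be sketched as follows. The shape of the hits (a list of dictionaries with a `symbol` key) is an assumption about the tool's response, and `pick_exact_symbol` and the example records are hypothetical.

```python
from typing import Optional


def pick_exact_symbol(hits: list[dict], symbol: str) -> Optional[dict]:
    """Return the first hit whose 'symbol' equals `symbol` exactly, else None."""
    for hit in hits:
        if hit.get("symbol") == symbol:
            return hit
    return None


# Made-up hits for illustration; a real response may carry more fields.
hits = [
    {"symbol": "GENE1-AS1", "id": "hit-0"},
    {"symbol": "GENE1", "id": "hit-1"},
]
match = pick_exact_symbol(hits, "GENE1")
```

In this example the first hit is an antisense transcript, so taking `hits[0]` would resolve the wrong gene; the exact-match filter selects the second record. The comparison is case-sensitive, mirroring the `symbol == '<GENE>'` check described above.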

Key format notes:
- GTEx requires versioned GENCODE IDs: `ENSG00000012048.20`
- RegulomeDB takes rsIDs: `rs4994`
- GTEx variant IDs: `chr17_43705621_T_C_b38`
- UCSC cCRE regions: `chrom="chr17", start=7668421, end=7687490`
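The identifier formats above can be checked before any tool call. The sketch below uses regular expressions inferred from the examples in the list; the patterns and helper names are assumptions and may be stricter or looser than what the tools actually accept.

```python
import re

# Patterns inferred from the examples above (assumptions, not official specs).
GTEX_VARIANT_RE = re.compile(
    r"^(?P<chrom>chr(?:[1-9]|1\d|2[0-2]|X|Y|M))_(?P<pos>\d+)"
    r"_(?P<ref>[ACGT]+)_(?P<alt>[ACGT]+)_(?P<build>b38)$"
)
GENCODE_VERSIONED_RE = re.compile(r"^ENSG\d{11}\.\d+$")
RSID_RE = re.compile(r"^rs\d+$")


def parse_gtex_variant(variant_id: str) -> dict:
    """Split a GTEx variant ID into its parts, or raise ValueError."""
    m = GTEX_VARIANT_RE.match(variant_id)
    if m is None:
        raise ValueError(f"not a GTEx b38 variant ID: {variant_id!r}")
    parts = m.groupdict()
    parts["pos"] = int(parts["pos"])
    return parts


def is_versioned_gencode_id(gene_id: str) -> bool:
    """True for IDs like ENSG00000012048.20; False when the version is missing."""
    return GENCODE_VERSIONED_RE.match(gene_id) is not None


def is_rsid(identifier: str) -> bool:
    """True for dbSNP-style identifiers like rs4994."""
    return RSID_RE.match(identifier) is not None
```

Validating up front turns a silent empty result (for example, an unversioned Ensembl ID passed to GTEx) into an immediate, explainable error.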

---

## Phase 1: Histone Modification & ChIP-seq

**ENCODE_search_histone_experiments**: `target` (histone mark), `cell_type` (or `tissue` alias), `biosample_term_name` (most explicit ENCODE ontology name), `limit`.

ENCODE anatomy term notes: "breast" →

setup-tooluniverse · Skill

Install and configure ToolUniverse for any use case — MCP server (chat-based), CLI (command line with 9 subcommands), or Python SDK (Coding API with 3 calling patterns). Covers uv/uvx setup, MCP configuration for 12+ AI clients (Cursor, Claude Desktop, Windsurf, VS Code, Codex, Gemini CLI, Trae, Cline, etc.), full CLI reference (tu list/grep/find/info/run/test/status/build/serve), Coding API quickstart, agentic tools, code executor, API key walkthrough, skill installation, and upgrading. Use when user asks how to set up ToolUniverse, which access mode to use (MCP vs CLI vs SDK), configuring MCP servers, using the CLI, troubleshooting installation, upgrading, or mentions installing ToolUniverse or setting up scientific tools. Also triggers for "how do I use ToolUniverse", "what's the best way to access tools", "command line", "tu command", "coding API", "tu build".

tooluniverse-acmg-variant-classification · Skill

Systematic ACMG/AMP germline variant classification with all 28 criteria (PVS1, PS1-4, PM1-6, PP1-5, BA1, BS1-4, BP1-7) for clinical significance. Produces 5-tier verdict (Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign) with cited evidence per criterion. Use for variant interpretation, VUS resolution, and pathogenicity assessment. Combines ClinVar, gnomAD, computational predictors, and gene-mechanism context.

tooluniverse-admet-prediction · Skill

Comprehensive ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) profiling for drug candidates. Integrates ADMET-AI predictions, SwissADME drug-likeness, PubChemTox experimental toxicity, ChEMBL clinical data, Lipinski rule-of-five, and CYP interaction data. Use for drug-likeness assessment, BBB penetration, bioavailability, hepatotoxicity prediction, ADME/PK profiling, or screening compound libraries before lab testing.

tooluniverse-adverse-event-detection · Skill

Detect and analyze adverse drug event signals using FDA FAERS reports, drug labels, and disproportionality statistics (PRR, ROR, IC). Generates quantitative safety signal scores (0-100) with evidence grading. Use for post-market surveillance, pharmacovigilance, drug safety assessment, regulatory submissions, and detecting rare AE signals not visible in clinical trials.

tooluniverse-adverse-outcome-pathway · Skill

Map environmental and industrial chemicals to adverse outcome pathways (AOPs) — molecular initiating event to organ-level toxicity. Uses AOPWiki, GHS classification, IARC carcinogen status, and LD50 data. Use for environmental/industrial chemical risk assessment, regulatory-grade hazard characterization, and AOP stressor mapping. Distinct from drug-safety analysis (use tooluniverse-pharmacovigilance for drugs).

tooluniverse-aging-senescence · Skill

Aging biology, cellular senescence, and longevity research. Covers senescence markers (p16/CDKN2A, SASP, SA-beta-gal), aging hallmarks, senolytic drug discovery (dasatinib+quercetin, fisetin, navitoclax), epigenetic clocks, telomere biology, and longevity GWAS. Use for senescence-pathway analysis, age-related disease genetics, senolytic-target discovery, and centenarian-genetics queries. Distinguishes correlative vs causal evidence (knockout, intervention).

tooluniverse-antibody-engineering · Skill

Therapeutic antibody engineering and optimization, lead-to-clinical-candidate. Covers sequence humanization (germline alignment, framework retention), affinity maturation, developability (aggregation, stability, PTMs), structure modeling (AlphaFold/PDB CDR analysis), immunogenicity prediction, and manufacturing feasibility. Use for biologic-drug optimization, mAb design review, biosimilar engineering, and clinical-precedent comparison.

tooluniverse-binder-discovery · Skill

Discover novel small-molecule binders for protein targets using structure-based and ligand-based screening. Covers druggability assessment, known-ligand mining (ChEMBL, BindingDB), similarity expansion, ADMET filtering, and synthesis feasibility. Use for hit identification, virtual screening, target-to-compounds workflows, and lead-finding before commit-to-medchem.