bioinformatics
This Claude Code skill performs comprehensive bioinformatics analyses including pathway enrichment, gene ontology annotation, protein-protein interaction network construction, multi-omics data integration, and biological sequence database queries. Use this skill when users need functional annotation of gene sets, pathway analysis for omics datasets, single-cell RNA-seq analysis with cell type identification, or integration of multiple omics layers to identify convergent biological mechanisms.
git clone --depth 1 https://github.com/beita6969/ScienceClaw /tmp/bioinformatics && cp -r /tmp/bioinformatics/skills/bioinformatics ~/.claude/skills/bioinformatics

SKILL.md
## When to Trigger

Activate this skill when the user mentions:

- Pathway analysis, KEGG, Reactome, WikiPathways
- Gene Ontology (GO) enrichment, biological process, molecular function
- Protein-protein interaction (PPI) networks, STRING, BioGRID
- Multi-omics integration (transcriptomics + proteomics + metabolomics)
- Gene set enrichment analysis (GSEA), over-representation analysis (ORA)
- Sequence databases, UniProt, NCBI, Ensembl queries
- Single-cell RNA-seq analysis, clustering, trajectory inference

## Step-by-Step Methodology

1. **Data preparation** - Standardize gene/protein identifiers (convert to Entrez, Ensembl, or UniProt IDs as needed). Remove duplicates and handle ambiguous mappings. Verify organism and genome build.
2. **Differential analysis** - For transcriptomics: DESeq2 or edgeR (count data), limma-voom (normalized). For proteomics: limma with appropriate normalization. Apply multiple testing correction (BH-FDR). Set thresholds (|log2FC| > 1, padj < 0.05 as defaults, adjustable).
3. **Functional enrichment** - Perform GO enrichment (BP, MF, CC) using clusterProfiler, g:Profiler, or DAVID. Run KEGG/Reactome pathway enrichment. Use GSEA for ranked gene lists (no arbitrary cutoff). Report enriched terms with gene ratio, p-value, adjusted p-value, and gene members.
4. **Network analysis** - Build PPI networks from STRING (confidence > 0.7 for high confidence). Identify hub genes (degree centrality), bottleneck nodes (betweenness centrality), and functional modules (MCODE, Louvain clustering). Overlay expression data on network.
5. **Multi-omics integration** - For paired omics: correlation analysis, canonical correlation (CCA), or MOFA/DIABLO. Map features across omics layers using shared identifiers or known biological connections. Identify convergent pathways.
6. **Single-cell analysis** - QC filtering (genes/cell, UMI/cell, mitochondrial %). Normalization (scran, SCTransform). Dimensionality reduction (PCA, UMAP). Clustering (Leiden, Louvain). Cell type annotation (SingleR, scType, marker genes). Trajectory inference (Monocle3, Slingshot).
7. **Visualization** - Generate volcano plots, heatmaps (with hierarchical clustering), dot plots (enrichment), network diagrams, UMAP/tSNE plots (single-cell), and circos plots (multi-omics).

## Key Databases and Tools

- **Gene Ontology (GO)** - Functional annotations
- **KEGG / Reactome / WikiPathways** - Pathway databases
- **STRING / BioGRID / IntAct** - PPI databases
- **Ensembl / NCBI / UniProt** - Sequence and annotation databases
- **clusterProfiler / g:Profiler / DAVID** - Enrichment tools
- **Seurat / Scanpy** - Single-cell analysis frameworks
- **Cytoscape** - Network visualization

## Output Format

- Enrichment results as tables: term, description, gene ratio, p-value, padj, gene list.
- Volcano plots with labeled significant genes and fold-change thresholds.
- Network figures with node coloring (expression), size (degree), and module highlighting.
- UMAP/tSNE plots with cluster labels and cell type annotations.
- Heatmaps with dendrograms and annotation bars.

## Quality Checklist

- [ ] Gene ID mapping verified (conversion losses reported)
- [ ] Background gene set appropriate for enrichment analysis
- [ ] Multiple testing correction applied (BH-FDR or equivalent)
- [ ] Redundant GO terms handled (semantic similarity, REVIGO)
- [ ] Network confidence threshold specified and justified
- [ ] Single-cell QC thresholds documented
- [ ] Batch effects assessed and corrected if present
- [ ] Results cross-validated across databases or methods
- [ ] Biological interpretation grounded in literature
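
## Example Sketch: ORA with BH-FDR

The over-representation analysis and BH-FDR correction named in the methodology can be illustrated with a minimal, standard-library-only sketch. It is not how clusterProfiler, g:Profiler, or DAVID are implemented. It only shows the underlying one-sided hypergeometric test and the Benjamini-Hochberg adjustment. The gene identifiers (`G1` to `G20`), term names (`TERM_A`, `TERM_B`), and function names (`hypergeom_sf`, `bh_adjust`, `ora`) are hypothetical and chosen for illustration.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric: population M, n annotated genes, N drawn."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def ora(query_genes, gene_sets, background):
    """Over-representation analysis of query_genes against each gene set."""
    background = set(background)
    query = set(query_genes) & background  # drop genes outside the background
    rows = []
    for term, members in gene_sets.items():
        members = set(members) & background
        if not members:
            continue  # term has no genes in this background
        overlap = query & members
        pvalue = hypergeom_sf(len(overlap), len(background), len(members), len(query))
        rows.append({
            "term": term,
            "gene_ratio": f"{len(overlap)}/{len(query)}",
            "pvalue": pvalue,
            "genes": sorted(overlap),
        })
    for row, padj in zip(rows, bh_adjust([r["pvalue"] for r in rows])):
        row["padj"] = padj
    return sorted(rows, key=lambda r: r["pvalue"])


# Toy data (hypothetical): 20 background genes, two gene sets, five query genes.
background = [f"G{i}" for i in range(1, 21)]
gene_sets = {
    "TERM_A": ["G1", "G2", "G3", "G4", "G5"],
    "TERM_B": [f"G{i}" for i in range(6, 16)],
}
query = ["G1", "G2", "G3", "G4", "G16"]

results = ora(query, gene_sets, background)
for row in results:
    print(row["term"], row["gene_ratio"], row["pvalue"], row["padj"], row["genes"])
```

The background is passed explicitly because the choice of background gene set changes every p-value, which is why the checklist asks for it to be verified. For real analyses, an established enrichment tool is the safer choice, since it also handles identifier mapping and annotation versions.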