bio-chipseq-peak-calling
This Claude Code skill provides command-line templates for ChIP-seq peak calling using MACS3 or MACS2, enabling users to identify transcription factor binding sites as narrow peaks or histone modification domains as broad peaks from aligned BAM files with optional input control normalization. Use this skill when processing ChIP-seq alignment data to generate narrowPeak or broadPeak BED format outputs for downstream genomic analysis.
git clone --depth 1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills /tmp/bio-chipseq-peak-calling && cp -r /tmp/bio-chipseq-peak-calling/skills/bio-chipseq-peak-calling ~/.claude/skills/bio-chipseq-peak-calling
SKILL.md
## Version Compatibility
Reference examples tested with: MACS2 2.2+, MACS3 3.0+
Before using code patterns, verify installed versions match. If versions differ:
- CLI: `<tool> --version` then `<tool> --help` to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
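The version check above can be scripted. The following is a minimal sketch, assuming a hypothetical helper named `pick_macs` (it is not part of MACS) that prefers `macs3` and falls back to `macs2` when only the legacy binary is on `PATH`.

```bash
# Hypothetical helper: print the name of the first MACS binary found on PATH.
# Prefers macs3, falls back to macs2; returns non-zero if neither is installed.
pick_macs() {
    local tool
    for tool in macs3 macs2; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    echo "neither macs3 nor macs2 found on PATH" >&2
    return 1
}
```

A typical call would be `MACS=$(pick_macs) && "$MACS" --version`, followed by `"$MACS" callpeak --help` to confirm the flags used below.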
# Peak Calling with MACS3
**"Call peaks from my ChIP-seq data"** → Identify significantly enriched regions (narrow peaks for TFs, broad peaks for histone marks) by comparing IP signal to input control.
- CLI: `macs3 callpeak -t chip.bam -c input.bam -f BAM -g hs -n sample`
MACS3 is the actively developed successor to MACS2. Commands are identical except the binary name. MACS2 is in maintenance mode.
## Basic Peak Calling
**Goal:** Call enriched regions from ChIP-seq alignments with input control normalization.
**Approach:** Compare the treatment BAM signal against the input control using the local Poisson model in MACS3.
```bash
# Call peaks with input control (recommended)
macs3 callpeak -t chip.bam -c input.bam -f BAM -g hs -n sample --outdir peaks/
# For MACS2 (legacy), replace 'macs3' with 'macs2' - syntax is identical
```
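After a narrow-mode run, MACS writes `<name>_peaks.narrowPeak`, `<name>_peaks.xls`, and `<name>_summits.bed` into the output directory. A minimal sketch of a sanity check follows; the function name `check_narrow_outputs` is hypothetical, and the check only confirms that each file exists and is non-empty.

```bash
# Hypothetical helper: verify that the three narrow-mode output files exist
# and are non-empty. Usage: check_narrow_outputs <outdir> <name>
check_narrow_outputs() {
    local outdir=$1 name=$2 suffix missing=0
    for suffix in _peaks.narrowPeak _peaks.xls _summits.bed; do
        if [ ! -s "$outdir/$name$suffix" ]; then
            echo "missing or empty: $outdir/$name$suffix" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the command above this would be called as `check_narrow_outputs peaks sample`.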
## Without Input Control
**Goal:** Call peaks without a matched input/control sample.
**Approach:** Use MACS3 with genomic background estimation only (less accurate than with control).
```bash
# Not recommended, but possible
macs3 callpeak -t chip.bam -f BAM -g hs -n sample --outdir peaks/
```
## Narrow Peaks (TF, H3K4me3, H3K27ac)
**Goal:** Call sharp, well-defined peaks typical of transcription factors and active histone marks.
**Approach:** Use default narrow peak mode with q-value filtering and genome size correction.
```bash
# -g: hs=human, mm=mouse, ce=worm, dm=fly
# -q: q-value threshold
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
-n sample_narrow \
--outdir peaks/ \
-q 0.05
```
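If a stricter threshold is wanted later, the narrowPeak file can be filtered without re-running the caller. Column 9 of the narrowPeak format stores the q-value as -log10(q). The sketch below assumes a hypothetical function name, `filter_by_qvalue`, and uses only `awk`.

```bash
# Hypothetical helper: keep narrowPeak records with q-value <= qmax.
# Column 9 holds -log10(q), so q <= qmax is equivalent to col9 >= -log10(qmax).
# Usage: filter_by_qvalue <peaks.narrowPeak> <qmax>
filter_by_qvalue() {
    local peaks=$1 qmax=$2
    awk -v q="$qmax" 'BEGIN { FS = OFS = "\t"; cut = -log(q) / log(10) }
         $9 >= cut' "$peaks"
}
```

For example, `filter_by_qvalue peaks/sample_narrow_peaks.narrowPeak 0.01 > peaks/sample_narrow.q01.narrowPeak` would keep only peaks at q ≤ 0.01.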
## Broad Peaks (H3K36me3, H3K27me3, H3K9me3)
**Goal:** Call diffuse, broad enrichment domains typical of repressive or elongation-associated histone marks.
**Approach:** Enable broad peak mode which links nearby enriched regions into broader domains.
```bash
# --broad: broad peak mode
# --broad-cutoff: q-value cutoff for broad regions
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
-n sample_broad \
--outdir peaks/ \
--broad \
--broad-cutoff 0.1
```
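One quick way to see whether broad mode produced wider domains than a narrow run is to compare mean region widths. This is a minimal sketch that relies only on the BED-style start and end columns (2 and 3) shared by narrowPeak and broadPeak files; the function name `mean_peak_width` is hypothetical.

```bash
# Hypothetical helper: print the mean width (end - start) of regions in a
# BED-like peak file, or "NA" if the file has no records.
# Usage: mean_peak_width <peaks file>
mean_peak_width() {
    awk 'BEGIN { FS = "\t" }
         { total += $3 - $2; n++ }
         END { if (n > 0) printf "%.1f\n", total / n; else print "NA" }' "$1"
}
```

Running it on `peaks/sample_broad_peaks.broadPeak` and on a narrow-mode result from the same sample gives a rough check that the mode matches the mark.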
## Paired-End Data
**Goal:** Call peaks from paired-end sequencing using actual fragment sizes instead of modeled estimates.
**Approach:** Use BAMPE format so MACS3 calculates fragment size from mate pairs directly.
```bash
# -f BAMPE: paired-end BAM; fragment size is taken from the mate pairs
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAMPE \
-g hs \
-n sample_pe \
--outdir peaks/
```
## Multiple Replicates
**Goal:** Call peaks from multiple biological replicates pooled together for increased statistical power.
**Approach:** Provide all replicate BAMs to MACS3, which internally pools reads before peak calling.
```bash
# Pool replicates (MACS3 handles internally)
macs3 callpeak \
-t rep1.bam rep2.bam rep3.bam \
-c input.bam \
-f BAM \
-g hs \
-n pooled \
--outdir peaks/
```
## Custom Genome Size
**Goal:** Call peaks for non-model organisms without a built-in genome size shortcut.
**Approach:** Provide the effective genome size as a numeric value instead of a species abbreviation.
```bash
# For non-model organisms or custom genomes
# -g: effective genome size in bp
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g 2.7e9 \
-n sample \
--outdir peaks/
```
## Common Genome Sizes
| Genome | Flag | Effective Size |
|--------|------|----------------|
| Human | hs | 2.7e9 |
| Mouse | mm | 1.87e9 |
| C. elegans | ce | 9e7 |
| D. melanogaster | dm | 1.2e8 |
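In a script that handles several species, the table above can be kept as a small lookup. The sketch below simply encodes the four rows of the table; the function name `effective_genome_size` is hypothetical, and other organisms still need a numeric value worked out separately.

```bash
# Hypothetical helper: map a species shortcut to the effective genome size
# listed in the table above. Returns non-zero for unknown shortcuts.
effective_genome_size() {
    case "$1" in
        hs) echo 2.7e9 ;;
        mm) echo 1.87e9 ;;
        ce) echo 9e7 ;;
        dm) echo 1.2e8 ;;
        *)  echo "unknown shortcut: $1" >&2; return 1 ;;
    esac
}
```

The result can be passed straight to `-g`, for example `-g "$(effective_genome_size mm)"`.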
## Fixed Fragment Size
**Goal:** Call peaks when fragment size modeling fails or a specific extension size is needed.
**Approach:** Bypass model building and specify a fixed read extension size manually.
```bash
# If modeling fails or for ATAC-seq
# --nomodel: skip model building
# --extsize: fixed extension size in bp
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
--nomodel \
--extsize 200 \
-n sample \
--outdir peaks/
```
## Generate Signal Tracks
**Goal:** Produce normalized signal tracks for genome browser visualization alongside peak calls.
**Approach:** Enable bedGraph output with signal-per-million-reads normalization, then convert to bigWig.
```bash
# Generate bedGraph and bigWig files
# -B: write bedGraph pileup tracks
# --SPMR: scale signal per million reads
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
-n sample \
--outdir peaks/ \
-B \
--SPMR

# Convert to bigWig (requires UCSC tools)
# LC_ALL=C gives the byte-order chromosome sort that bedGraphToBigWig expects
LC_ALL=C sort -k1,1 -k2,2n peaks/sample_treat_pileup.bdg > peaks/sample.sorted.bdg
bedGraphToBigWig peaks/sample.sorted.bdg chrom.sizes peaks/sample.bw
```
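The `chrom.sizes` file used above is not produced by MACS3. One common way to build it, sketched here under the assumption that the reference FASTA has already been indexed with `samtools faidx`, is to take the first two columns (sequence name and length) of the `.fai` index. The function name `fai_to_chrom_sizes` and the file name `genome.fa.fai` are illustrative.

```bash
# Hypothetical helper: derive a two-column chrom.sizes table (name, length)
# from a samtools faidx index file.
# Usage: fai_to_chrom_sizes <genome.fa.fai> > chrom.sizes
fai_to_chrom_sizes() {
    cut -f1,2 "$1"
}
```

For example: `fai_to_chrom_sizes genome.fa.fai > chrom.sizes`. The chromosome names must match those in the BAM files used for peak calling.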
## Fixed Background Lambda for Broad Marks
**Goal:** Call very broad domains against a fixed genome-wide background instead of the dynamic local lambda.
**Approach:** Use `--nolambda` so MACS3 applies the fixed background lambda to every region rather than estimating a local lambda around each candidate peak.
```bash
# Sometimes used for very broad marks
# --nolambda: use the fixed background lambda; skip the dynamic local lambda
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
--broad \
--nolambda \
-n sample \
--outdir peaks/
```
## Cutoff Analysis
**Goal:** Evaluate how different significance thresholds affect the number of called peaks.
**Approach:** Run MACS3 cutoff analysis mode to generate a table of peak counts at various q-value cutoffs.
```bash
# Test different q-value cutoffs
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
--cutoff-an
```