macs3-peak-calling
Poisson-model peak caller for ChIP-seq/ATAC-seq BAMs. MACS3 callpeak finds enriched regions (TF sites or histone marks) vs input/IgG; outputs BED narrowPeak/broadPeak for motif analysis, annotation, and differential binding. Use narrow peaks for TF ChIP-seq and ATAC-seq; broad for H3K27me3, H3K9me3, and other broad marks.
git clone --depth 1 https://github.com/jaechang-hits/SciAgent-Skills /tmp/macs3-peak-calling && cp -r /tmp/macs3-peak-calling/skills/genomics-bioinformatics/macs3-peak-calling ~/.claude/skills/macs3-peak-callingSKILL.md
# MACS3 — ChIP-seq and ATAC-seq Peak Caller
## Overview
MACS3 (Model-based Analysis of ChIP-seq) identifies regions of significant read enrichment (peaks) from ChIP-seq, ATAC-seq, CUT&RUN, and CUT&TAG experiments. It models the fragment length distribution from paired-end data or estimates it from mono-nucleosomal read shifting in single-end data, then applies a Poisson model to identify fold-enrichment over an input/IgG control. MACS3 produces BED-format narrowPeak (for transcription factors) or broadPeak (for histone marks) files with signal and q-value tracks for visualization in IGV or UCSC Genome Browser.
## When to Use
- Calling transcription factor binding peaks from ChIP-seq experiments (use `--nomodel --extsize 200` or let MACS3 estimate fragment length)
- Identifying open chromatin regions from ATAC-seq experiments (use `--nomodel --shift -100 --extsize 200 -f BAMPE`)
- Calling broad histone modification peaks (H3K27me3, H3K9me3, H3K36me3) with `--broad`
- Generating peak signal tracks (bedGraph/bigWig) for genome browser visualization with `-B --SPMR`
- Performing differential binding analysis: MACS3 peaks as input to DiffBind or DESeq2
- Use **HMMRATAC** (part of MACS3) for nucleosome-resolution ATAC-seq peak calling
- Use **SPP** or **HOMER** as alternatives; MACS3 is the ENCODE-recommended standard
## Prerequisites
- **Python packages**: `macs3` (Python ≥ 3.8)
- **Input**: Sorted BAM files (with index) from ChIP-seq or ATAC-seq alignment (e.g., using STAR or Bowtie2)
- **Optional**: Input/IgG control BAM for background normalization
> **Check before installing**: The tool may already be available in the current environment (e.g., inside a `pixi` / `conda` env). Run `command -v macs3` first and skip the install commands below if it returns a path. When running inside a pixi project, invoke the tool via `pixi run macs3` rather than bare `macs3`.
```bash
# Install with pip or conda
pip install macs3
# or
conda install -c bioconda macs3
# Verify
macs3 --version
# macs3 3.0.2
```
## Quick Start
```bash
# Call peaks for TF ChIP-seq (narrow peaks, with input control)
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
-n sample_tf \
--outdir peaks/ \
-q 0.05
# Output: peaks/sample_tf_peaks.narrowPeak
wc -l peaks/sample_tf_peaks.narrowPeak
```
## Workflow
### Step 1: Prepare Input BAM Files
MACS3 requires sorted, indexed BAM files from genome alignment.
```bash
# Sort and index ChIP and control BAMs (if not already done)
samtools sort -@ 8 chip_raw.bam -o chip.bam
samtools sort -@ 8 input_raw.bam -o input.bam
samtools index chip.bam
samtools index input.bam
# Check read counts
echo "ChIP reads: $(samtools view -c -F 4 chip.bam)"
echo "Input reads: $(samtools view -c -F 4 input.bam)"
```
### Step 2: Call Narrow Peaks (TF ChIP-seq)
Use the default mode for transcription factor binding site identification.
```bash
# TF ChIP-seq with input control
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
-n tf_chip \
--outdir peaks/ \
-q 0.05 \
--keep-dup auto
echo "Peaks called: $(wc -l < peaks/tf_chip_peaks.narrowPeak)"
echo "Summit file: peaks/tf_chip_summits.bed"
# Without input control (less recommended)
macs3 callpeak \
-t chip.bam \
-f BAM \
-g hs \
-n tf_noinput \
--outdir peaks/ \
--nolambda
```
### Step 3: Call Broad Peaks (Histone Marks)
Use `--broad` for spread histone modifications like H3K27me3 or H3K36me3.
```bash
# H3K27me3 broad histone mark
macs3 callpeak \
-t h3k27me3.bam \
-c input.bam \
-f BAM \
-g hs \
-n h3k27me3 \
--outdir peaks/ \
--broad \
--broad-cutoff 0.1 \
-q 0.05
echo "Broad peaks: $(wc -l < peaks/h3k27me3_peaks.broadPeak)"
# H3K4me3 (sharp mark — use narrow peaks)
macs3 callpeak \
-t h3k4me3.bam \
-c input.bam \
-f BAM \
-g hs \
-n h3k4me3 \
--outdir peaks/ \
-q 0.05
```
### Step 4: Call ATAC-seq Peaks
ATAC-seq requires special handling for the Tn5 insertion site.
```bash
# ATAC-seq with paired-end BAM (recommended)
macs3 callpeak \
-t atac.bam \
-f BAMPE \
-g hs \
-n atac_sample \
--outdir peaks/ \
--nomodel \
--nolambda \
-q 0.05 \
--keep-dup all
echo "ATAC peaks: $(wc -l < peaks/atac_sample_peaks.narrowPeak)"
# Single-end ATAC-seq: shift reads to center on Tn5 cut site
macs3 callpeak \
-t atac_se.bam \
-f BAM \
-g hs \
-n atac_se \
--outdir peaks/ \
--nomodel \
--shift -100 \
--extsize 200 \
--keep-dup all
```
### Step 5: Generate Signal Tracks for Visualization
Produce bedGraph and bigWig files for genome browser visualization.
```bash
# Generate bedGraph normalized to million reads (SPMR)
macs3 callpeak \
-t chip.bam \
-c input.bam \
-f BAM \
-g hs \
-n chip_track \
--outdir tracks/ \
-B \
--SPMR \
--keep-dup auto
# Convert bedGraph to bigWig for IGV/UCSC
# Requires bedGraphToBigWig and chrom.sizes
sort -k1,1 -k2,2n tracks/chip_track_treat_pileup.bdg > tracks/chip_sorted.bdg
bedGraphToBigWig tracks/chip_sorted.bdg genome/hg38.chrom.sizes tracks/chip.bw
echo "BigWig track: tracks/chip.bw"
```
### Step 6: Annotate and Analyze Peaks
Parse narrowPeak output and annotate peaks to genomic features.
```python
import pandas as pd
# Load narrowPeak file
# Columns: chrom, start, end, name, score, strand, signalValue, pValue, qValue, peak
cols = ["chrom", "start", "end", "name", "score", "strand",
"signalValue", "pValue", "qValue", "peak"]
peaks = pd.read_csv("peaks/tf_chip_peaks.narrowPeak", sep="\t",
header=None, names=cols)
print(f"Total peaks: {len(peaks)}")
print(f"Peaks on chr1: {(peaks['chrom'] == 'chr1').sum()}")
print(f"Median peak width: {(peaks['end'] - peaks['start']).median():.0f} bp")
print(f"Peaks with q-value < 0.01: {(peaks['qValue'] > 2).sum()}") # -log10(q) > 2
# Filter high-confidence peaks
high_conf =|
Opentrons Protocol API v2 for OT-2/Flex: Python protocols for pipetting, serial dilutions, PCR, plate replication; control thermocycler, heater-shaker, magnetic, temperature modules. Use pylabrobot for multi-vendor.
Interactive visualization with Plotly. 40+ chart types (scatter, line, heatmap, 3D, geographic) with hover, zoom, pan. Two APIs: Plotly Express (DataFrame) and Graph Objects (fine control). For static publication figures use matplotlib; for statistical grammar use seaborn.
Statistical visualization on matplotlib + pandas. Distributions (histplot, kdeplot, violin, box), relational (scatter, line), categorical, regression, correlation heatmaps. Auto aggregation/CIs. Use plotly for interactive; matplotlib for low-level.
Best practices for single-cell RNA-seq cell type annotation including marker-based, reference-based, and automated classification approaches.
Bayesian modeling with PyMC 5: priors, likelihood, NUTS/ADVI sampling, diagnostics (R-hat, ESS), LOO/WAIC comparison, prediction. Hierarchical, logistic, GP variants; predictive checks.
Time-to-event modeling with scikit-survival: Cox PH (elastic net), Random Survival Forests, Boosting, SVMs for censored data. C-index, Brier, time-dependent AUC; Kaplan-Meier, Nelson-Aalen, competing risks. Pipeline/GridSearchCV compatible. Use statsmodels for frequentist, pymc for Bayesian, lifelines for parametric.
>-