Skip to main content
ClaudeWave
Skill2.7k estrellas del repoactualizado 2mo ago

bio-bedgraph-handling

This skill provides methods for creating, manipulating, and converting bedGraph files used in genome browser visualization. It covers bedGraph format specifications, generation from BAM files using bedtools genomecov with options for strand-specificity and normalization, sorting requirements, and conversion to bigWig format using UCSC tools. Use this when processing coverage and signal tracks from ChIP-seq, ATAC-seq, or RNA-seq experiments for genome visualization.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills /tmp/bio-bedgraph-handling && cp -r /tmp/bio-bedgraph-handling/skills/bio-bedgraph-handling ~/.claude/skills/bio-bedgraph-handling
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA

-->

---
name: bio-bedgraph-handling
description: Create, manipulate, and convert bedGraph files for genome browser visualization. Covers bedGraph format, conversion to/from bigWig, normalization, and signal processing. Use when handling coverage and signal tracks from ChIP-seq, ATAC-seq, or RNA-seq.
tool_type: mixed
primary_tool: pyBigWig
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
  - read_file
  - run_shell_command
---

# bedGraph Handling

bedGraph is a text format for displaying continuous-valued data on genome browsers. Common for coverage, signal intensity, and scores.

## bedGraph Format

```
track type=bedGraph name="Sample" description="Coverage"
chr1    0       100     1.5
chr1    100     200     2.3
chr1    200     300     0.8
```

Four columns: chrom, start, end, value (0-based, half-open)

## Create bedGraph from BAM

### Using bedtools genomecov

```bash
bedtools genomecov -ibam sample.bam -bg > sample.bedgraph
bedtools genomecov -ibam sample.bam -bg -split > sample.bedgraph
bedtools genomecov -ibam sample.bam -bg -scale 1.5 > sample.scaled.bedgraph
```

### Strand-Specific

```bash
bedtools genomecov -ibam sample.bam -bg -strand + > sample.plus.bedgraph
bedtools genomecov -ibam sample.bam -bg -strand - > sample.minus.bedgraph
```

### 5' End Coverage (ChIP-seq)

```bash
bedtools genomecov -ibam sample.bam -bg -5 > sample.5prime.bedgraph
```

### Normalize by Library Size (CPM)

```bash
total_reads=$(samtools view -c -F 260 sample.bam)
scale=$(echo "scale=10; 1000000 / $total_reads" | bc)

bedtools genomecov -ibam sample.bam -bg -scale $scale > sample.cpm.bedgraph
```

## Sort bedGraph

bedGraph must be sorted for conversion to bigWig.

```bash
sort -k1,1 -k2,2n sample.bedgraph > sample.sorted.bedgraph
LC_ALL=C sort -k1,1 -k2,2n sample.bedgraph > sample.sorted.bedgraph
```

## Convert bedGraph to bigWig

### Using UCSC bedGraphToBigWig

```bash
bedGraphToBigWig sample.sorted.bedgraph chrom.sizes sample.bw
fetchChromSizes hg38 > hg38.chrom.sizes
bedGraphToBigWig sample.sorted.bedgraph hg38.chrom.sizes sample.bw
```

### Generate chrom.sizes

```bash
samtools faidx reference.fa
cut -f1,2 reference.fa.fai > chrom.sizes
fetchChromSizes hg38 > hg38.chrom.sizes
mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -e \
    "select chrom, size from hg38.chromInfo" > hg38.chrom.sizes
```

### Clip to Chromosome Boundaries

```bash
bedClip sample.bedgraph chrom.sizes sample.clipped.bedgraph
bedGraphToBigWig sample.clipped.bedgraph chrom.sizes sample.bw
```

## Convert bigWig to bedGraph

```bash
bigWigToBedGraph sample.bw sample.bedgraph
bigWigToBedGraph sample.bw sample.chr1.bedgraph -chrom=chr1
bigWigToBedGraph sample.bw sample.region.bedgraph -chrom=chr1 -start=1000 -end=2000
```

## Merge bedGraph Files

### Using bedtools unionbedg

```bash
bedtools unionbedg -i sample1.bedgraph sample2.bedgraph sample3.bedgraph \
    -header -names sample1 sample2 sample3 > merged.bedgraph
```

### Average Across Samples

```bash
bedtools unionbedg -i sample1.bedgraph sample2.bedgraph sample3.bedgraph | \
    awk '{sum=0; for(i=4;i<=NF;i++) sum+=$i; print $1,$2,$3,sum/(NF-3)}' OFS='\t' \
    > average.bedgraph
```

## Mathematical Operations

### bedtools map for Region Statistics

```bash
bedtools map -a regions.bed -b sample.bedgraph -c 4 -o mean > region_means.bed
bedtools map -a regions.bed -b sample.bedgraph -c 4 -o sum > region_sums.bed
bedtools map -a regions.bed -b sample.bedgraph -c 4 -o max > region_max.bed
```

### Subtract Background

```bash
bedtools unionbedg -i treatment.bedgraph input.bedgraph | \
    awk '{diff=$4-$5; if(diff<0) diff=0; print $1,$2,$3,diff}' OFS='\t' \
    > subtracted.bedgraph
```

### Log Transform

```bash
awk '{print $1,$2,$3,log($4+1)/log(2)}' OFS='\t' sample.bedgraph > sample.log2.bedgraph
```

### Smooth Signal

```bash
bedtools slop -i sample.bedgraph -g chrom.sizes -b 50 | \
    bedtools merge -i - -c 4 -o mean > smoothed.bedgraph
```

## Python with pyBigWig

### Write bedGraph

```python
import pyBigWig

bw = pyBigWig.open('output.bedgraph', 'w')
bw.addHeader([('chr1', 248956422), ('chr2', 242193529)])

chroms = ['chr1', 'chr1', 'chr1']
starts = [0, 100, 200]
ends = [100, 200, 300]
values = [1.5, 2.3, 0.8]
bw.addEntries(chroms, starts, ends=ends, values=values)
bw.close()
```

### Read bigWig to bedGraph Format

```python
import pyBigWig

bw = pyBigWig.open('sample.bw')

for chrom, size in bw.chroms().items():
    intervals = bw.intervals(chrom)
    if intervals:
        for start, end, value in intervals:
            print(f'{chrom}\t{start}\t{end}\t{value}')

bw.close()
```

### Convert bigWig Region to bedGraph

```python
import pyBigWig

bw = pyBigWig.open('sample.bw')
intervals = bw.intervals('chr1', 1000000, 2000000)

with open('region.bedgraph', 'w') as f:
    for start, end, value in intervals:
        f.write(f'chr1\t{start}\t{end}\t{value}\n')

bw.close()
```

## deepTools for Normalization

### bamCoverage (BAM to bedGraph/bigWig)

```bash
bamCoverage -b sample.bam -o sample.bw --normalizeUsing RPKM
bamCoverage -b sample.bam -o sample.bw --normalizeUsing CPM
bamCoverage -b sample.bam -o sample.bw --normalizeUsing BPM
bamCoverage -b sample.bam -o sample.bedgraph --outFileFormat bedgraph --normalizeUsing CPM
```

### bamCompare (Treatment vs Control)

```bash
bamCompare -b1 treatment.bam -b2 input.bam -o log2ratio.bw --scaleFactorsMethod readCount
bamCompare -b1 treatment.bam -b2 input.bam -o subtracted.bw --ratio subtract
```

### bigwigCompare

```bash
bigwigCompare -b1 treatment.bw -b2 input.bw -o ratio.bw --ratio log2
bigwigCompare -b
aav-vector-design-agentSkill
adaptyvSkill

Cloud laboratory platform for automated protein testing and validation. Use when designing proteins and needing experimental validation including binding assays, expression testing, thermostability measurements, enzyme activity assays, or protein sequence optimization. Also use for submitting experiments via API, tracking experiment status, downloading results, optimizing protein sequences for better expression using computational tools (NetSolP, SoluProt, SolubleMPNN, ESM), or managing protein design workflows with wet-lab validation.

adhd-daily-plannerSkill

Time-blind friendly planning, executive function support, and daily structure for ADHD brains. Specializes in realistic time estimation, dopamine-aware task design, and building systems that

aeonSkill

This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algorithms beyond standard ML approaches. Particularly suited for univariate and multivariate time series analysis with scikit-learn compatible APIs.

agent-browserSkill

Browse the web for any task — research topics, read articles, interact with web apps, fill forms, take screenshots, extract data, and test web pages. Use whenever a browser would be useful, not just when the user explicitly asks.

agentd-drug-discoverySkill
ai-analyzerSkill

AI驱动的综合健康分析系统,整合多维度健康数据、识别异常模式、预测健康风险、提供个性化建议。支持智能问答和AI健康报告生成。

alphafold-databaseSkill

Access AlphaFold's 200M+ AI-predicted protein structures. Retrieve structures by UniProt ID, download PDB/mmCIF files, analyze confidence metrics (pLDDT, PAE), for drug discovery and structural biology.