Skill2.9k repo starsupdated 7d ago

bio-causal-genomics-pleiotropy-detection

This skill detects and corrects horizontal pleiotropy in Mendelian randomization analyses through three complementary methods: MR-PRESSO identifies and removes pleiotropic outlier variants while testing whether their removal significantly changes causal estimates, MR-Egger regression tests for directional pleiotropy via the intercept term, and Steiger filtering validates variant directionality. Use this when validating causal inference results for pleiotropic bias, screening instruments for violations of the exclusion restriction assumption, or conducting sensitivity analyses on MR studies.

View source Repository: OpenClaw-Medical-Skills

Install in Claude Code

Copy

git clone --depth 1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills /tmp/bio-causal-genomics-pleiotropy-detection && cp -r /tmp/bio-causal-genomics-pleiotropy-detection/skills/bio-causal-genomics-pleiotropy-detection ~/.claude/skills/bio-causal-genomics-pleiotropy-detection

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

## Version Compatibility

Reference examples tested with: MR-PRESSO 1.0+, TwoSampleMR 0.5+

Before using code patterns, verify installed versions match. If versions differ:
- R: `packageVersion("<pkg>")` then `?function_name` to verify parameters

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.

# Pleiotropy Detection

**"Check my MR results for pleiotropic bias"** → Detect and correct for horizontal pleiotropy using outlier removal (MR-PRESSO), directional pleiotropy testing (MR-Egger intercept), and variant directionality filtering (Steiger) to validate causal inference results.
- R: `MRPRESSO::mr_presso()` for global and distortion tests
- R: `TwoSampleMR::mr_egger_regression()` for Egger intercept test

## Overview

Horizontal pleiotropy violates the exclusion restriction assumption of MR: instruments
affect the outcome through pathways other than the exposure. Detecting and correcting
for pleiotropy is essential for valid causal inference.

Types of pleiotropy:
- **Vertical** (mediated): Instrument -> exposure -> outcome (valid, not a problem)
- **Horizontal** (direct): Instrument -> outcome bypassing exposure (violates MR assumptions)
- **Balanced**: Pleiotropic effects cancel out (IVW still valid, Egger intercept ~0)
- **Directional**: Pleiotropic effects are systematic (biases IVW, Egger detects this)

## MR-PRESSO

**Goal:** Detect and remove pleiotropic outlier instruments from an MR analysis.

**Approach:** Run MR-PRESSO to test for global pleiotropy, identify individual outlier SNPs, test whether their removal changes the causal estimate (distortion test), and obtain a corrected estimate.

```r
# install.packages('remotes')
# remotes::install_github('rondolab/MR-PRESSO')
library(MRPRESSO)

# Input: harmonized data from TwoSampleMR
# Columns needed: beta.exposure, beta.outcome, se.exposure, se.outcome
presso_input <- data.frame(
  bx = dat$beta.exposure,
  by = dat$beta.outcome,
  bxse = dat$se.exposure,
  byse = dat$se.outcome
)

# --- Run MR-PRESSO ---
# NbDistribution: Number of simulations for null distribution (minimum 1000)
# SignifThreshold: P-value threshold for outlier detection (0.05 standard)
presso_result <- mr_presso(
  BetaOutcome = 'by', BetaExposure = 'bx',
  SdOutcome = 'byse', SdExposure = 'bxse',
  OUTLIERtest = TRUE, DISTORTIONtest = TRUE,
  data = presso_input,
  NbDistribution = 5000,
  SignifThreshold = 0.05
)

# --- Global test ---
# Tests whether there is any pleiotropy among instruments
# Significant p-value: Evidence of horizontal pleiotropy
global_p <- presso_result$`MR-PRESSO results`$`Global Test`$Pvalue
cat('Global test p-value:', global_p, '\n')

# --- Outlier test ---
# Identifies individual pleiotropic SNPs
outliers <- presso_result$`MR-PRESSO results`$`Outlier Test`
cat('\nOutlier test results:\n')
print(outliers)

# Outlier SNPs (p < 0.05)
outlier_indices <- which(outliers$Pvalue < 0.05)
cat('Outlier SNPs:', length(outlier_indices), '\n')

# --- Distortion test ---
# Tests whether removing outliers significantly changes the causal estimate
# Significant: Outliers were meaningfully biasing the estimate
distortion_p <- presso_result$`MR-PRESSO results`$`Distortion Test`$Pvalue
cat('Distortion test p-value:', distortion_p, '\n')

# --- Corrected estimate ---
# MR estimate after removing outlier SNPs
main_results <- presso_result$`Main MR results`
cat('\nRaw IVW estimate:', main_results$`Causal Estimate`[1], '\n')
cat('Corrected IVW estimate:', main_results$`Causal Estimate`[2], '\n')
```

## MR-Egger Diagnostics

**Goal:** Test for directional pleiotropy and obtain a pleiotropy-adjusted causal estimate.

**Approach:** Fit MR-Egger regression where the intercept estimates average pleiotropic bias, and check I-squared for instrument strength under the NOME assumption.

```r
library(TwoSampleMR)

# MR-Egger regression allows for a non-zero intercept
# The intercept estimates the average pleiotropic effect
egger <- mr_egger_regression(dat$beta.exposure, dat$beta.outcome,
                              dat$se.exposure, dat$se.outcome)

# --- Egger intercept ---
# Significant intercept (p < 0.05): Directional pleiotropy present
# Non-significant: No evidence (but low power with < 10 SNPs)
cat('Egger intercept:', round(egger$b_i, 5), '\n')
cat('Intercept SE:', round(egger$se_i, 5), '\n')
cat('Intercept p-value:', format.pval(egger$pval_i), '\n')

# --- Egger slope ---
# Valid causal estimate EVEN with directional pleiotropy (InSIDE assumption)
cat('\nEgger causal estimate:', round(egger$b, 4), '\n')
cat('Egger SE:', round(egger$se, 4), '\n')
cat('Egger p-value:', format.pval(egger$pval), '\n')

# --- I-squared for Egger ---
# I^2 measures instrument strength for MR-Egger specifically
# I^2 > 0.9: Egger estimate reliable
# I^2 < 0.6: Egger has low power, interpret with caution (NOME violation)
isq <- Isq(dat$beta.exposure, dat$se.exposure)
cat('\nI-squared:', round(isq, 3), '\n')
if (isq < 0.9) cat('Warning: I-squared < 0.9; Egger estimate may be unreliable (NOME violation)\n')
```

## Steiger Filtering

**Goal:** Verify that instruments act in the correct causal direction (exposure -> outcome, not reverse).

**Approach:** Apply the Steiger test to each instrument, remove those explaining more outcome variance than exposure variance, and re-run MR on filtered instruments.

```r
library(TwoSampleMR)

# Steiger test: Verify each instrument explains more variance in
# the exposure than the outcome. Instruments failing this test
# may act through a reverse causal pathway.

steiger <- steiger_filtering(dat)

# Keep only correctly oriented instruments
dat_steiger <- steiger[steiger$steiger_dir == TRUE, ]
cat('Instruments passing Steiger filter:', nrow(dat_steiger), 'of', nrow(steiger), '\n')

# Re-run MR with filtered instruments
results_steiger <- mr(dat_steiger)
print(results_steiger[, c('method', 'nsnp', 'b', 'se', 'pval')])

# Directionality tes

More from this repository

aav-vector-design-agentSkill

adaptyvSkill

Cloud laboratory platform for automated protein testing and validation. Use when designing proteins and needing experimental validation including binding assays, expression testing, thermostability measurements, enzyme activity assays, or protein sequence optimization. Also use for submitting experiments via API, tracking experiment status, downloading results, optimizing protein sequences for better expression using computational tools (NetSolP, SoluProt, SolubleMPNN, ESM), or managing protein design workflows with wet-lab validation.

adhd-daily-plannerSkill

Time-blind friendly planning, executive function support, and daily structure for ADHD brains. Specializes in realistic time estimation, dopamine-aware task design, and building systems that

aeonSkill

This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algorithms beyond standard ML approaches. Particularly suited for univariate and multivariate time series analysis with scikit-learn compatible APIs.

agent-browserSkill

Browse the web for any task — research topics, read articles, interact with web apps, fill forms, take screenshots, extract data, and test web pages. Use whenever a browser would be useful, not just when the user explicitly asks.

agentd-drug-discoverySkill

ai-analyzerSkill

AI驱动的综合健康分析系统，整合多维度健康数据、识别异常模式、预测健康风险、提供个性化建议。支持智能问答和AI健康报告生成。

alphafold-databaseSkill

Access AlphaFold's 200M+ AI-predicted protein structures. Retrieve structures by UniProt ID, download PDB/mmCIF files, analyze confidence metrics (pLDDT, PAE), for drug discovery and structural biology.