Skill · 199 repo stars · updated 16d ago

clinpgx-database

Query the ClinPGx (formerly PharmGKB) REST API plus the CPIC PostgREST companion API for pharmacogenomic clinical annotations, CPIC/DPWG dosing guidelines, gene-drug pairs, variant-drug associations, FDA/EMA drug labels, and PGx pathways. Two-host architecture: api.clinpgx.org for annotation records, api.cpicpgx.org for genotype→recommendation lookups. No auth. For germline pathogenicity use clinvar-database; for somatic cancer PGx use cosmic-database or opentargets-database; for drug bioactivity use chembl-database-bioactivity.

Install in Claude Code
git clone --depth 1 https://github.com/jaechang-hits/SciAgent-Skills /tmp/clinpgx-database && cp -r /tmp/clinpgx-database/skills/genomics-bioinformatics/databases/clinpgx-database ~/.claude/skills/clinpgx-database
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# ClinPGx (PharmGKB) Pharmacogenomics Database

## Overview

PharmGKB rebranded as **ClinPGx** in 2024 and the API moved from `api.pharmgkb.org` to `api.clinpgx.org`. The old host now returns 404/405; every example here uses the new endpoints. Two complementary APIs are used together:

- **ClinPGx Data API** (`api.clinpgx.org/v1`) — record-style access to genes, drugs, variants, clinical annotations, guideline annotations, drug labels, and pathways. Responses wrap data as `{"data": [...], "status": "success"}`. Filters use dotted property paths (e.g. `relatedChemicals.name=clopidogrel`, `levelOfEvidence.term=1A`).
- **CPIC PostgREST API** (`api.cpicpgx.org/v1`) — relational lookup of genotype → drug recommendation rows. PostgREST filter syntax (`column=eq.value`, JSON `cs.{...}` for jsonb containment). Returns flat JSON arrays.

Use ClinPGx for *what is known* about a gene/drug/variant; use CPIC for *how to prescribe* given a phenotype. In short: ClinPGx for annotations, CPIC for recommendations.
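The two filter styles described above can be compared side by side without touching the network. The following is a minimal sketch using only the standard library; the helper names `clinpgx_url` and `cpic_url` are hypothetical (they are not part of either API), and the paths and filters are the ones quoted in this overview.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

CLINPGX = "https://api.clinpgx.org/v1"
CPIC = "https://api.cpicpgx.org/v1"


def clinpgx_url(record_type, filters, view="base"):
    """Hypothetical helper: ClinPGx /data URL with dotted property-path filters."""
    params = {**filters, "view": view}
    return f"{CLINPGX}/data/{record_type}?{urlencode(params)}"


def cpic_url(table, equals=None, contains=None):
    """Hypothetical helper: CPIC PostgREST URL.

    equals:   column -> value, rendered as column=eq.value
    contains: jsonb column -> dict, rendered as column=cs.{...}
    """
    params = {col: f"eq.{val}" for col, val in (equals or {}).items()}
    for col, obj in (contains or {}).items():
        # Compact separators keep the JSON identical to the form shown in this document
        params[col] = "cs." + json.dumps(obj, separators=(",", ":"))
    return f"{CPIC}/{table}?{urlencode(params)}"


# ClinPGx: plain key=value, where the key is a dotted path into the record
print(clinpgx_url("clinicalAnnotation", {"relatedChemicals.name": "clopidogrel"}))

# CPIC: the operator (eq., cs.) is a prefix of the value
url = cpic_url("recommendation",
               contains={"phenotypes": {"CYP2C19": "Poor Metabolizer"}})
print(url)
print(parse_qs(urlsplit(url).query))  # decoded view of the percent-encoded query
```

In practice `requests` performs the same encoding when these filters are passed through `params=`, so the helpers are mainly useful for logging or debugging the exact URL being requested.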

## When to Use

- Retrieving CPIC genotype-specific dosing recommendations for a gene-drug pair (e.g., CYP2C19 + clopidogrel) — use CPIC
- Looking up all pharmacogenomic clinical annotations for a drug or evidence level — use ClinPGx `data/clinicalAnnotation`
- Finding all CPIC/DPWG guideline annotations for a pharmacogene — use ClinPGx `data/guidelineAnnotation`
- Resolving a gene symbol, drug name, or rsID to ClinPGx PA identifiers — use `data/{gene,drug,variant}`
- Free-text search across all ClinPGx record types (genes, drugs, variants, annotations) — use `POST /site/search`
- Retrieving FDA/EMA pharmacogenomic drug label annotations — use ClinPGx `data/label`
- Building precision-medicine prescribing workflows that combine annotation evidence with phenotype-specific recommendations
- For germline disease pathogenicity (not PGx) use `clinvar-database`
- For somatic cancer pharmacogenomics use `cosmic-database` or `opentargets-database`

## Prerequisites

- **Python packages**: `requests`, `pandas` (both are preinstalled in most scientific Python environments)
- **Data requirements**: HGNC gene symbols, drug names (lowercase generic), dbSNP rsIDs, or PA identifiers
- **Environment**: internet connection; no authentication required for either host
- **Rate limits**: the ClinPGx host occasionally returns HTTP 429; pause 0.3 to 0.5 seconds between sequential calls (e.g. `time.sleep(0.4)`). CPIC is more permissive.
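One way to handle the occasional HTTP 429 is to retry with a growing pause rather than relying on a fixed sleep alone. The sketch below is an assumption, not documented ClinPGx behaviour: the helper name `call_with_backoff`, the retry count, and the doubling policy are all illustrative choices. It takes any zero-argument callable that returns an object with a `.status_code` attribute, so it works with `requests` responses without importing the library itself.

```python
import time


def call_with_backoff(fetch, retries=3, pause=0.4, sleep=time.sleep):
    """Hypothetical helper: call fetch() again while it returns HTTP 429.

    fetch:   zero-argument callable returning an object with .status_code
    retries: maximum number of extra attempts after the first call
    pause:   first wait in seconds; doubled after each 429 (assumed policy)
    sleep:   injectable for testing; defaults to time.sleep
    """
    response = fetch()
    for attempt in range(retries):
        if response.status_code != 429:
            break
        sleep(pause * (2 ** attempt))
        response = fetch()
    # May still be a 429 if every attempt was rate-limited; the caller decides what to do
    return response
```

With `requests`, a call might look like `call_with_backoff(lambda: requests.get(url, params=params, timeout=15))`, followed by `raise_for_status()` on the returned response.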

If you are inside a pixi/conda environment that already provides `requests` and `pandas`, skip the install — invoke scripts with `pixi run python ...`.

```bash
pip install requests pandas
```

## Quick Start

```python
import requests

CLINPGX = "https://api.clinpgx.org/v1"
CPIC    = "https://api.cpicpgx.org/v1"

# CPIC genotype → recommendation: clopidogrel + CYP2C19 Poor Metabolizer
drug = requests.get(f"{CPIC}/drug", params={"name": "eq.clopidogrel"}).json()[0]
recs = requests.get(f"{CPIC}/recommendation",
                    params={"drugid": f"eq.{drug['drugid']}",
                            "phenotypes": 'cs.{"CYP2C19":"Poor Metabolizer"}'}).json()
print(f"clopidogrel CYP2C19=PM: {len(recs)} recommendation(s)")
for rec in recs[:2]:
    print(f"  [{rec['classification']}] {rec['drugrecommendation'][:80]}…")

# ClinPGx side: how many CPIC guideline annotations cover CYP2C19?
glines = requests.get(f"{CLINPGX}/data/guidelineAnnotation",
                      params={"relatedGenes.symbol": "CYP2C19",
                              "source": "CPIC", "view": "base"}).json()["data"]
print(f"CYP2C19 CPIC guidelines: {len(glines)}")
```

## Core API

### Module 1: Free-text site search

`POST /site/search` with a JSON body `{"query": "<term>"}` is the canonical entry point when you don't know the PA ID. It searches across drugs, genes, variants, clinical annotations, guideline annotations, and labels in one shot.

```python
import requests

CLINPGX = "https://api.clinpgx.org/v1"

r = requests.post(f"{CLINPGX}/site/search",
                  json={"query": "rs4149056"}, timeout=15)
r.raise_for_status()
payload = r.json()["data"]
hits = payload["hits"]
print(f"Total hits: {payload['total']}")
for h in hits[:5]:
    # name may be missing or null; fall back to "" before slicing
    print(f"  id={h.get('id')}  name={(h.get('name') or '')[:80]}")
```

```python
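# Broader concept search (reuses `requests` and CLINPGX from the block above)
r = requests.post(f"{CLINPGX}/site/search",
                  json={"query": "TPMT azathioprine"}, timeout=15)
r.raise_for_status()
hits = r.json()["data"]["hits"]
print(f"TPMT+azathioprine hits: {len(hits)}")
for h in hits[:5]:
    # str() keeps the >15 alignment valid if id is missing; `or ''` guards the slice
    print(f"  {str(h.get('id')):>15}  {(h.get('name') or '')[:80]}")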
```

### Module 2: Gene, drug, and variant record lookup

The `/data/{type}` endpoints accept simple property filters. All return `{"data": [...], "status": "success"}` — use `view=base` for summary, `view=max` for full nested objects.

```python
import requests

CLINPGX = "https://api.clinpgx.org/v1"

# Gene by HGNC symbol
gene = requests.get(f"{CLINPGX}/data/gene",
                    params={"symbol": "CYP2D6", "view": "base"}).json()["data"][0]
print(f"{gene['symbol']}  id={gene['id']}  {gene['name']}")

# Drug by name (lowercase generic preferred)
drug = requests.get(f"{CLINPGX}/data/drug",
                    params={"name": "warfarin", "view": "base"}).json()["data"][0]
print(f"{drug['name']}  id={drug['id']}")

# Variant by rsID
var = requests.get(f"{CLINPGX}/data/variant",
                   params={"name": "rs4149056", "view": "base"}).json()["data"][0]
print(f"{var['name']}  id={var['id']}  significance={var.get('clinicalSignificance')}")
```

```python
# Direct record fetch when you already have a PA ID
r = requests.get(f"{CLINPGX}/data/drug/PA449088", params={"view": "max"}).json()
d = r["data"]
print(f"PA449088 → {d['name']}  (objCls={d['objCls']})")
```
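The lookups above index `["data"][0]` directly, which raises `IndexError` if a filter matches nothing. A small unwrapping helper can make that case explicit. This is a sketch under assumptions: the name `first_record` is hypothetical, and it assumes any `status` other than `"success"` signals a failure. How the API actually responds to a no-match query is not stated here; it may return an HTTP error instead of an empty list, in which case `raise_for_status()` on the response would surface it first.

```python
def first_record(payload):
    """Hypothetical helper: first record of a ClinPGx envelope, or None if empty.

    Accepts both shapes seen in this section:
      list envelope    {"data": [...], "status": "success"}  (filtered lookups)
      single envelope  {"data": {...}, "status": "success"}  (direct PA ID fetch)
    """
    status = payload.get("status")
    if status != "success":
        # Assumption: anything other than "success" means the request failed
        raise ValueError(f"ClinPGx request failed: status={status!r}")
    data = payload.get("data")
    if isinstance(data, list):
        return data[0] if data else None
    return data


# Placeholder payloads for illustration only (not real API output)
print(first_record({"data": [{"id": "PA-EXAMPLE"}], "status": "success"}))
print(first_record({"data": [], "status": "success"}))
```

Used with the gene lookup above, this would read `gene = first_record(requests.get(...).json())`, followed by an `if gene is None:` check before accessing fields.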

### Module 3: Clinical annotations

`data/clinicalAnnotation` records associate a variant (`location`) with one or more drugs (`relatedChemicals`) and an evidence level (`levelOfEvidence.term`). The two supported filters are `relatedChemicals.name=` and `levelOfEvidence.term=`. There is **no working `