Skill · 199 repo stars · updated 16d ago

mouse-phenome-database

Retrieve mouse phenotype data from the Jackson Laboratory Mouse Phenome Database (MPD) via its REST API. Browse 520+ projects, look up per-project measure metadata, pull strain-level means (raw or LS-mean adjusted) and per-animal values, find measures by MP/VT ontology terms, and resolve strain nomenclature or gene coordinates. Use for QTL support, cross-strain comparison, mouse model selection, and ontology-driven phenotype discovery. Use monarch-database for disease-gene-phenotype knowledge graphs; ensembl-database for mouse genome annotations.

Install in Claude Code
git clone --depth 1 https://github.com/jaechang-hits/SciAgent-Skills /tmp/mouse-phenome-database && cp -r /tmp/mouse-phenome-database/skills/genomics-bioinformatics/databases/mouse-phenome-database ~/.claude/skills/mouse-phenome-database
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# mouse-phenome-database

## Overview

The Mouse Phenome Database (MPD), maintained at the Jackson Laboratory, catalogs standardized phenotype measurements across inbred, recombinant inbred (e.g., BXD), and Collaborative Cross / Diversity Outbred mouse panels. It aggregates 520+ projects spanning metabolic, cardiovascular, behavioral, hematological, and immunological traits. The REST API at `https://phenome.jax.org/api` is free, requires no authentication, and is documented at <https://phenome.jax.org/about/api>. MPD measurement IDs (`measnum`) are project-scoped 5-digit integers — there is no global "measnum 10001 = body weight" mapping; valid measnums must be discovered per project via the `measureinfo` endpoint.

## When to Use

- Selecting inbred strains with extreme phenotypes (highest/lowest fasted glucose, body weight, heart rate, etc.) as experimental models
- Pulling individual-animal data from BXD / CC / DO panels for QTL mapping with R/qtl2 or similar tools
- Comparing strain means and variance across metabolic, behavioral, or cardiovascular measures for genetic background studies
- Finding MPD projects that measure a trait of interest using ontology terms (MP, VT, MA) or free-text descriptions
- Validating mouse strain nomenclature (canonical JAX names ↔ stock numbers ↔ MGI IDs) before submitting orders or analyses
- Looking up coordinates and annotations for mouse genes in the MPD/MGI cross-reference
- Use `monarch-database` instead for disease-gene-phenotype knowledge graphs (HPO ↔ MP ↔ disease)
- Use `ensembl-database` instead for transcript-level mouse gene annotations and variant consequence prediction

## Prerequisites

- **Python packages**: `requests`, `pandas`, `matplotlib`
- **Data requirements**: a project symbol (e.g., `Jaxwest1`, `Auwerx1`) or a measnum (e.g., `15101`); strain names follow JAX canonical nomenclature (e.g., `C57BL/6J`, `DBA/2J`)
- **Environment**: internet connection; no API key required
- **Rate limits**: no published hard limit; keep bursts under ~5 requests/second and add `time.sleep(0.3)` between requests in loops

```bash
pip install requests pandas matplotlib
```
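The pacing advice above can be wrapped in a small decorator instead of scattering `time.sleep` calls through loops. This is a minimal sketch, not part of MPD or `requests`: `throttled` is a hypothetical helper name, and the 0.3 s default simply mirrors the interval suggested in the rate-limit note.

```python
import time
from functools import wraps


def throttled(min_interval=0.3, clock=time.monotonic, sleep=time.sleep):
    """Decorator factory: space successive calls at least `min_interval` seconds apart.

    `clock` and `sleep` are injectable so the pacing logic can be checked
    without waiting in real time. (Hypothetical helper, for illustration.)
    """
    def decorate(fn):
        last_call = [None]  # start time of the previous call; None before the first

        @wraps(fn)
        def wrapper(*args, **kwargs):
            if last_call[0] is not None:
                wait = min_interval - (clock() - last_call[0])
                if wait > 0:
                    sleep(wait)
            last_call[0] = clock()
            return fn(*args, **kwargs)

        return wrapper
    return decorate


# Usage idea (commented out so this block runs without network access):
#   import requests
#   polite_get = throttled(0.3)(requests.get)
#   r = polite_get("https://phenome.jax.org/api/projects", timeout=30)
```

Wrapping the fetch function keeps the pacing in one place, so every loop that calls `polite_get` is rate-limited without further changes.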

## Quick Start

```python
import requests

MPD = "https://phenome.jax.org/api"

# 1) Pick a project (Jaxwest1 — cardiovascular phenotyping on inbred panel)
r = requests.get(f"{MPD}/projects/Jaxwest1/strains", timeout=30)
strains = r.json()["strains"]
print(f"Jaxwest1: {len(strains)} strains tested")

# 2) Discover its measures
r = requests.get(f"{MPD}/pheno/measureinfo/Jaxwest1", timeout=30)
measures = r.json()["measures_info"]
print(f"Jaxwest1 measures: {len(measures)}; first: measnum={measures[0]['measnum']} "
      f"varname={measures[0]['varname']}  ({measures[0]['descrip']}, {measures[0]['units']})")

# 3) Pull strain means for heart rate (varname=HR, measnum=15101)
r = requests.get(f"{MPD}/pheno/strainmeans/15101", timeout=30)
sm = r.json()["strainmeans"]
print(f"\nHeart rate strain means: {len(sm)} rows  (one per strain × sex)")
top = sorted(sm, key=lambda x: x["mean"], reverse=True)[:5]
for s in top:
    print(f"  {s['strain']:<20}  sex={s['sex']}  mean={s['mean']:.0f} {s.get('varname','')}  n={s['nmice']}")
```
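Step 3 above hard-codes `15101` and ranks rows across both sexes at once. The sketch below shows one way to look up a measnum by `varname` from the `measures_info` list and to rank strain means within each sex. `measnum_for` and `top_by_sex` are hypothetical helper names, and the demo rows are made up for illustration; only the field names (`measnum`, `varname`, `strain`, `sex`, `mean`) come from the Quick Start, and the sex codes used here are an assumption.

```python
from collections import defaultdict


def measnum_for(measures, varname):
    """Return the measnum whose varname matches (case-insensitive), else None."""
    wanted = varname.lower()
    for m in measures:
        if str(m.get("varname", "")).lower() == wanted:
            return m["measnum"]
    return None


def top_by_sex(rows, k=5):
    """Group strain-mean rows by sex and keep the k highest means per group."""
    groups = defaultdict(list)
    for row in rows:
        if row.get("mean") is not None:  # skip rows without a usable mean
            groups[row["sex"]].append(row)
    return {
        sex: sorted(group, key=lambda r: r["mean"], reverse=True)[:k]
        for sex, group in groups.items()
    }


# Made-up data in the shape the Quick Start reads; not real MPD values.
demo_measures = [
    {"measnum": 15101, "varname": "HR"},
    {"measnum": 99999, "varname": "demo_var"},  # fake entry for illustration
]
demo_rows = [
    {"strain": "StrainA", "sex": "f", "mean": 600.0},
    {"strain": "StrainB", "sex": "f", "mean": 650.0},
    {"strain": "StrainA", "sex": "m", "mean": 580.0},
    {"strain": "StrainB", "sex": "m", "mean": None},
]

print(measnum_for(demo_measures, "hr"))
for sex, ranked in top_by_sex(demo_rows, k=1).items():
    print(sex, [r["strain"] for r in ranked])
```

With live data, `measures` would be `r.json()["measures_info"]` and `rows` would be `r.json()["strainmeans"]`, as in the Quick Start.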

## Core API

### Module 1: Browse Projects — `/projects`

Lists all MPD projects with full metadata. Filter via `investigator`, `projsym`, `projid`, `mpdsector`, `largecollab`, `panelsym`. Use `/project_filters/{filtername}` to see the allowed values of `mpdsector`, `largecollab`, or `panelsym` before filtering.

```python
import requests, pandas as pd

MPD = "https://phenome.jax.org/api"

# List allowed panel symbols (e.g., BXD, CC, DO)
filters = requests.get(f"{MPD}/project_filters/panelsym", timeout=30).json()
print(f"Available panels ({filters['count']}):", [t['term'] for t in filters['terms']][:10])

# All projects in the BXD recombinant inbred panel
r = requests.get(f"{MPD}/projects", params={"panelsym": "BXD"}, timeout=30)
projects = r.json()["projects"]
print(f"BXD projects: {len(projects)}")
df = pd.DataFrame([{
    "projsym": p["projsym"],
    # `or ""` guards against a null pistring as well as a missing key
    "pi": (p.get("pistring") or "")[:40],
    "nstrains": p.get("nstrains"),
    "ages": p.get("ages"),
    "sector": p.get("mpdsector"),
    "title": (p.get("title") or "")[:60],
} for p in projects])
print(df.head(10).to_string(index=False))
```

```python
# Filter by MPD sector — komp, pheno, qtla, snp, onestrain, phenoarchive
r = requests.get(f"{MPD}/projects", params={"mpdsector": "qtla"}, timeout=30)
qtl_projects = r.json()["projects"]
print(f"QTL-archive projects: {len(qtl_projects)}")
for p in qtl_projects[:5]:
    print(f"  {p['projsym']:<15} panel={p.get('panelsym') or '--':<6} nstrains={str(p.get('nstrains') or '--'):>4}  {(p.get('title') or '')[:55]}")
```
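Checking a filter value against the `/project_filters/{filtername}` payload before calling `/projects` can catch typos early. The sketch below assumes the payload shape read in the first example of this module (a `terms` list whose items carry a `term` key); `validated_params` is a hypothetical helper, and the demo payload is hand-written rather than fetched from MPD.

```python
def validated_params(name, value, filter_payload):
    """Return {name: value} if value is among the payload's terms, else raise ValueError."""
    allowed = [t["term"] for t in filter_payload.get("terms", [])]
    if value not in allowed:
        raise ValueError(f"{value!r} is not a valid {name}; allowed: {allowed}")
    return {name: value}


# Hand-written payload for illustration only (not fetched from MPD).
demo_payload = {"count": 3, "terms": [{"term": "BXD"}, {"term": "CC"}, {"term": "DO"}]}

params = validated_params("panelsym", "BXD", demo_payload)
print(params)
```

The returned dict can be passed directly as `params=` to `requests.get(f"{MPD}/projects", ...)`.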

### Module 2: Project Detail — `/projects/{projsym}/...`

Each project has sub-resources for its dataset (CSV of every animal × every measure), the strain panel it tested, the publications it produced, and (for QTL projects) the genetic markers used.

```python
import requests, io, pandas as pd

MPD = "https://phenome.jax.org/api"

# Full per-animal dataset as CSV (default). Use json=yes for JSON.
r = requests.get(f"{MPD}/projects/Jaxwest1/dataset", timeout=60)
df = pd.read_csv(io.StringIO(r.text))
print(f"Jaxwest1 dataset: {df.shape[0]} animals × {df.shape[1]} columns")
print(df.columns[:12].tolist())
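# Preview only the columns that actually exist, so a missing name does not raise KeyError
preview_cols = [c for c in ["strain", "sex", "animal_id", "HR", "QRS", "bw"] if c in df.columns]
print(df[preview_cols].head(5).to_string(index=False))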
```
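Once the per-animal CSV is in hand, strain × sex summaries can be computed locally. The sketch below uses only the standard library so it runs without pandas; with a DataFrame, `df.groupby(["strain", "sex"])["HR"].agg(["mean", "count"])` would be the rough equivalent. `strain_sex_means` is a hypothetical helper, the column names `strain`, `sex`, and `HR` are taken from the example above, and the CSV text is made up for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def strain_sex_means(csv_text, column):
    """Return {(strain, sex): (mean, n)} for one numeric column of a per-animal CSV."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            continue  # skip blank or non-numeric cells
        groups[(row["strain"], row["sex"])].append(value)
    return {key: (mean(vals), len(vals)) for key, vals in groups.items()}


# Made-up CSV in the assumed layout; not real MPD data.
demo_csv = """strain,sex,animal_id,HR
StrainA,f,1,600
StrainA,f,2,620
StrainA,m,3,580
StrainB,f,4,
"""

summary = strain_sex_means(demo_csv, "HR")
for (strain, sex), (avg, n) in sorted(summary.items()):
    print(f"{strain:<10} sex={sex}  mean={avg:.1f}  n={n}")
```

Against the live API, `csv_text` would be `r.text` from the `/projects/{projsym}/dataset` request shown above.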

```python
# Strains tested in a project + publication list
strains = requests.get(f"{MPD}/projects/Jaxwest1/strains", timeout=30).json()
print(f"Jaxwest1 strains ({strains['count']}):")
for s in strains["strains"][:5]:
    print(f"  {s['strainname']:<20}  stock={s['stocknum']}  vendor={s['vendor']}")

pubs = requests.get(f"{MPD}/projects/Jaxwest1/publications", timeout=30).json()
print(f"\nPublications: {pubs['count']}")
```

### Module 3: Measure Discovery — `/pheno/measureinfo/{selector}`

This is the canonical way to discover valid `measnum` values. The selector