
mouse-phenome-database

The Mouse Phenome Database skill queries the Jackson Laboratory's REST API to retrieve standardized phenotype measurements across 520+ projects of inbred and recombinant mouse strains, including individual animal values, strain-level means, and metadata filtered by ontology terms. Use it to select mouse models with extreme phenotypes, compare strains for QTL mapping or genetic background studies, discover phenotype projects via ontology searching, or validate strain nomenclature and gene annotations.

Source repository: SciAgent-Skills

Install in Claude Code

git clone --depth 1 https://github.com/jaechang-hits/SciAgent-Skills /tmp/mouse-phenome-database && cp -r /tmp/mouse-phenome-database/skills/genomics-bioinformatics/databases/mouse-phenome-database ~/.claude/skills/mouse-phenome-database

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# mouse-phenome-database

## Overview

The Mouse Phenome Database (MPD), maintained at the Jackson Laboratory, catalogs standardized phenotype measurements across inbred, recombinant inbred (e.g., BXD), and Collaborative Cross / Diversity Outbred mouse panels. It aggregates 520+ projects spanning metabolic, cardiovascular, behavioral, hematological, and immunological traits. The REST API at `https://phenome.jax.org/api` is free, requires no authentication, and is documented at <https://phenome.jax.org/about/api>. MPD measurement IDs (`measnum`) are project-scoped 5-digit integers. There is no global "measnum 10001 = body weight" mapping, so valid measnums must be discovered per project via the `measureinfo` endpoint.
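
Because measnums are project-scoped, a lookup by variable name is usually the first step. The following is a minimal sketch, assuming the `measures_info` entries carry the `measnum` and `varname` fields used in the Quick Start. `find_measnum` is a hypothetical helper, not part of the MPD API, and the sample records are illustrative stand-ins for a real response.

```python
from __future__ import annotations


def find_measnum(measures_info: list[dict], varname: str) -> int | None:
    """Return the measnum whose varname matches (case-insensitive), else None."""
    wanted = varname.strip().lower()
    for m in measures_info:
        if str(m.get("varname", "")).lower() == wanted:
            return int(m["measnum"])
    return None


# Hypothetical records shaped like entries of `measures_info`.
# Only the HR/15101 pairing comes from this document; the second row is made up.
sample = [
    {"measnum": 15101, "varname": "HR"},
    {"measnum": 15102, "varname": "EXAMPLE_VAR"},
]

print(find_measnum(sample, "hr"))
print(find_measnum(sample, "bw"))
```

In practice the list would come from `requests.get(f"{MPD}/pheno/measureinfo/<projsym>").json()["measures_info"]`, and a `None` result means the variable is not measured in that project.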

## When to Use

- Selecting inbred strains with extreme phenotypes (highest/lowest fasted glucose, body weight, heart rate, etc.) as experimental models
- Pulling individual-animal data from BXD / CC / DO panels for QTL mapping with R/qtl2 or similar tools
- Comparing strain means and variance across metabolic, behavioral, or cardiovascular measures for genetic background studies
- Finding MPD projects that measure a trait of interest using ontology terms (MP, VT, MA) or free-text descriptions
- Validating mouse strain nomenclature (canonical JAX names ↔ stock numbers ↔ MGI IDs) before submitting orders or analyses
- Looking up coordinates and annotations for mouse genes in the MPD/MGI cross-reference
- Use `monarch-database` instead for disease-gene-phenotype knowledge graphs (HPO ↔ MP ↔ disease)
- Use `ensembl-database` instead for transcript-level mouse gene annotations and variant consequence prediction

## Prerequisites

- **Python packages**: `requests`, `pandas`, `matplotlib`
- **Data requirements**: a project symbol (e.g., `Jaxwest1`, `Auwerx1`) or a measnum (e.g., `15101`); strain names follow JAX canonical nomenclature (e.g., `C57BL/6J`, `DBA/2J`)
- **Environment**: internet connection; no API key required
- **Rate limits**: no published hard limit; keep bursts under ~5 requests/second and add `time.sleep(0.3)` between requests in loops

```bash
pip install requests pandas matplotlib
```
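
The rate-limit guidance above can be wrapped in a small helper so that loops do not need scattered `time.sleep` calls. This is a sketch only: `Throttle` is a hypothetical class, the 0.3 s default mirrors the suggestion in this section rather than any published MPD limit, and the actual HTTP call is left as a comment so the example runs offline.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval: float = 0.3):
        self.min_interval = min_interval
        self._last = None  # monotonic timestamp of the previous call

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        self._last = time.monotonic()
        return slept


throttle = Throttle(min_interval=0.3)
for projsym in ["Jaxwest1", "Auwerx1"]:
    throttle.wait()
    # requests.get(f"https://phenome.jax.org/api/projects/{projsym}/strains", timeout=30)
    print("would request", projsym)
```

Using `time.monotonic()` rather than `time.time()` keeps the interval correct even if the system clock is adjusted mid-run.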

## Quick Start

```python
import requests

MPD = "https://phenome.jax.org/api"

# 1) Pick a project (Jaxwest1 — cardiovascular phenotyping on inbred panel)
r = requests.get(f"{MPD}/projects/Jaxwest1/strains", timeout=30)
strains = r.json()["strains"]
print(f"Jaxwest1: {len(strains)} strains tested")

# 2) Discover its measures
r = requests.get(f"{MPD}/pheno/measureinfo/Jaxwest1", timeout=30)
measures = r.json()["measures_info"]
print(f"Jaxwest1 measures: {len(measures)}; first: measnum={measures[0]['measnum']} "
      f"varname={measures[0]['varname']}  ({measures[0]['descrip']}, {measures[0]['units']})")

# 3) Pull strain means for heart rate (varname=HR, measnum=15101)
r = requests.get(f"{MPD}/pheno/strainmeans/15101", timeout=30)
sm = r.json()["strainmeans"]
print(f"\nHeart rate strain means: {len(sm)} rows  (one per strain × sex)")
top = sorted(sm, key=lambda x: x["mean"], reverse=True)[:5]
for s in top:
    print(f"  {s['strain']:<20}  sex={s['sex']}  mean={s['mean']:.0f} {s.get('varname','')}  n={s['nmice']}")
```
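
The Quick Start sorts every row together, which mixes sexes and fails if a `mean` is missing. The sketch below shows one way to pick extreme strains more defensively. `extreme_strains` is a hypothetical helper, the rows are invented placeholders, and the `"f"`/`"m"` sex codes are an assumption about the `strainmeans` payload.

```python
def extreme_strains(rows, n=3, sex=None, lowest=False):
    """Return the n rows with the highest (or lowest) mean.

    Rows without a mean are skipped; `sex` optionally restricts the ranking.
    """
    usable = [
        r for r in rows
        if r.get("mean") is not None and (sex is None or r.get("sex") == sex)
    ]
    return sorted(usable, key=lambda r: r["mean"], reverse=not lowest)[:n]


# Invented rows using the field names from the Quick Start (strain, sex, mean, nmice).
rows = [
    {"strain": "StrainA", "sex": "f", "mean": 610.0, "nmice": 8},
    {"strain": "StrainA", "sex": "m", "mean": 590.0, "nmice": 8},
    {"strain": "StrainB", "sex": "f", "mean": 720.0, "nmice": 7},
    {"strain": "StrainB", "sex": "m", "mean": None, "nmice": 0},
    {"strain": "StrainC", "sex": "f", "mean": 655.0, "nmice": 9},
    {"strain": "StrainC", "sex": "m", "mean": 700.0, "nmice": 9},
]

for r in extreme_strains(rows, n=2, sex="f"):
    print(r["strain"], r["sex"], r["mean"])
```

Ranking within one sex avoids selecting a strain as "extreme" only because of a sex difference in the trait.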

## Core API

### Module 1: Browse Projects — `/projects`

Lists all MPD projects with full metadata. Filter via `investigator`, `projsym`, `projid`, `mpdsector`, `largecollab`, `panelsym`. Use `/project_filters/{filtername}` to see the allowed values of `mpdsector`, `largecollab`, or `panelsym` before filtering.

```python
import requests, pandas as pd

MPD = "https://phenome.jax.org/api"

# List allowed panel symbols (e.g., BXD, CC, DO)
filters = requests.get(f"{MPD}/project_filters/panelsym", timeout=30).json()
print(f"Available panels ({filters['count']}):", [t['term'] for t in filters['terms']][:10])

# All projects in the BXD recombinant inbred panel
r = requests.get(f"{MPD}/projects", params={"panelsym": "BXD"}, timeout=30)
projects = r.json()["projects"]
print(f"BXD projects: {len(projects)}")
df = pd.DataFrame([{
    "projsym": p["projsym"],
    "pi": (p.get("pistring") or "")[:40],
    "nstrains": p.get("nstrains"),
    "ages": p.get("ages"),
    "sector": p.get("mpdsector"),
    "title": (p.get("title") or "")[:60],
} for p in projects])
print(df.head(10).to_string(index=False))
```

```python
# Filter by MPD sector — komp, pheno, qtla, snp, onestrain, phenoarchive
r = requests.get(f"{MPD}/projects", params={"mpdsector": "qtla"}, timeout=30)
qtl_projects = r.json()["projects"]
print(f"QTL-archive projects: {len(qtl_projects)}")
for p in qtl_projects[:5]:
    print(f"  {p['projsym']:<15} panel={p.get('panelsym') or '--':<6} nstrains={str(p.get('nstrains') or '--'):>4}  {(p.get('title') or '')[:55]}")
```
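
Since `/project_filters/{filtername}` lists the allowed values, a filter value can be checked before it is sent to `/projects`. The following is a minimal sketch, assuming the payload has the `count` and `terms[].term` shape used in the example above. `check_filter_value` is a hypothetical helper and the payload here is a hand-written stand-in for a real response.

```python
def check_filter_value(payload: dict, value: str) -> str:
    """Return the canonical term matching `value` (case-insensitive).

    Raises ValueError if the value is not among the allowed terms.
    """
    terms = [t["term"] for t in payload.get("terms", [])]
    for term in terms:
        if term.lower() == value.lower():
            return term
    raise ValueError(f"{value!r} is not an allowed term: {terms}")


# Hand-written stand-in for the JSON returned by /project_filters/panelsym.
payload = {"count": 3, "terms": [{"term": "BXD"}, {"term": "CC"}, {"term": "DO"}]}

params = {"panelsym": check_filter_value(payload, "bxd")}
print(params)  # {'panelsym': 'BXD'}
```

The resulting `params` dict can be passed directly as `requests.get(f"{MPD}/projects", params=params, timeout=30)`.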

### Module 2: Project Detail — `/projects/{projsym}/...`

Each project has sub-resources for its dataset (CSV of every animal × every measure), the strain panel it tested, the publications it produced, and (for QTL projects) the genetic markers used.

```python
import requests, io, pandas as pd

MPD = "https://phenome.jax.org/api"

# Full per-animal dataset as CSV (default). Use json=yes for JSON.
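r = requests.get(f"{MPD}/projects/Jaxwest1/dataset", timeout=60)
r.raise_for_status()  # fail fast instead of parsing an error page as CSV
df = pd.read_csv(io.StringIO(r.text))
print(f"Jaxwest1 dataset: {df.shape[0]} animals × {df.shape[1]} columns")
print(df.columns[:12].tolist())
# Select only the columns that are actually present to avoid a KeyError
cols = [c for c in ["strain", "sex", "animal_id", "HR", "QRS", "bw"] if c in df.columns]
print(df[cols].head(5).to_string(index=False))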
```
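
The per-animal dataset can be reduced to strain-by-sex means locally. The sketch below uses only the standard library so it runs without a network connection or pandas; with a DataFrame, `df.groupby(["strain", "sex"])[measure].agg(["mean", "count"])` would do the same job. `strain_sex_means` is a hypothetical helper, and the CSV text is an invented sample that assumes the `strain`, `sex`, `animal_id`, and `HR` columns shown above.

```python
import csv
import io
import statistics
from collections import defaultdict


def strain_sex_means(csv_text: str, measure: str) -> dict:
    """Map (strain, sex) -> (mean, n) for one measure column, skipping blanks."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(measure) or "").strip()
        if not raw:
            continue  # animal has no value for this measure
        try:
            groups[(row["strain"], row["sex"])].append(float(raw))
        except ValueError:
            continue  # non-numeric entry
    return {key: (statistics.fmean(vals), len(vals)) for key, vals in groups.items()}


# Invented sample in the shape of a project dataset CSV.
csv_text = """strain,sex,animal_id,HR
StrainA,f,1,600
StrainA,f,2,620
StrainA,m,3,
StrainB,f,4,700
"""

summary = strain_sex_means(csv_text, "HR")
for (strain, sex), (mean, n) in sorted(summary.items()):
    print(f"{strain:<10} sex={sex} mean={mean:.1f} n={n}")
```

Groups with no usable values are omitted from the result rather than reported with a mean of zero.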

```python
# Strains tested in a project + publication list
strains = requests.get(f"{MPD}/projects/Jaxwest1/strains", timeout=30).json()
print(f"Jaxwest1 strains ({strains['count']}):")
for s in strains["strains"][:5]:
    print(f"  {s['strainname']:<20}  stock={s['stocknum']}  vendor={s['vendor']}")

pubs = requests.get(f"{MPD}/projects/Jaxwest1/publications", timeout=30).json()
print(f"\nPublications: {pubs['count']}")
```

### Module 3: Measure Discovery — `/pheno/measureinfo/{selector}`

This is the canonical way to discover valid `measnum` values. The selector
