quickgo-database

Query EBI QuickGO REST API for GO terms and protein annotations. Fetch term metadata by ID, search by keyword, walk ancestor/descendant hierarchies, download annotations filtered by taxon, evidence code, aspect. Use for GO resolution, ontology traversal, annotation retrieval before enrichment. Use gseapy-gene-enrichment for enrichment; uniprot-protein-database for proteins.

Install in Claude Code
git clone --depth 1 https://github.com/jaechang-hits/SciAgent-Skills /tmp/quickgo-database && cp -r /tmp/quickgo-database/skills/genomics-bioinformatics/databases/quickgo-database ~/.claude/skills/quickgo-database
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# QuickGO Database

## Overview

QuickGO is the EBI's Gene Ontology annotation browser and REST API. It provides programmatic access to the GO ontology (terms, synonyms, hierarchies) and to the manually curated and electronic GO annotations for proteins across all species. The API is free, requires no authentication, and returns JSON responses. All endpoints live under `https://www.ebi.ac.uk/QuickGO/services/`.

## When to Use

- Resolving a GO term ID (e.g., `GO:0006915`) to its name, definition, and aspect (biological_process, molecular_function, cellular_component)
- Retrieving all GO annotations for a UniProt protein, filtered by evidence code and taxon
- Searching GO terms by keyword (e.g., "apoptosis") to find relevant term IDs before enrichment analysis
- Walking the GO DAG upward (ancestors) or downward (descendants) from a specific term
- Getting annotation counts stratified by evidence code or GO aspect for a set of proteins
- Resolving multiple GO IDs in one batch request to avoid looping over individual term lookups
- For enrichment analysis (ORA/GSEA) on a gene list use `gseapy-gene-enrichment`; QuickGO provides the raw annotation data
- For comprehensive protein function annotations in Swiss-Prot format use `uniprot-protein-database`

## Prerequisites

- **Python packages**: `requests`, `pandas`, `matplotlib`
- **Data requirements**: GO term IDs (`GO:XXXXXXX`) or UniProt accessions; taxon IDs (e.g., `9606` for human)
- **Environment**: internet connection; no API key required
- **Rate limits**: no published hard limit; use `time.sleep(1.0)` between requests in batch loops for polite access

```bash
pip install requests pandas matplotlib
```
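
The polite-access guideline above could be wrapped in a small helper. The sketch below is one possible approach, not part of QuickGO or of this skill: `chunked` and `throttled_map` are hypothetical names, and both the fetch function and the sleep function are passed in so the loop can be exercised without network access.

```python
import time
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def throttled_map(
    fetch: Callable[[T], R],
    items: Iterable[T],
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call `fetch` on each item, pausing `delay` seconds between calls."""
    results: List[R] = []
    for i, item in enumerate(items):
        if i > 0:
            sleep(delay)  # pause only between requests, not before the first
        results.append(fetch(item))
    return results
```

With a hypothetical `get_batch` function that resolves a list of IDs in one request, `throttled_map(get_batch, chunked(go_ids, 50))` would send one request per 50 IDs with a one-second pause between them. The chunk size of 50 is an arbitrary illustration, not a documented limit.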

## Quick Start

```python
import requests
import time

QUICKGO_BASE = "https://www.ebi.ac.uk/QuickGO/services"

def quickgo_get(endpoint: str, params: dict = None) -> dict:
    """Send a GET request to a QuickGO endpoint and return parsed JSON."""
    url = f"{QUICKGO_BASE}/{endpoint}"
    headers = {"Accept": "application/json"}
    r = requests.get(url, params=params, headers=headers, timeout=30)
    r.raise_for_status()
    return r.json()

# Fetch metadata for the apoptotic process GO term
result = quickgo_get("ontology/go/terms/GO:0006915")
term = result["results"][0]
print(f"ID     : {term['id']}")
print(f"Name   : {term['name']}")
print(f"Aspect : {term['aspect']}")
print(f"Def    : {term['definition']['text'][:100]}...")
# ID     : GO:0006915
# Name   : apoptotic process
# Aspect : biological_process
# Def    : A programmed cell death process which begins when a cell receives ...
```

## Core API

### Query 1: GO Term Lookup

Fetch term metadata — name, definition, aspect, synonyms, and is-obsolete status — for one or more GO IDs.

```python
import requests

QUICKGO_BASE = "https://www.ebi.ac.uk/QuickGO/services"

def get_go_term(go_id: str) -> dict:
    """Retrieve metadata for a single GO term by ID."""
    headers = {"Accept": "application/json"}
    r = requests.get(
        f"{QUICKGO_BASE}/ontology/go/terms/{go_id}",
        headers=headers, timeout=30
    )
    r.raise_for_status()
    results = r.json().get("results", [])
    return results[0] if results else {}

term = get_go_term("GO:0005515")
print(f"Name    : {term['name']}")
print(f"Aspect  : {term['aspect']}")
print(f"Obsolete: {term.get('isObsolete', False)}")
print(f"Synonyms: {[s['name'] for s in term.get('synonyms', [])[:3]]}")
# Name    : protein binding
# Aspect  : molecular_function
# Obsolete: False
# Synonyms: ['protein-protein interaction', 'protein binding activity']
```

```python
# Batch lookup: resolve multiple GO IDs in one request
go_ids = ["GO:0006915", "GO:0005515", "GO:0016020"]
ids_param = ",".join(go_ids)
r = requests.get(
    f"{QUICKGO_BASE}/ontology/go/terms/{ids_param}",
    headers={"Accept": "application/json"}, timeout=30
)
r.raise_for_status()
for t in r.json().get("results", []):
    print(f"{t['id']}  {t['aspect']:<25}  {t['name']}")
# GO:0006915  biological_process        apoptotic process
# GO:0005515  molecular_function        protein binding
# GO:0016020  cellular_component        membrane
```
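
Before sending a batch request it can help to normalize the ID list, and afterwards to check which IDs actually came back. The helpers below are a sketch under stated assumptions: `GO_ID_PATTERN`, `clean_go_ids`, and `missing_ids` are hypothetical names, the pattern encodes the `GO:XXXXXXX` format as exactly seven digits, and each returned term is assumed to carry the `id` field used in the batch example above.

```python
import re
from typing import Iterable, List

# Assumed format: "GO:" followed by seven digits, e.g. GO:0006915.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")


def clean_go_ids(raw_ids: Iterable[str]) -> List[str]:
    """Trim and upper-case IDs, drop malformed ones and duplicates, keep input order."""
    seen = set()
    cleaned = []
    for raw in raw_ids:
        go_id = raw.strip().upper()
        if GO_ID_PATTERN.fullmatch(go_id) and go_id not in seen:
            seen.add(go_id)
            cleaned.append(go_id)
    return cleaned


def missing_ids(requested: Iterable[str], results: Iterable[dict]) -> List[str]:
    """Return the requested IDs that do not appear as `id` in the results."""
    returned = {term.get("id") for term in results}
    return [go_id for go_id in requested if go_id not in returned]
```

An ID reported by `missing_ids` may be unknown to the service or, possibly, returned under a different primary ID; the sketch does not distinguish between these cases.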

### Query 2: Annotation Search

Retrieve GO annotations for a protein or a set of proteins. Filter by evidence code and taxon.

```python
import requests

QUICKGO_BASE = "https://www.ebi.ac.uk/QuickGO/services"

def get_protein_annotations(uniprot_id: str, evidence_codes: list = None,
                             limit: int = 100) -> list:
    """Fetch GO annotations for a UniProt protein."""
    params = {
        "geneProductId": f"UniProtKB:{uniprot_id}",
        "limit": limit,
        "page": 1,
    }
    if evidence_codes:
        params["evidenceCode"] = ",".join(evidence_codes)
    headers = {"Accept": "application/json"}
    r = requests.get(
        f"{QUICKGO_BASE}/annotation/search",
        params=params, headers=headers, timeout=30
    )
    r.raise_for_status()
    return r.json().get("results", [])

# Fetch experimental annotations for TP53 (P04637)
annotations = get_protein_annotations(
    "P04637",
    evidence_codes=["EXP", "IDA", "IPI", "IMP", "IGI", "IEP"]
)
print(f"Experimental annotations for TP53: {len(annotations)}")
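for ann in annotations[:4]:
    # Fall back to an empty string if goName is missing or null in the response
    go_name = ann.get("goName") or ""
    print(f"  {ann['goId']}  {go_name:<40}  {ann['evidenceCode']}")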
# Experimental annotations for TP53: 87
#   GO:0006977  DNA damage response, ...          IDA
#   GO:0043065  positive regulation of apoptosis  IMP
```
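
`get_protein_annotations` returns only the first page of results. When a query matches more hits than `limit`, the remaining pages have to be requested one at a time. The following is a minimal sketch, assuming the response's `numberOfHits` field gives the total match count and that `page` is 1-based as in the function above. `iter_annotation_pages` and `fetch_page` are hypothetical names, and the fetcher is injected so the paging logic can be tested offline.

```python
import math
import time
from typing import Callable, Dict, Iterator, List


def iter_annotation_pages(
    fetch_page: Callable[[Dict], Dict],
    params: Dict,
    limit: int = 100,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[List[dict]]:
    """Yield the `results` list of each page until all hits are consumed."""
    page = 1
    while True:
        data = fetch_page({**params, "limit": limit, "page": page})
        results = data.get("results", [])
        if results:
            yield results
        total_hits = data.get("numberOfHits")
        if total_hits is not None:
            last_page = math.ceil(total_hits / limit)
        else:
            # Without a total, stop at the first short (or empty) page.
            last_page = page if len(results) < limit else page + 1
        if not results or page >= last_page:
            break
        page += 1
        sleep(delay)  # polite pause between consecutive page requests
```

A real `fetch_page` would wrap `requests.get(f"{QUICKGO_BASE}/annotation/search", params=query, headers={"Accept": "application/json"}, timeout=30)` and return `r.json()`. The service may cap `limit` or the number of reachable pages, so very large result sets are worth checking against the QuickGO documentation.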

```python
# Annotations for a taxon (human, 9606) + specific GO term
params = {
    "goId": "GO:0006915",
    "taxonId": "9606",
    "evidenceCode": "EXP,IDA,IPI,IMP,IGI,IEP",
    "limit": 100,
    "page": 1,
}
r = requests.get(
    f"{QUICKGO_BASE}/annotation/search",
    params=params,
    headers={"Accept": "application/json"},
    timeout=30
)
r.raise_for_status()
data = r.json()
print(f"Total annotations: {data.get('numberOfHits', 'N/A')}")
print(f"Retrieved         : {len(data.get('results', []))}")
for ann in data.get("results", [])[:3]:
    print(f"  {ann.get('geneProductId')}  {ann.get('goId')}  {ann.get('evidenceCode')}")
```