chembl-database
The chembl-database skill enables programmatic queries of the ChEMBL database of bioactive molecules, which contains over 2 million compounds and 19 million bioactivity measurements. Use this skill when searching for drug compounds by structure or name, retrieving bioactivity data such as IC50 values, exploring target protein information, identifying inhibitors for specific molecular targets, or conducting structure-activity relationship studies in medicinal chemistry and drug discovery research.
git clone --depth 1 https://github.com/beita6969/ScienceClaw /tmp/chembl-database && cp -r /tmp/chembl-database/skills/chembl-database ~/.claude/skills/chembl-database
# ChEMBL Database
## Overview
ChEMBL is a manually curated database of bioactive molecules maintained by the European Bioinformatics Institute (EBI), containing over 2 million compounds, 19 million bioactivity measurements, 13,000+ drug targets, and data on approved drugs and clinical candidates. Access and query this data programmatically using the ChEMBL Python client for drug discovery and medicinal chemistry research.
## When to Use This Skill
Use this skill for:
- **Compound searches**: Finding molecules by name, structure, or properties
- **Target information**: Retrieving data about proteins, enzymes, or biological targets
- **Bioactivity data**: Querying IC50, Ki, EC50, or other activity measurements
- **Drug information**: Looking up approved drugs, mechanisms, or indications
- **Structure searches**: Performing similarity or substructure searches
- **Cheminformatics**: Analyzing molecular properties and drug-likeness
- **Target-ligand relationships**: Exploring compound-target interactions
- **Drug discovery**: Identifying inhibitors, agonists, or bioactive molecules
## Installation and Setup
### Python Client
The ChEMBL Python client is required for programmatic access:
```bash
uv pip install chembl_webresource_client
```
### Basic Usage Pattern
```python
from chembl_webresource_client.new_client import new_client
# Access different endpoints
molecule = new_client.molecule
target = new_client.target
activity = new_client.activity
drug = new_client.drug
```
## Core Capabilities
### 1. Molecule Queries
**Retrieve by ChEMBL ID:**
```python
molecule = new_client.molecule
aspirin = molecule.get('CHEMBL25')
```
**Search by name:**
```python
results = molecule.filter(pref_name__icontains='aspirin')
```
**Filter by properties:**
```python
# Find small molecules (MW <= 500) with favorable LogP
results = molecule.filter(
    molecule_properties__mw_freebase__lte=500,
    molecule_properties__alogp__lte=5
)
```
### 2. Target Queries
**Retrieve target information:**
```python
target = new_client.target
egfr = target.get('CHEMBL203')
```
**Search for specific target types:**
```python
# Find all kinase targets
kinases = target.filter(
    target_type='SINGLE PROTEIN',
    pref_name__icontains='kinase'
)
```
### 3. Bioactivity Data
**Query activities for a target:**
```python
activity = new_client.activity
# Find potent EGFR inhibitors
results = activity.filter(
    target_chembl_id='CHEMBL203',
    standard_type='IC50',
    standard_value__lte=100,
    standard_units='nM'
)
```
**Get all activities for a compound:**
```python
compound_activities = activity.filter(
    molecule_chembl_id='CHEMBL25',
    pchembl_value__isnull=False
)
```
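`pchembl_value` is the negative log10 of a molar activity value, so an IC50 of 100 nM corresponds to 7. A minimal sketch of that conversion for `standard_value` fields reported in nM follows. The helper name `p_activity_from_nm` is hypothetical, and ChEMBL only populates `pchembl_value` for certain record types, so treat this as an illustration rather than a replacement for the stored field.

```python
import math


def p_activity_from_nm(value_nm):
    """Convert an activity in nM (number or numeric string) to -log10(molar)."""
    value = float(value_nm)
    if value <= 0:
        raise ValueError("activity must be positive")
    return 9.0 - math.log10(value)  # 1 nM = 1e-9 M


print(p_activity_from_nm("100"))  # 7.0
```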
### 4. Structure-Based Searches
**Similarity search:**
```python
similarity = new_client.similarity
# Find compounds similar to aspirin
similar = similarity.filter(
    smiles='CC(=O)Oc1ccccc1C(=O)O',
    similarity=85  # 85% similarity threshold
)
```
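To build intuition for the threshold, the sketch below computes a Tanimoto coefficient on two toy feature sets. It assumes the `similarity` value is a Tanimoto score expressed as a percentage. The integer sets stand in for fingerprint bits and are hypothetical; the server computes real fingerprints from the SMILES.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by all distinct features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical fingerprint bits for a query and a candidate compound
query_bits = {1, 4, 7, 9, 12}
candidate_bits = {1, 4, 7, 9, 15}
score = tanimoto(query_bits, candidate_bits)
print(round(score * 100))  # 67, which would fall below an 85% threshold
```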
**Substructure search:**
```python
substructure = new_client.substructure
# Find compounds containing benzene ring
results = substructure.filter(smiles='c1ccccc1')
```
### 5. Drug Information
**Retrieve drug data:**
```python
drug = new_client.drug
drug_info = drug.get('CHEMBL25')
```
**Get mechanisms of action:**
```python
mechanism = new_client.mechanism
mechanisms = mechanism.filter(molecule_chembl_id='CHEMBL25')
```
**Query drug indications:**
```python
drug_indication = new_client.drug_indication
indications = drug_indication.filter(molecule_chembl_id='CHEMBL25')
```
## Query Workflows
### Workflow 1: Finding Inhibitors for a Target
1. **Identify the target** by name, and check that the first hit is the intended one:
```python
targets = new_client.target.filter(pref_name__icontains='EGFR')
target_id = targets[0]['target_chembl_id']
```
2. **Query bioactivity data** for that target:
```python
activities = new_client.activity.filter(
    target_chembl_id=target_id,
    standard_type='IC50',
    standard_value__lte=100,
    standard_units='nM'  # without a unit filter, "100" is ambiguous
)
```
3. **Extract compound IDs** and retrieve details:
```python
# Deduplicate first: one compound can have many activity records
compound_ids = sorted({act['molecule_chembl_id'] for act in activities})
compounds = [new_client.molecule.get(cid) for cid in compound_ids]
```
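Because a compound often has several measurements against the same target, it can help to keep only the most potent value per compound before ranking. A minimal sketch on plain dictionaries, assuming every record shares the same `standard_type` and units and that `standard_value` may be a string or `None`. The helper name `most_potent_per_compound` and the sample IDs are hypothetical.

```python
def most_potent_per_compound(records):
    """Return (compound_id, lowest_value) pairs, most potent first."""
    best = {}
    for rec in records:
        raw = rec.get("standard_value")
        if raw is None:
            continue  # skip records without a numeric measurement
        value = float(raw)
        cid = rec["molecule_chembl_id"]
        if cid not in best or value < best[cid]:
            best[cid] = value
    return sorted(best.items(), key=lambda item: item[1])


# Hypothetical records, shaped like items returned by activity.filter(...)
sample_records = [
    {"molecule_chembl_id": "CHEMBL_A", "standard_value": "12.0"},
    {"molecule_chembl_id": "CHEMBL_B", "standard_value": "3.5"},
    {"molecule_chembl_id": "CHEMBL_A", "standard_value": "8.0"},
    {"molecule_chembl_id": "CHEMBL_C", "standard_value": None},
]
ranked = most_potent_per_compound(sample_records)
print(ranked)  # [('CHEMBL_B', 3.5), ('CHEMBL_A', 8.0)]
```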
### Workflow 2: Analyzing a Known Drug
1. **Get drug information**:
```python
drug_info = new_client.drug.get('CHEMBL1234')
```
2. **Retrieve mechanisms**:
```python
mechanisms = new_client.mechanism.filter(molecule_chembl_id='CHEMBL1234')
```
3. **Find all bioactivities**:
```python
activities = new_client.activity.filter(molecule_chembl_id='CHEMBL1234')
```
### Workflow 3: Structure-Activity Relationship (SAR) Study
1. **Find similar compounds**:
```python
similar = new_client.similarity.filter(smiles='query_smiles', similarity=80)
```
2. **Get activities for each compound**:
```python
for compound in similar:
    activities = new_client.activity.filter(
        molecule_chembl_id=compound['molecule_chembl_id']
    )
```
3. **Analyze property-activity relationships** using molecular properties from results.
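One common way to relate properties to activity is lipophilic ligand efficiency, defined as potency minus LogP. The sketch below ranks compounds by that metric. It assumes each row already pairs a `pchembl_value` with an `alogp` taken from `molecule_properties`. The row layout, compound IDs, and helper name are hypothetical.

```python
def lipophilic_efficiency(pchembl, alogp):
    """Lipophilic ligand efficiency: potency (pChEMBL) minus lipophilicity."""
    return float(pchembl) - float(alogp)


# Hypothetical rows joining activity and property data for three analogues
rows = [
    {"id": "CPD_1", "pchembl_value": "7.2", "alogp": "3.1"},
    {"id": "CPD_2", "pchembl_value": "8.0", "alogp": "5.5"},
    {"id": "CPD_3", "pchembl_value": "6.5", "alogp": "1.9"},
]
ranked = sorted(
    rows,
    key=lambda r: lipophilic_efficiency(r["pchembl_value"], r["alogp"]),
    reverse=True,
)
print([r["id"] for r in ranked])  # ['CPD_3', 'CPD_1', 'CPD_2']
```

Here the most potent compound (CPD_2) ranks last because its potency comes with high lipophilicity.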
## Filter Operators
ChEMBL supports Django-style query filters:
- `__exact` - Exact match
- `__iexact` - Case-insensitive exact match
- `__contains` / `__icontains` - Substring matching
- `__startswith` / `__endswith` - Prefix/suffix matching
- `__gt`, `__gte`, `__lt`, `__lte` - Numeric comparisons
- `__range` - Value in range
- `__in` - Value in list
- `__isnull` - Null/not null check
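Each operator is appended to a field name with a double underscore and passed as a keyword argument. The sketch below assembles such keyword arguments in a plain dictionary so they can be built up conditionally. The helper name `build_lookup` is hypothetical and not part of the client.

```python
def build_lookup(field, operator=None, value=None):
    """Return a one-item dict such as {'standard_value__lte': 100}."""
    key = field if operator is None else f"{field}__{operator}"
    return {key: value}


filters = {}
filters.update(build_lookup("target_chembl_id", value="CHEMBL203"))
filters.update(build_lookup("standard_type", "in", ["IC50", "Ki"]))
filters.update(build_lookup("standard_value", "lte", 100))
filters.update(build_lookup("pchembl_value", "isnull", False))
print(filters)
```

The resulting dictionary could then be unpacked into a query, for example `new_client.activity.filter(**filters)`.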
## Data Export and Analysis
Convert results to pandas DataFrame for analysis:
```python
import pandas as pd
activities = new_client.activity.filter(target_chembl_id='CHEMBL203')
df = pd.DataFrame(list(activities))
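# Values may arrive as strings; convert before computing statistics
df['standard_value'] = pd.to_numeric(df['standard_value'], errors='coerce')
print(df['standard_value'].describe())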
print(df.groupby('standard_type').size())
```
## Performance Optimization
### Caching
The client automatically caches results for 24 hours. Configure caching:
```python
from chembl_webresource_client.settings import Settings
```