biorxiv-database
The bioRxiv Database skill provides Python-based search and retrieval tools for accessing preprints from the bioRxiv server by keywords, authors, date ranges, and subject categories. Use this skill when conducting literature reviews, tracking author publications, analyzing research trends, retrieving citation metadata, downloading full-text PDFs, or filtering papers by specific life sciences research areas.
git clone --depth 1 https://github.com/Microck/ordinary-claude-skills /tmp/biorxiv-database && cp -r /tmp/biorxiv-database/skills_all/biorxiv-database ~/.claude/skills/biorxiv-database
# bioRxiv Database
## Overview
This skill provides efficient Python-based tools for searching and retrieving preprints from the bioRxiv database. It enables comprehensive searches by keywords, authors, date ranges, and categories, returning structured JSON metadata that includes titles, abstracts, DOIs, and citation information. The skill also supports PDF downloads for full-text analysis.
## When to Use This Skill
Use this skill when:
- Searching for recent preprints in specific research areas
- Tracking publications by particular authors
- Conducting systematic literature reviews
- Analyzing research trends over time periods
- Retrieving metadata for citation management
- Downloading preprint PDFs for analysis
- Filtering papers by bioRxiv subject categories
## Core Search Capabilities
### 1. Keyword Search
Search for preprints containing specific keywords in titles, abstracts, or author lists.
**Basic Usage:**
```bash
python scripts/biorxiv_search.py \
  --keywords "CRISPR" "gene editing" \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --output results.json
```
**With Category Filter:**
```bash
python scripts/biorxiv_search.py \
  --keywords "neural networks" "deep learning" \
  --days-back 180 \
  --category neuroscience \
  --output recent_neuroscience.json
```
**Search Fields:**
By default, keywords are searched in both title and abstract. Customize with `--search-fields`:
```bash
python scripts/biorxiv_search.py \
  --keywords "AlphaFold" \
  --search-fields title \
  --days-back 365
```
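To make the effect of `--search-fields` concrete, here is a minimal sketch of field-restricted keyword matching over a result record. The helper `matches_keywords` is hypothetical and not part of the skill's script; case-insensitive matching and "any keyword matches" semantics are assumptions, since the script's actual matching rules are not described here.

```python
def matches_keywords(paper, keywords, search_fields=("title", "abstract")):
    """Return True if any keyword occurs in any of the selected fields.

    Hypothetical helper: case-insensitive substring matching with
    "any keyword" semantics is an assumption, not the script's documented behavior.
    """
    haystack = " ".join(str(paper.get(field, "")) for field in search_fields).lower()
    return any(keyword.lower() in haystack for keyword in keywords)


# Illustrative record using the field names from the skill's JSON output
paper = {
    "title": "AlphaFold-guided design of a binder",
    "abstract": "We describe a CRISPR screen ...",
}

# Restricting the search to the title ignores keywords that occur only in the abstract
print(matches_keywords(paper, ["AlphaFold"], search_fields=("title",)))  # True
print(matches_keywords(paper, ["CRISPR"], search_fields=("title",)))     # False
```

The same function can be applied to a saved results file to narrow a broad search after the fact without querying bioRxiv again.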
### 2. Author Search
Find all papers by a specific author within a date range.
**Basic Usage:**
```bash
python scripts/biorxiv_search.py \
  --author "Smith" \
  --start-date 2023-01-01 \
  --end-date 2024-12-31 \
  --output smith_papers.json
```
**Recent Publications:**
```bash
# Last year by default if no dates specified
python scripts/biorxiv_search.py \
  --author "Johnson" \
  --output johnson_recent.json
```
### 3. Date Range Search
Retrieve all preprints posted within a specific date range.
**Basic Usage:**
```bash
python scripts/biorxiv_search.py \
  --start-date 2024-01-01 \
  --end-date 2024-01-31 \
  --output january_2024.json
```
**With Category Filter:**
```bash
python scripts/biorxiv_search.py \
  --start-date 2024-06-01 \
  --end-date 2024-06-30 \
  --category genomics \
  --output genomics_june.json
```
**Days Back Shortcut:**
```bash
# Last 30 days
python scripts/biorxiv_search.py \
  --days-back 30 \
  --output last_month.json
```
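A days-back shortcut of this kind amounts to computing an explicit start and end date. The sketch below shows one plausible way to do that; `resolve_date_range` is a hypothetical helper, and treating "today" as the end date is an assumption about how the script interprets `--days-back`.

```python
from datetime import date, timedelta


def resolve_date_range(days_back, today=None):
    """Translate a days-back count into (start_date, end_date) ISO strings.

    Hypothetical helper: assumes the range ends on the current day.
    """
    today = today or date.today()
    start = today - timedelta(days=days_back)
    return start.isoformat(), today.isoformat()


# A fixed "today" keeps the example reproducible
print(resolve_date_range(30, today=date(2024, 7, 1)))
# ('2024-06-01', '2024-07-01')
```

The resulting strings use the same `YYYY-MM-DD` format as `--start-date` and `--end-date`.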
### 4. Paper Details by DOI
Retrieve detailed metadata for a specific preprint.
**Basic Usage:**
```bash
python scripts/biorxiv_search.py \
  --doi "10.1101/2024.01.15.123456" \
  --output paper_details.json
```
**Full DOI URLs Accepted:**
```bash
python scripts/biorxiv_search.py \
  --doi "https://doi.org/10.1101/2024.01.15.123456"
```
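Accepting both bare DOIs and full DOI URLs implies some normalization step. A minimal sketch of that idea follows; `normalize_doi` is a hypothetical helper and the list of recognized prefixes is an assumption, not the script's documented behavior.

```python
def normalize_doi(value):
    """Strip a leading resolver URL or 'doi:' prefix, leaving the bare DOI.

    Hypothetical helper: the set of prefixes handled here is an assumption.
    """
    doi = value.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi


print(normalize_doi("https://doi.org/10.1101/2024.01.15.123456"))
# 10.1101/2024.01.15.123456
```

Normalizing DOIs before comparing or de-duplicating records avoids treating the URL form and the bare form as different papers.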
### 5. PDF Downloads
Download the full-text PDF of any preprint.
**Basic Usage:**
```bash
python scripts/biorxiv_search.py \
  --doi "10.1101/2024.01.15.123456" \
  --download-pdf paper.pdf
```
**Batch Processing:**
For multiple PDFs, extract DOIs from a search result JSON and download each paper:
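```python
import json
import os

# Assumes scripts/ is on the import path (for example, when run from inside scripts/)
from biorxiv_search import BioRxivSearcher

# Load search results
with open('results.json') as f:
    data = json.load(f)

searcher = BioRxivSearcher(verbose=True)

# Make sure the output directory exists before writing PDFs
os.makedirs('papers', exist_ok=True)

# Download each paper
for i, paper in enumerate(data['results'][:10]):  # First 10 papers
    doi = paper['doi']
    searcher.download_pdf(doi, f"papers/paper_{i+1}.pdf")
```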
## Valid Categories
Filter searches by bioRxiv subject categories:
- `animal-behavior-and-cognition`
- `biochemistry`
- `bioengineering`
- `bioinformatics`
- `biophysics`
- `cancer-biology`
- `cell-biology`
- `clinical-trials`
- `developmental-biology`
- `ecology`
- `epidemiology`
- `evolutionary-biology`
- `genetics`
- `genomics`
- `immunology`
- `microbiology`
- `molecular-biology`
- `neuroscience`
- `paleontology`
- `pathology`
- `pharmacology-and-toxicology`
- `physiology`
- `plant-biology`
- `scientific-communication-and-education`
- `synthetic-biology`
- `systems-biology`
- `zoology`
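
Category names are hyphenated and lowercase, so a human-readable label such as "Cancer Biology" needs to be converted before being passed to `--category`. The sketch below shows one way to validate and normalize a label against the list above; `normalize_category` is a hypothetical helper and is not part of the skill's script.

```python
# Category slugs copied from the list above
VALID_CATEGORIES = {
    "animal-behavior-and-cognition", "biochemistry", "bioengineering",
    "bioinformatics", "biophysics", "cancer-biology", "cell-biology",
    "clinical-trials", "developmental-biology", "ecology", "epidemiology",
    "evolutionary-biology", "genetics", "genomics", "immunology",
    "microbiology", "molecular-biology", "neuroscience", "paleontology",
    "pathology", "pharmacology-and-toxicology", "physiology", "plant-biology",
    "scientific-communication-and-education", "synthetic-biology",
    "systems-biology", "zoology",
}


def normalize_category(name):
    """Convert a label to its hyphenated slug, or raise ValueError if unknown.

    Hypothetical helper for pre-validating input before running a search.
    """
    slug = name.strip().lower().replace("_", "-").replace(" ", "-")
    if slug not in VALID_CATEGORIES:
        raise ValueError(f"Unknown bioRxiv category: {name!r}")
    return slug


print(normalize_category("Cancer Biology"))  # cancer-biology
```

Validating the category up front catches typos before a long-running search is started.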
## Output Format
All searches return structured JSON with the following format:
```json
{
"query": {
"keywords": ["CRISPR"],
"start_date": "2024-01-01",
"end_date": "2024-12-31",
"category": "genomics"
},
"result_count": 42,
"results": [
{
"doi": "10.1101/2024.01.15.123456",
"title": "Paper Title Here",
"authors": "Smith J, Doe J, Johnson A",
"author_corresponding": "Smith J",
"author_corresponding_institution": "University Example",
"date": "2024-01-15",
"version": "1",
"type": "new results",
"license": "cc_by",
"category": "genomics",
"abstract": "Full abstract text...",
"pdf_url": "https://www.biorxiv.org/content/10.1101/2024.01.15.123456v1.full.pdf",
"html_url": "https://www.biorxiv.org/content/10.1101/2024.01.15.123456v1",
"jatsxml": "https://www.biorxiv.org/content/...",
"published": ""
}
]
}
```
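Because every search returns the same structure, downstream analysis can be written once and reused. As an illustration, the sketch below counts papers per month (for trend analysis) and builds a simple citation string from a record. Both helpers are hypothetical, the sample records are made up for the example, and the citation layout is just one possible format.

```python
from collections import Counter


def papers_per_month(data):
    """Count results by posting month, using the YYYY-MM prefix of 'date'."""
    return Counter(paper["date"][:7] for paper in data["results"])


def format_citation(paper):
    """Build a plain-text citation from one result record (illustrative layout)."""
    year = paper["date"][:4]
    return (
        f'{paper["authors"]}. {paper["title"]}. '
        f'bioRxiv ({year}). https://doi.org/{paper["doi"]}'
    )


# Made-up sample in the output format shown above (only the fields used here)
sample = {
    "result_count": 3,
    "results": [
        {"doi": "10.1101/2024.01.15.123456", "title": "Paper A",
         "authors": "Smith J, Doe J", "date": "2024-01-15"},
        {"doi": "10.1101/2024.01.28.654321", "title": "Paper B",
         "authors": "Johnson A", "date": "2024-01-28"},
        {"doi": "10.1101/2024.03.02.111111", "title": "Paper C",
         "authors": "Doe J", "date": "2024-03-02"},
    ],
}

print(papers_per_month(sample))
print(format_citation(sample["results"][0]))
```

In practice `sample` would be replaced by the dictionary loaded from a saved results file with `json.load`.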
## Common Usage Patterns
### Literature Review Workflow
1. **Broad keyword search:**
```bash
python scripts/biorxiv_search.py \
  --keywords "organoids" "tissue engineering" \
  --start-date 2023-01-01 \
  --end-date 2024-12-31 \
  --category bioengineering \
  --output organoid_papers.json
```
2. **Extract and review results:**
```python
import json

with open('organoid_papers.json') as f:
    data = json.load(f)

print(f"Found {data['result_count']} papers")

# Show the first five results
for paper in data['results'][:5]:
    print(f"\nTitle: {paper['title']}")
    print(f"Authors: {paper['authors']}")
    print(f"Date: {paper['date']}")
    print(f"DOI: {paper['doi']}")
```
3. **Download selected papers:**
```python
from biorxiv_search import BioRxivSearcher

searcher = BioRxivSearcher()
selected_dois = ["10.1101/2024.01.15.123456", "10.1101/2024.02.20.789012"]

for doi in selected_dois:
    # DOIs contain "/", which is not valid in a file name
    filename = doi.replace("/", "_")
    # Assumed completion: the original example is cut off after the line above.
    # This call mirrors the download_pdf usage in the batch-processing example.
    searcher.download_pdf(doi, f"papers/{filename}.pdf")
```