dspy
DSPy is a framework for building complex AI systems through declarative language model programming, enabling automatic prompt optimization and retrieval-augmented generation without manual engineering. Use this skill when constructing modular AI pipelines, agents, or classifiers that require systematic output improvement, particularly for question answering, reasoning tasks, and multi-component workflows where maintainable, data-driven optimization of language model behavior is needed.
git clone --depth 1 https://github.com/NousResearch/hermes-agent /tmp/dspy && cp -r /tmp/dspy/optional-skills/mlops/research/dspy ~/.claude/skills/dspySKILL.md
# DSPy: Declarative Language Model Programming
## When to Use This Skill
Use DSPy when you need to:
- **Build complex AI systems** with multiple components and workflows
- **Program LMs declaratively** instead of manual prompt engineering
- **Optimize prompts automatically** using data-driven methods
- **Create modular AI pipelines** that are maintainable and portable
- **Improve model outputs systematically** with optimizers
- **Build RAG systems, agents, or classifiers** with better reliability
**GitHub Stars**: 22,000+ | **Created By**: Stanford NLP
## Installation
```bash
# Stable release
pip install dspy
# Latest development version
pip install git+https://github.com/stanfordnlp/dspy.git
# With specific LM providers
pip install dspy[openai] # OpenAI
pip install dspy[anthropic] # Anthropic Claude
pip install dspy[all] # All providers
```
## Quick Start
### Basic Example: Question Answering
```python
import dspy
# Configure your language model
lm = dspy.Claude(model="claude-sonnet-4-5-20250929")
dspy.settings.configure(lm=lm)
# Define a signature (input → output)
class QA(dspy.Signature):
"""Answer questions with short factual answers."""
question = dspy.InputField()
answer = dspy.OutputField(desc="often between 1 and 5 words")
# Create a module
qa = dspy.Predict(QA)
# Use it
response = qa(question="What is the capital of France?")
print(response.answer) # "Paris"
```
### Chain of Thought Reasoning
```python
import dspy
lm = dspy.Claude(model="claude-sonnet-4-5-20250929")
dspy.settings.configure(lm=lm)
# Use ChainOfThought for better reasoning
class MathProblem(dspy.Signature):
"""Solve math word problems."""
problem = dspy.InputField()
answer = dspy.OutputField(desc="numerical answer")
# ChainOfThought generates reasoning steps automatically
cot = dspy.ChainOfThought(MathProblem)
response = cot(problem="If John has 5 apples and gives 2 to Mary, how many does he have?")
print(response.rationale) # Shows reasoning steps
print(response.answer) # "3"
```
## Core Concepts
### 1. Signatures
Signatures define the structure of your AI task (inputs → outputs):
```python
# Inline signature (simple)
qa = dspy.Predict("question -> answer")
# Class signature (detailed)
class Summarize(dspy.Signature):
"""Summarize text into key points."""
text = dspy.InputField()
summary = dspy.OutputField(desc="bullet points, 3-5 items")
summarizer = dspy.ChainOfThought(Summarize)
```
**When to use each:**
- **Inline**: Quick prototyping, simple tasks
- **Class**: Complex tasks, type hints, better documentation
### 2. Modules
Modules are reusable components that transform inputs to outputs:
#### dspy.Predict
Basic prediction module:
```python
predictor = dspy.Predict("context, question -> answer")
result = predictor(context="Paris is the capital of France",
question="What is the capital?")
```
#### dspy.ChainOfThought
Generates reasoning steps before answering:
```python
cot = dspy.ChainOfThought("question -> answer")
result = cot(question="Why is the sky blue?")
print(result.rationale) # Reasoning steps
print(result.answer) # Final answer
```
#### dspy.ReAct
Agent-like reasoning with tools:
```python
from dspy.predict import ReAct
class SearchQA(dspy.Signature):
"""Answer questions using search."""
question = dspy.InputField()
answer = dspy.OutputField()
def search_tool(query: str) -> str:
"""Search Wikipedia."""
# Your search implementation
return results
react = ReAct(SearchQA, tools=[search_tool])
result = react(question="When was Python created?")
```
#### dspy.ProgramOfThought
Generates and executes code for reasoning:
```python
pot = dspy.ProgramOfThought("question -> answer")
result = pot(question="What is 15% of 240?")
# Generates: answer = 240 * 0.15
```
### 3. Optimizers
Optimizers improve your modules automatically using training data:
#### BootstrapFewShot
Learns from examples:
```python
from dspy.teleprompt import BootstrapFewShot
# Training data
trainset = [
dspy.Example(question="What is 2+2?", answer="4").with_inputs("question"),
dspy.Example(question="What is 3+5?", answer="8").with_inputs("question"),
]
# Define metric
def validate_answer(example, pred, trace=None):
return example.answer == pred.answer
# Optimize
optimizer = BootstrapFewShot(metric=validate_answer, max_bootstrapped_demos=3)
optimized_qa = optimizer.compile(qa, trainset=trainset)
# Now optimized_qa performs better!
```
#### MIPRO (Most Important Prompt Optimization)
Iteratively improves prompts:
```python
from dspy.teleprompt import MIPRO
optimizer = MIPRO(
metric=validate_answer,
num_candidates=10,
init_temperature=1.0
)
optimized_cot = optimizer.compile(
cot,
trainset=trainset,
num_trials=100
)
```
#### BootstrapFinetune
Creates datasets for model fine-tuning:
```python
from dspy.teleprompt import BootstrapFinetune
optimizer = BootstrapFinetune(metric=validate_answer)
optimized_module = optimizer.compile(qa, trainset=trainset)
# Exports training data for fine-tuning
```
### 4. Building Complex Systems
#### Multi-Stage Pipeline
```python
import dspy
class MultiHopQA(dspy.Module):
def __init__(self):
super().__init__()
self.retrieve = dspy.Retrieve(k=3)
self.generate_query = dspy.ChainOfThought("question -> search_query")
self.generate_answer = dspy.ChainOfThought("context, question -> answer")
def forward(self, question):
# Stage 1: Generate search query
search_query = self.generate_query(question=question).search_query
# Stage 2: Retrieve context
passages = self.retrieve(search_query).passages
context = "\n".join(passages)
# Stage 3: Generate answer
answer = self.generate_answer(context=context, question=question).answer
return dspy.Prediction(answer=answer, context=context)
# Use the pipeline
qa_system = MultiHopQAOperate the Antigravity CLI (agy): plugins, auth, sandbox.
Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. Requires the blackbox CLI and a Blackbox AI API key.
Delegate coding to xAI Grok Build CLI (features, PRs).
Configure and use Honcho memory with Hermes -- cross-session user modeling, multi-profile peer isolation, observation config, dialectic reasoning, session summaries, and context budget enforcement. Use when setting up Honcho, troubleshooting memory, managing profiles with Honcho peers, or tuning observation, recall, and dialectic settings.
Delegate coding to OpenHands CLI (model-agnostic, LiteLLM).
Read-only EVM client: wallets, tokens, gas across 8 chains.
Hyperliquid market data, account history, trade review.
Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required.