
dspy-optimize-anything

This Claude Code skill provides a universal optimization framework for any text-based artifact (code, prompts, configurations, vector graphics) using GEPA's reflective evolutionary search. Use it for discrete optimization problems where you can compute a quality score and supply diagnostic feedback to guide the search, whether you are solving a single hard problem such as algorithm discovery, a batch of related tasks, or a task that must generalize to unseen inputs.

Repository: dspy-skills

Install in Claude Code

git clone --depth 1 https://github.com/OmidZamani/dspy-skills /tmp/dspy-optimize-anything && cp -r /tmp/dspy-optimize-anything/skills/dspy-optimize-anything ~/.claude/skills/dspy-optimize-anything

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# GEPA optimize_anything

## Goal

Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.

## When to Use

- **Beyond prompt optimization** — optimizing code, configs, SVGs, scheduling policies, etc.
- **Single hard problems** — circle packing, kernel generation, algorithm discovery
- **Batch related problems** — CUDA kernels, code generation tasks with cross-transfer
- **Generalization** — agent skills, policies, or prompts that must transfer to unseen inputs
- When you can **express quality as a score** and provide **diagnostic feedback** (Actionable Side Information, ASI)

## Inputs

| Input | Type | Description |
|-------|------|-------------|
| `seed_candidate` | `str \| dict[str, str] \| None` | Starting artifact text, or `None` for seedless mode |
| `evaluator` | `Callable` | Returns a score (higher is better), optionally with an ASI dict |
| `dataset` | `list \| None` | Training examples (for multi-task and generalization modes) |
| `valset` | `list \| None` | Validation set (for generalization mode) |
| `objective` | `str \| None` | Natural language description of what to optimize for |
| `background` | `str \| None` | Domain knowledge and constraints |
| `config` | `GEPAConfig \| None` | Engine, reflection, and tracking settings |

## Outputs

| Output | Type | Description |
|--------|------|-------------|
| `result.best_candidate` | `str \| dict` | Best optimized artifact |

## Workflow

### Phase 1: Install

```bash
pip install -U "gepa>=0.1.1,<0.2"
```

### Phase 2: Define Evaluator with ASI

The evaluator scores a candidate and returns Actionable Side Information (ASI) — diagnostic feedback that guides the LLM proposer during reflection.

**Simple evaluator (score only):**

```python
import gepa.optimize_anything as oa
from gepa.optimize_anything import EngineConfig, GEPAConfig

config = GEPAConfig(engine=EngineConfig(max_metric_calls=100))

def evaluate(candidate: str) -> float:
    # run_my_system is a placeholder for your own scoring code.
    score, diagnostic = run_my_system(candidate)
    oa.log(f"Error: {diagnostic}")  # captured as ASI
    return score
```

**Rich evaluator (score + structured ASI):**

```python
def evaluate(candidate: str) -> tuple[float, dict]:
    # execute_code is a placeholder for your own (ideally sandboxed) runner.
    result = execute_code(candidate)
    return result.score, {
        "Error": result.stderr,
        "Output": result.stdout,
        "Runtime": f"{result.time_ms:.1f}ms",
    }
```
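The `execute_code` helper used above is not defined by this skill. A minimal sketch of one possible version follows, assuming the candidate is a standalone Python script whose last line of stdout is a numeric score. That convention, the `ExecResult` container, and the `timeout_s` parameter are all hypothetical and not part of GEPA.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class ExecResult:
    """Hypothetical result container with the fields the evaluator reads."""
    score: float
    stdout: str
    stderr: str
    time_ms: float


def execute_code(candidate: str, timeout_s: float = 30.0) -> ExecResult:
    """Run candidate source in a fresh interpreter and collect diagnostics."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            [sys.executable, "-c", candidate],
            capture_output=True, text=True, timeout=timeout_s,
        )
        stdout, stderr, ok = proc.stdout, proc.stderr, proc.returncode == 0
    except subprocess.TimeoutExpired:
        stdout, stderr, ok = "", f"Timed out after {timeout_s}s", False
    time_ms = (time.perf_counter() - start) * 1000.0

    score = 0.0
    if ok:
        lines = stdout.strip().splitlines()
        try:
            # Assumed convention: the last stdout line is the score.
            score = float(lines[-1]) if lines else 0.0
        except ValueError:
            stderr += "\nLast stdout line was not a number."
    return ExecResult(score, stdout, stderr, time_ms)
```

Failures and timeouts score 0.0 but still populate `stderr`, so the proposer has something to reflect on. Running untrusted candidates this way offers no isolation; a real setup would likely add a sandbox.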

ASI can include open-ended text, structured data, multi-objectives (via `scores`), or images (via `gepa.Image`) for vision-capable LLMs.
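As an illustration of multi-objective ASI, the sketch below returns one aggregate score plus a `scores` entry with per-objective values. The two objectives, their weights, and the other ASI keys are invented for this example, and the exact shape GEPA expects under `scores` is an assumption to check against the library's documentation.

```python
def evaluate_summary(candidate: str) -> tuple[float, dict]:
    """Score a summary on two hypothetical objectives: coverage and brevity."""
    required = ["GEPA", "evaluator", "score"]
    found = [kw for kw in required if kw.lower() in candidate.lower()]
    coverage = len(found) / len(required)

    # Brevity is 1.0 at or under 30 words and decays linearly to 0.0 at 60.
    n_words = len(candidate.split())
    brevity = max(0.0, min(1.0, (60 - n_words) / 30))

    overall = 0.5 * coverage + 0.5 * brevity
    return overall, {
        "scores": {"coverage": coverage, "brevity": brevity},  # assumed key
        "Missing": [kw for kw in required if kw not in found],
        "WordCount": n_words,
    }
```

Reporting the objectives separately lets the feedback say which dimension is lagging instead of hiding it inside a single number.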

### Phase 3: Choose Optimization Mode

**Mode 1 — Single-Task Search:** Solve one hard problem. No dataset needed.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    config=config,
)
```

**Mode 2 — Multi-Task Search:** Solve a batch of related problems with cross-transfer.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=tasks,
    config=config,
)
```
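When a `dataset` is supplied, the production example in this document shows the evaluator receiving `(candidate, example)`. A small sketch of that shape follows, using a regex as the artifact under optimization. The task format and the ASI keys are hypothetical.

```python
import re

# Hypothetical task list: each item is passed to the evaluator as `example`.
tasks = [
    {"text": "2024-01-31", "should_match": True},
    {"text": "31/01/2024", "should_match": False},
    {"text": "2024-1-3", "should_match": False},
]


def evaluate(candidate: str, example: dict) -> tuple[float, dict]:
    """Score a regex candidate on one task and explain any mismatch."""
    try:
        pattern = re.compile(candidate)
    except re.error as exc:
        return 0.0, {"Error": f"Invalid regex: {exc}"}

    matched = pattern.fullmatch(example["text"]) is not None
    correct = matched == example["should_match"]
    return (1.0 if correct else 0.0), {
        "Input": example["text"],
        "Expected": example["should_match"],
        "Got": matched,
    }
```

Each task yields its own score and feedback, which is what allows a fix found on one task to be checked against the others.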

**Mode 3 — Generalization:** Build a skill/prompt/policy that transfers to unseen problems.

```python
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=train,
    valset=val,
    config=config,
)
```
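The `train` and `val` lists above need to be disjoint for the validation score to say anything about unseen inputs. One way to produce them is a seeded shuffle and split. The helper name `split_examples` and its defaults are assumptions for illustration, not part of GEPA.

```python
import random


def split_examples(
    examples: list, val_fraction: float = 0.3, seed: int = 0
) -> tuple[list, list]:
    """Shuffle a copy of the examples and split it into (train, val)."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(examples)  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


examples = [{"id": i} for i in range(10)]
train, val = split_examples(examples)
```

A fixed seed keeps the split reproducible across runs, so score changes come from the candidates and not from a different partition.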

**Seedless mode:** Describe what you need instead of providing a seed.

```python
result = oa.optimize_anything(
    evaluator=evaluate,
    objective="Generate a Python function `reverse()` that reverses a string.",
    config=config,
)
```
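Seedless mode still needs an evaluator that can judge whatever the proposer generates. For the `reverse()` objective above, a sketch could execute the candidate source and check a few cases. The test cases and ASI keys are hypothetical, and `exec` on generated code is unsafe outside a sandbox.

```python
def evaluate(candidate: str) -> tuple[float, dict]:
    """Run the candidate source and check `reverse()` on a few cases."""
    cases = [("abc", "cba"), ("", ""), ("racecar", "racecar"), ("ab", "ba")]
    namespace: dict = {}
    try:
        # exec runs arbitrary code: sandbox this in any real deployment.
        exec(candidate, namespace)
        reverse = namespace["reverse"]
    except Exception as exc:
        return 0.0, {"Error": f"{type(exc).__name__}: {exc}"}

    failures = []
    for arg, expected in cases:
        try:
            got = reverse(arg)
        except Exception as exc:
            got = f"raised {type(exc).__name__}: {exc}"
        if got != expected:
            failures.append({"input": arg, "expected": expected, "got": got})

    # Fraction of cases passed; failures are returned as ASI.
    score = 1.0 - len(failures) / len(cases)
    return score, {"Failures": failures}
```

Partial credit per case gives the search a gradient to follow, and the failure list tells the proposer exactly which inputs broke.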

### Phase 4: Use Results

```python
print(result.best_candidate)
```
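Since `result.best_candidate` is either a string or a dict of named text components, code that persists it has to handle both shapes. A minimal sketch follows. The helper `save_candidate`, the `.txt` extension, and the `candidate` file name are assumptions for illustration.

```python
from pathlib import Path


def save_candidate(best: "str | dict[str, str]", out_dir: str) -> list[Path]:
    """Write a str candidate to one file, or a dict candidate to one file per key."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Normalize both shapes to {name: text} so one loop handles them.
    parts = {"candidate": best} if isinstance(best, str) else best
    written = []
    for name, text in parts.items():
        path = out / f"{name}.txt"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

For example, `save_candidate(result.best_candidate, "out")` would write `out/svg_code.txt` for a candidate shaped like `{"svg_code": "..."}`.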

## Production Example

```python
import logging

import gepa.optimize_anything as oa
from gepa import Image
from gepa.optimize_anything import EngineConfig, GEPAConfig

logger = logging.getLogger(__name__)

# ---------- SVG optimization with VLM feedback ----------

GOAL = "a pelican riding a bicycle"
VLM = "vertex_ai/gemini-3-flash-preview"

VISUAL_ASPECTS = [
    {"id": "overall",     "criteria": f"Rate overall quality of this SVG ({GOAL}). SCORE: X/10"},
    {"id": "anatomy",     "criteria": "Rate pelican accuracy: beak, pouch, plumage. SCORE: X/10"},
    {"id": "bicycle",     "criteria": "Rate bicycle: wheels, frame, handlebars, pedals. SCORE: X/10"},
    {"id": "composition", "criteria": "Rate how convincingly the pelican rides the bicycle. SCORE: X/10"},
]

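def evaluate(candidate, example):
    """Render SVG, score with a VLM, return (score, ASI)."""
    # render_image and get_vlm_score_feedback are user-supplied helpers
    # (not defined here); render_image must return base64-encoded PNG data.
    image = render_image(candidate["svg_code"])  # via cairosvg
    score, feedback = get_vlm_score_feedback(VLM, image, example["criteria"])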

    return score, {
        "RenderedSVG": Image(base64_data=image, media_type="image/png"),
        "Feedback": feedback,
    }

result = oa.optimize_anything(
    seed_candidate={"svg_code": "<svg>...</svg>"},
    evaluator=evaluate,
    dataset=VISUAL_ASPECTS,
    background=f"Optimize SVG source code depicting '{GOAL}'. "
               "Improve anatomy, composition, and visual quality.",
    config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)

logger.info("Best SVG:\n%s", result.best_candidate["svg_code"])


# ---------- Code optimization (single-task) ----------

def evaluate_solver(candidate: str) -> tuple[float, dict]:
    """Evaluate a Python solver for a mathematical optimization problem."""
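    import json
    import subprocess
    import sys

    try:
        # sys.executable runs the candidate with the current interpreter.
        proc = subprocess.run(
            [sys.executable, "-c", candidate],
            capture_output=True, text=True, timeout=30,
        )
    except subprocess.TimeoutExpired:
        # Without this, a slow candidate raises and aborts the evaluation.
        oa.log("Timeout: candidate exceeded 30s")
        return 0.0, {"Error": "Timed out after 30s"}

    if proc.returncode != 0:
        oa.log(f"Runtime error: {proc.stderr}")
        return 0.0, {"Error": proc.stderr}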

    try:
        output = json.loads(proc.stdout)
        return output["score"], {
            "Output": output.get("solution"),
            "Runtime": f"{output.get('time_ms', 0):.1f}ms",
        }
    except (json.JSONDecodeError, KeyError) as e:
        oa.log(f"Parse error: {e}")
        return 0.0, {"Error": str(e), "Stdo