imagegen

imagegen is a Claude Code skill that generates new images or edits existing ones through BlockRun's image API. Use it when a user asks to create, draw, or modify an image. It supports several models, from budget options such as Grok Imagine at $0.02 to the high-resolution Banana Pro at $0.10–$0.15, and can be triggered by slash command or by a natural-language request.

Repository: ClawRouter

Install in Claude Code

git clone --depth 1 https://github.com/BlockRunAI/ClawRouter /tmp/imagegen && cp -r /tmp/imagegen/skills/imagegen ~/.claude/skills/imagegen

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Image Generation & Editing

Generate or edit images through ClawRouter. Payment is automatic via x402.

**Shortcuts:**

- Slash: `/cr-imagegen <prompt> [--model=<alias>] [--size=1024x1024] [--n=1]` (`/imagegen` still accepted in chat for backward compatibility)
- Partner tool: `blockrun_image_generation` (LLM-callable) / `blockrun_image_edit` (inpainting)
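
As a rough illustration of the slash syntax above, the following sketch splits a command line into a prompt and its optional flags. The function name `parse_imagegen_command` is hypothetical, and the default model is an assumption taken from the "Choosing a model" list; ClawRouter's own parser may behave differently.

```python
def parse_imagegen_command(line: str) -> dict:
    """Parse '/cr-imagegen <prompt> [--model=<alias>] [--size=WxH] [--n=N]'.

    Hypothetical helper for illustration only, not ClawRouter's parser.
    """
    tokens = line.split()
    if not tokens or tokens[0] not in ("/cr-imagegen", "/imagegen"):
        raise ValueError("not an imagegen command")

    # Defaults: size and n come from the shortcut line; the model default
    # is assumed from the skill's model-selection guidance.
    options = {"model": "nano-banana", "size": "1024x1024", "n": 1}
    prompt_words = []

    for token in tokens[1:]:
        if token.startswith("--") and "=" in token:
            key, value = token[2:].split("=", 1)
            if key not in options:
                raise ValueError(f"unknown flag: --{key}")
            options[key] = int(value) if key == "n" else value
        else:
            # Anything that is not a recognised flag is part of the prompt.
            prompt_words.append(token)

    if not prompt_words:
        raise ValueError("prompt is required")
    return {"prompt": " ".join(prompt_words), **options}
```

Flags may appear anywhere after the command name; every other token is joined back into the prompt in its original order.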

---

## Generate an Image

POST to `http://localhost:8402/v1/images/generations`:

```json
{
  "model": "google/nano-banana",
  "prompt": "a golden retriever surfing on a wave",
  "size": "1024x1024",
  "n": 1
}
```

Response:

```json
{
  "created": 1741460000,
  "data": [{ "url": "http://localhost:8402/images/abc123.png" }]
}
```

Display inline: `![generated image](http://localhost:8402/images/abc123.png)`
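
A minimal Python sketch of the request, response, and display flow above, using only the standard library. The helper names are hypothetical, and the `Content-Type` header is an assumption (the skill only shows the JSON body).

```python
import json
import urllib.request

GENERATIONS_URL = "http://localhost:8402/v1/images/generations"


def build_generation_request(prompt, model="google/nano-banana",
                             size="1024x1024", n=1):
    """Build the POST request with the JSON body shown above."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "size": size, "n": n}
    ).encode("utf-8")
    return urllib.request.Request(
        GENERATIONS_URL,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def to_markdown(response, alt="generated image"):
    """Turn each returned URL into an inline Markdown image."""
    return [f"![{alt}]({item['url']})" for item in response["data"]]


def generate(prompt, **kwargs):
    """Send the request to the local proxy and return Markdown image lines."""
    request = build_generation_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as resp:
        return to_markdown(json.load(resp))
```

Calling `generate("a golden retriever surfing on a wave")` needs the local proxy to be running; the two builder functions can be exercised without it.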

### Model Selection

| Alias              | Full ID                      | Price        | Sizes                           | Best for                                                         |
| ------------------ | ---------------------------- | ------------ | ------------------------------- | ---------------------------------------------------------------- |
| `nano-banana`      | `google/nano-banana`         | $0.05        | 1024×1024, 1216×832, 1024×1792  | Default — fast, cheap, good quality                              |
| `banana-pro`       | `google/nano-banana-pro`     | $0.10–$0.15  | up to 4096×4096                 | High-res, large format                                           |
| `dalle`            | `openai/dall-e-3`            | $0.04–$0.08  | 1024×1024, 1792×1024, 1024×1792 | Photorealistic, complex scenes                                   |
| `gpt-image`        | `openai/gpt-image-1`         | $0.02–$0.04  | 1024×1024, 1536×1024, 1024×1536 | Budget option; supports editing                                  |
| —                  | `openai/gpt-image-2`         | $0.06–$0.12  | 1024×1024, 1536×1024, 1024×1536 | Reasoning-driven, text rendering (slow — proxy polls up to 5min) |
| `flux`             | `black-forest/flux-1.1-pro`  | $0.04        | 1024×1024, 1216×832, 832×1216   | Artistic styles, fewer restrictions                              |
| `grok-imagine`     | `xai/grok-imagine-image`     | $0.02        | 1024×1024                       | xAI Grok image style                                             |
| `grok-imagine-pro` | `xai/grok-imagine-image-pro` | $0.07        | 1024×1024                       | Grok high-quality                                                |
| `cogview`          | `zai/cogview-4`              | $0.015–$0.02 | 512×512 to 1440×1440            | Cheapest — Zhipu CogView                                         |
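
The alias column can be read as a simple lookup. The sketch below mirrors the table in a dictionary and resolves either an alias or a full ID; `resolve_model` is a hypothetical helper, and how ClawRouter resolves aliases internally is not described here.

```python
# Alias -> full model ID, copied from the table above.
MODEL_ALIASES = {
    "nano-banana": "google/nano-banana",
    "banana-pro": "google/nano-banana-pro",
    "dalle": "openai/dall-e-3",
    "gpt-image": "openai/gpt-image-1",
    "flux": "black-forest/flux-1.1-pro",
    "grok-imagine": "xai/grok-imagine-image",
    "grok-imagine-pro": "xai/grok-imagine-image-pro",
    "cogview": "zai/cogview-4",
}

# openai/gpt-image-2 is listed without an alias, so it is only valid as a full ID.
FULL_IDS = set(MODEL_ALIASES.values()) | {"openai/gpt-image-2"}


def resolve_model(name: str) -> str:
    """Return the full model ID for an alias, or pass a known full ID through."""
    if name in MODEL_ALIASES:
        return MODEL_ALIASES[name]
    if name in FULL_IDS:
        return name
    raise ValueError(f"unknown model: {name}")
```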

**Choosing a model:**

- Default → `nano-banana`
- "high res" / "large" → `banana-pro`
- "photorealistic" / "dall-e" → `dalle`
- "budget" / "cheap" → `cogview`
- "editable" / "inpainting" → `gpt-image` (only edit-capable model)
- "artistic" / flexible content → `flux`
- "grok style" → `grok-imagine` or `grok-imagine-pro`
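
One way to picture these rules is as a first-match keyword table. The sketch below is illustrative only: the function name and the naive substring matching are assumptions, and in practice the model reading this skill makes the choice from context rather than from string matching.

```python
# (keywords, alias) pairs in the order listed above; the first match wins.
KEYWORD_RULES = [
    (("high res", "high-res", "large"), "banana-pro"),
    (("photorealistic", "dall-e"), "dalle"),
    (("budget", "cheap"), "cogview"),
    (("editable", "inpainting"), "gpt-image"),
    (("artistic",), "flux"),
    (("grok",), "grok-imagine"),
]


def choose_model(request_text: str) -> str:
    """Pick a model alias from keywords in the request; default to nano-banana."""
    text = request_text.lower()
    for keywords, alias in KEYWORD_RULES:
        if any(keyword in text for keyword in keywords):
            return alias
    return "nano-banana"
```

Substring matching is deliberately crude (for instance, "enlarge" would match "large"), so treat this as a reading aid rather than a router.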

**Choosing a size:**

- Default: `1024x1024`
- Portrait: `1024x1792`
- Landscape: `1792x1024` (dall-e-3) or `1216x832` (nano-banana / flux)
- High-res: `2048x2048` or `4096x4096` with `banana-pro` only
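
The size guidance can be sketched the same way. `choose_size` is a hypothetical helper that follows the four bullets literally; it does not check every model's size list from the table, so the portrait value may not be accepted by models whose table row lists a different portrait size.

```python
def choose_size(alias: str, orientation: str = "square", high_res: bool = False) -> str:
    """Pick a size string following the bullets above (illustrative only)."""
    if high_res:
        # High-res sizes are listed for banana-pro only.
        if alias != "banana-pro":
            raise ValueError("high-res sizes require banana-pro")
        return "2048x2048"
    if orientation == "portrait":
        # The list gives a single portrait size; no per-model check is done here.
        return "1024x1792"
    if orientation == "landscape":
        if alias == "dalle":
            return "1792x1024"
        if alias in ("nano-banana", "flux"):
            return "1216x832"
        raise ValueError(f"no landscape size listed for {alias}")
    return "1024x1024"
```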

---

## Edit an Existing Image

POST to `http://localhost:8402/v1/images/image2image`:

```json
{
  "model": "openai/gpt-image-1",
  "prompt": "make the background a snowy mountain landscape",
  "image": "https://example.com/photo.jpg",
  "size": "1024x1024",
  "n": 1
}
```

ClawRouter automatically downloads URLs and reads local file paths — pass them directly, no manual base64 conversion needed.

Optional `mask` field: a second image (URL or path) that marks which areas to edit (white = edit, black = keep).
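
A small sketch of assembling the edit body, with `mask` included only when one is supplied. `build_edit_payload` is a hypothetical name; the field names come from the example above.

```python
def build_edit_payload(prompt, image, mask=None, size="1024x1024", n=1):
    """Build the JSON body for /v1/images/image2image.

    `image` and `mask` are passed through as given (URL or local path),
    since ClawRouter handles downloading and reading them.
    """
    payload = {
        "model": "openai/gpt-image-1",  # the only edit-capable model listed
        "prompt": prompt,
        "image": image,
        "size": size,
        "n": n,
    }
    if mask is not None:
        # White areas of the mask are edited, black areas are kept.
        payload["mask"] = mask
    return payload
```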

The response has the same shape as for generation and may also include a `revised_prompt`:

```json
{
  "created": 1741460000,
  "data": [{ "url": "http://localhost:8402/images/xyz456.png", "revised_prompt": "..." }]
}
```

**Supported models for editing:** `openai/gpt-image-1` only ($0.02)

---

## Example Interactions

**User:** Draw me a cyberpunk city at night
→ POST to `/v1/images/generations`, model `nano-banana`, prompt as given.

**User:** Generate a high-res portrait of a samurai
→ POST to `/v1/images/generations`, model `banana-pro`, size `1024x1792`.

**User:** Edit this photo to add a sunset background: https://example.com/portrait.jpg
→ POST to `/v1/images/image2image`, model `gpt-image`, image = the URL, prompt = "add a warm sunset background".

**User:** Change the background in my image to a beach (attaches local file)
→ POST to `/v1/images/image2image`, image = the local file path, prompt describes the change.

---

## Notes

- Payment is automatic via x402 — deducted from the user's BlockRun wallet
- If the call fails with a payment error, tell the user to fund their wallet at [blockrun.ai](https://blockrun.ai)
- Google models may return base64 internally — ClawRouter uploads automatically and returns a hosted URL
- DALL-E 3 enforces OpenAI content policy; use `flux` or `nano-banana` for more flexibility
- Image editing is only available with `gpt-image-1`; generation supports every model listed under Model Selection
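
The payment note can be sketched as a small status check. This assumes a payment failure reaches the caller as HTTP 402 (the status code x402 is built around); how ClawRouter actually reports the error is not specified here, and `describe_failure` is a hypothetical helper.

```python
def describe_failure(status_code: int, detail: str = "") -> str:
    """Map a failed image request to a message for the user.

    Assumption: payment errors surface as HTTP 402 Payment Required.
    """
    if status_code == 402:
        return ("Payment failed. Please fund your BlockRun wallet at "
                "https://blockrun.ai and try again.")
    # Any other failure is reported with its status code and optional detail.
    return f"Image request failed (HTTP {status_code}). {detail}".strip()
```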

More from this repository

clawrouter (Skill)

Hosted-gateway LLM router — save 67% on inference costs. A local proxy that forwards each request to the blockrun.ai gateway, which routes to the cheapest capable model across 55+ models from OpenAI, Anthropic, Google, DeepSeek, xAI, NVIDIA, and more. 7 free NVIDIA models included. Also exposes realtime market data (global stocks, crypto, FX, commodities), Twitter/X intelligence, prediction-market data across Polymarket, Kalshi, Limitless, Opinion, Predict.Fun, dFlow + UMA oracle resolution + wallet identity & clustering, phone-number intelligence (carrier + SIM-swap fraud detection) plus AI-powered outbound voice calls (Twilio + Bland.ai), AND the Surf unified crypto data API (84 endpoints — CEX/DEX, on-chain SQL over 80+ ClickHouse tables, 100M+ labeled wallets, prediction markets, social/CT mindshare, news, VC fund intel) as built-in agent tools. Not a local-inference tool — prompts are sent to the blockrun.ai gateway.

phone (Skill)

Verify phone numbers (carrier + SIM-swap fraud signals) and place AI-powered outbound voice calls via BlockRun's gateway (Twilio + Bland.ai). Trigger when the user asks to look up a number, check fraud risk, buy/rent a phone number, or place an AI voice call. Payment is automatic via x402 from the wallet.

predexon (Skill)

Use this skill — NOT browser or web_fetch — for ALL Polymarket, Kalshi, Limitless, Opinion, Predict.Fun, dFlow, UMA oracle, and prediction market data. Provides structured API at localhost:8402/v1/pm/* for markets, cross-venue search, leaderboard, smart money, wallet analytics, wallet identity & clustering, UMA resolution status, and odds.

release (Skill)

Use this skill for EVERY ClawRouter release. Enforces the full checklist — version sync, CHANGELOG, build, tests, npm publish, git tag, GitHub release. No step can be skipped.

surf (Skill)

Use this skill — NOT browser or web_fetch — for ALL Surf crypto-data calls. 83 endpoints at localhost:8402/v1/surf/* covering CEX/DEX markets, on-chain SQL over 80+ ClickHouse tables (Ethereum, Base, Arbitrum, BSC, TRON, HyperEVM, Tempo), 100M+ labeled wallets, prediction markets (Polymarket + Kalshi), social/CT intelligence, news, project + DeFi metrics, token analytics, unified search, VC fund intelligence. x402-gated via ClawRouter's local wallet — no Surf account or API key required.