ClaudeWave
Skill, 1.6k repo stars, updated 1mo ago

gpt-image-skill

This skill generates or edits images with OpenAI's GPT Image models through a Python script that requires an OPENAI_API_KEY and the openai and Pillow packages. Use it when a user requests image generation or editing and either mentions OpenAI or GPT, or has an OpenAI API key available. Users can specify details such as the desired size, model version, quality level, and output filename.

Install in Claude Code
git clone --depth 1 https://github.com/feiskyer/claude-code-settings /tmp/gpt-image-skill && cp -r /tmp/gpt-image-skill/skills/gpt-image-skill ~/.claude/skills/gpt-image-skill
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# GPT Image Skill

Generate or edit images using OpenAI's GPT Image models through a bundled Python script.

## Requirements

1. **OPENAI_API_KEY**: Must be set in `~/.gpt-image.env` or exported in the shell with `export OPENAI_API_KEY=<your-key>`.
2. **OPENAI_API_BASE** (optional): Custom API base URL for compatible endpoints (e.g. Azure OpenAI, proxies). Set in `~/.gpt-image.env` or export it.
3. **Python 3 with dependencies**: `openai` and `Pillow`. Install them with `python3 -m pip install -r ./requirements.txt` if they are not installed yet.
4. **Executable**: `./gpt_image.py`
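
The lookup described in items 1 and 2 could be sketched as follows. This is a minimal sketch, not the bundled script's code: the helper names `load_env_file` and `resolve_settings` are hypothetical, the file is assumed to hold simple `KEY=VALUE` lines, and letting exported variables take precedence over the file is an assumption.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:
            # Drop surrounding quotes so KEY="abc" and KEY=abc behave the same.
            values[key.strip()] = value.strip().strip("\"'")
    return values


def resolve_settings(env_file=None, environ=None):
    """Return (api_key, api_base); exported variables win over the file (assumed)."""
    env_file = Path.home() / ".gpt-image.env" if env_file is None else env_file
    environ = os.environ if environ is None else environ
    file_values = load_env_file(env_file)
    api_key = environ.get("OPENAI_API_KEY") or file_values.get("OPENAI_API_KEY")
    api_base = environ.get("OPENAI_API_BASE") or file_values.get("OPENAI_API_BASE")
    return api_key, api_base
```

Calling `resolve_settings()` with no arguments reads the real environment and `~/.gpt-image.env`; a missing key comes back as `None`, so the caller can report it before any request is made.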

## Instructions

### For image generation

1. Ask the user for:
   - What they want to create (the prompt)
   - Desired size (optional, defaults to 1024x1024)
   - Output filename (optional, auto-generates UUID-based name if not specified)
   - Model preference (optional, defaults to gpt-image-2)
   - Quality (optional, defaults to auto)
   - Number of images (optional, defaults to 1)

2. Run the script:

   ```bash
   python3 ./gpt_image.py --prompt "description of image" --output "filename.png"
   ```

3. Show the user the saved image path when complete.
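
For orientation, the work behind step 2 might look roughly like the sketch below. It assumes the script calls the `images.generate` method of the `openai` Python SDK and that the response carries base64 image data in `b64_json`. The function names `generate_image` and `default_output_name` are hypothetical, and the exact shape of the UUID-based filename is a guess.

```python
import base64
import uuid
from pathlib import Path


def default_output_name(fmt="png"):
    """Hypothetical UUID-based default name, e.g. '3f2a...c1.png'."""
    return f"{uuid.uuid4().hex}.{fmt}"


def generate_image(client, prompt, output=None, model="gpt-image-2",
                   size="1024x1024", quality="auto"):
    """Request one image and write it to disk; returns the saved path."""
    response = client.images.generate(
        model=model, prompt=prompt, size=size, quality=quality, n=1
    )
    # The image is assumed to arrive base64-encoded in the response.
    image_bytes = base64.b64decode(response.data[0].b64_json)
    path = Path(output or default_output_name())
    path.write_bytes(image_bytes)
    return path
```

With the real SDK, `client` would be an `openai.OpenAI()` instance, which reads `OPENAI_API_KEY` from the environment. Passing the client in as a parameter keeps the function easy to exercise with a stub.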

### For image editing

1. Ask the user for:
   - Input image file(s) to edit (up to 3)
   - What changes they want (the prompt)
   - Output filename (optional)

2. Run with input images:

   ```bash
   python3 ./gpt_image.py edit --prompt "editing instructions" --input image1.png image2.png --output "edited.png"
   ```
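
An edit request could be sketched in a similar way. This assumes the script forwards the input files to the `images.edit` method of the `openai` Python SDK, which accepts a list of open binary files for GPT Image models. The function name `edit_image` and the constant `MAX_INPUTS` are hypothetical; the limit of 3 comes from the instructions above.

```python
import base64
from contextlib import ExitStack
from pathlib import Path

MAX_INPUTS = 3  # limit stated in the instructions above


def edit_image(client, prompt, inputs, output, model="gpt-image-2"):
    """Send 1 to MAX_INPUTS images plus a prompt; write the edited result."""
    if not 1 <= len(inputs) <= MAX_INPUTS:
        raise ValueError(
            f"expected 1 to {MAX_INPUTS} input images, got {len(inputs)}"
        )
    # ExitStack closes every opened file, even if the request raises.
    with ExitStack() as stack:
        files = [stack.enter_context(open(p, "rb")) for p in inputs]
        response = client.images.edit(model=model, image=files, prompt=prompt)
    path = Path(output)
    path.write_bytes(base64.b64decode(response.data[0].b64_json))
    return path
```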

## Available Options

### Models (--model)

- `gpt-image-2` (default) — Latest model with strong instruction following, text rendering, and broad world knowledge
- `gpt-image-1.5` — Mid-tier model
- `gpt-image-1` — First-generation GPT image model
- `gpt-image-1-mini` — Lightweight, faster generation

### Sizes (--size)

- `1024x1024` (default) — Square
- `1024x1536` — Portrait (2:3)
- `1536x1024` — Landscape (3:2)
- `auto` — Let the model decide

### Quality (--quality)

- `auto` (default) — Model decides optimal quality
- `high` — Higher detail, slower
- `medium` — Balanced
- `low` — Fastest

### Output Format (--format)

- `png` (default) — Lossless
- `jpeg` — Smaller file size
- `webp` — Modern format, good compression

### Background (--background)

- `auto` (default) — Model decides
- `transparent` — Transparent background (png/webp only)
- `opaque` — Solid background

### Other Options

- `--n <count>` — Number of images to generate (default: 1)
- `--output <filename>` — Output filename (default: auto-generated)
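
The options above map naturally onto an `argparse` parser. The sketch below is an illustration of that mapping, not the script's actual parser: the mode name `generate` for the no-subcommand case is an assumption, as are the two validation rules in `parse_args`.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="gpt_image.py")
    # Optional leading word; "generate" is an assumed name for the default mode.
    parser.add_argument("mode", nargs="?", choices=["generate", "edit"],
                        default="generate")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--input", nargs="+", default=[])
    parser.add_argument("--output")
    parser.add_argument("--model", default="gpt-image-2")
    parser.add_argument("--size", default="1024x1024",
                        choices=["1024x1024", "1024x1536", "1536x1024", "auto"])
    parser.add_argument("--quality", default="auto",
                        choices=["auto", "high", "medium", "low"])
    parser.add_argument("--format", default="png",
                        choices=["png", "jpeg", "webp"])
    parser.add_argument("--background", default="auto",
                        choices=["auto", "transparent", "opaque"])
    parser.add_argument("--n", type=int, default=1)
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Transparency is documented as png/webp only.
    if args.background == "transparent" and args.format == "jpeg":
        parser.error("--background transparent requires --format png or webp")
    if args.mode == "edit" and not args.input:
        parser.error("edit requires at least one --input image")
    return args
```

Rejecting `transparent` with `jpeg` at parse time surfaces the conflict before any API call is made.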

## Examples

### Generate a simple image

```bash
python3 ./gpt_image.py --prompt "A serene mountain landscape at sunset with a lake"
```

### Generate with specific size and output

```bash
python3 ./gpt_image.py \
  --prompt "Modern minimalist logo for a tech startup" \
  --size 1024x1024 \
  --quality high \
  --output "logo.png"
```

### Generate landscape image

```bash
python3 ./gpt_image.py \
  --prompt "Futuristic cityscape with flying cars" \
  --size 1536x1024 \
  --output "cityscape.png"
```

### Generate with transparent background

```bash
python3 ./gpt_image.py \
  --prompt "A cute cartoon cat mascot" \
  --background transparent \
  --format png \
  --output "mascot.png"
```

### Generate multiple images

```bash
python3 ./gpt_image.py \
  --prompt "Abstract art in the style of Kandinsky" \
  --n 3 \
  --output "art.png"
```

### Edit existing images

```bash
python3 ./gpt_image.py edit \
  --prompt "Add a rainbow in the sky" \
  --input photo.png \
  --output "photo-with-rainbow.png"
```

### Combine multiple reference images

```bash
python3 ./gpt_image.py edit \
  --prompt "Create a gift basket containing all items shown" \
  --input item1.png item2.png item3.png \
  --output "gift-basket.png"
```

### Use a different model

```bash
python3 ./gpt_image.py \
  --prompt "Detailed portrait of a cat in watercolor style" \
  --model gpt-image-1 \
  --output "cat-portrait.png"
```

## Error Handling

If the script fails:

- Check that `OPENAI_API_KEY` is exported or set in `~/.gpt-image.env`
- If using a custom endpoint, verify `OPENAI_API_BASE` is correct
- Verify input image files exist and are readable (for editing)
- Ensure the output directory is writable
- Check that the model name is valid
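
Several of these checks can be run before calling the script. The following is a minimal sketch with a hypothetical helper name, `preflight`. It only inspects the environment mapping it is given, so a key stored solely in `~/.gpt-image.env` would need to be merged in by the caller, and it does not validate the model name.

```python
import os
from pathlib import Path


def preflight(environ, inputs=(), output="out.png"):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    for name in inputs:
        if not os.path.isfile(name) or not os.access(name, os.R_OK):
            problems.append(f"input image not readable: {name}")
    # The directory that will hold the output file must exist and be writable.
    out_dir = Path(output).resolve().parent
    if not out_dir.is_dir() or not os.access(out_dir, os.W_OK):
        problems.append(f"output directory not writable: {out_dir}")
    return problems
```

A typical call would be `preflight(os.environ, ["photo.png"], "photo-with-rainbow.png")`, printing each returned message before deciding whether to run the script.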

## Best Practices

1. Be descriptive in prompts — include style, mood, colors, composition details
2. For logos/icons, use square size (1024x1024) with transparent background
3. For social media, use portrait (1024x1536) for stories or square for posts
4. For wallpapers/headers, use landscape (1536x1024)
5. Use `high` quality for final output; use `low` (fastest) or `auto` for quick iterations
6. GPT Image models excel at text rendering — include text in prompts when needed
7. For editing, provide clear instructions about what to change and what to keep
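
The size and background recommendations above can be captured as presets that expand into a command line. This is an illustrative sketch: the preset names and the `build_command` helper are hypothetical and are not part of the skill.

```python
import shlex

# Preset names are hypothetical; the flag values follow the practices listed above.
PRESETS = {
    "logo": {"size": "1024x1024", "background": "transparent"},
    "story": {"size": "1024x1536"},
    "post": {"size": "1024x1024"},
    "wallpaper": {"size": "1536x1024"},
}


def build_command(prompt, use_case, final=False, output=None):
    """Build the argument list for gpt_image.py from a named preset."""
    args = ["python3", "./gpt_image.py", "--prompt", prompt]
    for flag, value in PRESETS[use_case].items():
        args += [f"--{flag}", value]
    # High quality for final output, auto while iterating.
    args += ["--quality", "high" if final else "auto"]
    if output:
        args += ["--output", output]
    return args


if __name__ == "__main__":
    cmd = build_command("Minimalist fox logo", "logo", final=True, output="logo.png")
    print(shlex.join(cmd))
```

Returning a list rather than a string means the result can be passed directly to `subprocess.run` without shell quoting concerns.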