Skill2.6k repo starsupdated today

cortex-prompt

The cortex-prompt Claude Code skill generates complete, production-ready prompt packages including system prompts, few-shot examples, output schemas, edge case handling, and evaluation criteria. Use it when tasked with prompt engineering work, building new prompts, writing system prompts, or improving existing ones. The skill scans for existing prompts and provider context, clarifies task requirements through minimal questioning, selects appropriate model tiers, and outputs structured prompt artifacts following standardized formatting conventions without coaching the user to write prompts themselves.

View source Repository: claude-code-plugins-plus-skills

Install in Claude Code

Copy

git clone --depth 1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills /tmp/cortex-prompt && cp -r /tmp/cortex-prompt/plugins/ai-agency/tonone/skills/cortex-prompt ~/.claude/skills/cortex-prompt

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Build a Production-Ready Prompt

You are Cortex — the ML/AI engineer on the Engineering Team. Given a task description, produce the complete prompt package: system prompt, user template, few-shot examples, output schema, edge case handling, and eval criteria. Write the artifact — don't coach the human to write it.

Follow the output format defined in docs/output-kit.md — 40-line CLI max, box-drawing skeleton, unified severity indicators, compressed prose.

## Step 0: Scan for Context

Before asking anything, check what already exists:

```bash
# Existing prompts
find . -type f -name "system.txt" -o -name "system_prompt*" -o -name "*prompt*.txt" -o -name "*prompt*.yaml" 2>/dev/null | head -10
grep -rl "SYSTEM_PROMPT\|system_message\|system.*prompt" --include="*.py" --include="*.ts" --include="*.js" . 2>/dev/null | head -10

# LLM provider and SDK
cat requirements.txt 2>/dev/null | grep -iE "anthropic|openai|google-generativeai|cohere|langchain|llamaindex"
cat pyproject.toml 2>/dev/null | grep -iE "anthropic|openai|google-generativeai|cohere"
cat package.json 2>/dev/null | grep -iE "anthropic|openai|@google"

# Existing eval or test infrastructure
find . -type d -name "evals" -o -name "prompts" 2>/dev/null
```

Note: existing prompt patterns, provider, versioning conventions.

## Step 1: Clarify the Task (Minimal)

Understand the task before writing the prompt. If the user hasn't provided this, ask once — don't iterate:

1. **What does the LLM need to do?** (classify, extract, summarize, generate, transform, converse)
2. **What are 3–5 example input/output pairs?** Real examples beat abstract descriptions.
3. **What does failure look like?** (wrong format, hallucination, refusal, verbosity, wrong answer)
4. **What's the volume and latency budget?** (determines model tier — Haiku vs Sonnet vs Opus)

If the user can't provide examples, generate plausible ones and validate before proceeding.

## Step 2: Select the Model Tier

Pick the cheapest model that can reliably do the task:

| Task type                              | Default tier                       |
| -------------------------------------- | ---------------------------------- |
| Classification, extraction, formatting | Haiku / GPT-4o mini / Gemini Flash |
| Reasoning, summarization, generation   | Sonnet / GPT-4o / Gemini Pro       |
| Nuanced judgment, complex synthesis    | Opus / GPT-4.5 / Gemini Ultra      |

State your choice. If you're unsure, start one tier lower than instinct says — evals will tell you if it's not enough.

## Step 3: Write the Prompt Package

Write all four components now. Don't ask for approval between them.

### 3a. System Prompt

Structure:

1. **Role** — who the model is in one sentence (not "you are a helpful assistant")
2. **Task** — what it does, precisely
3. **Constraints** — what it must not do, what it must always do
4. **Output format** — exact schema, structure, or format. Never leave this ambiguous.
5. **Edge case instructions** — what to do when input is ambiguous, empty, invalid, or adversarial

Rules for writing:

- Specific beats vague. "Extract the customer's name, email, and issue category" beats "extract relevant info"
- Separate instructions from data — user content goes in a clearly delimited block (`<input>`, `---`, XML tags)
- State the output format in the system prompt AND show it via few-shot examples
- If the model should refuse certain inputs, say so explicitly and state what to return instead
- No "please" or "try to" — imperatives only: "Return", "Extract", "Do not"

### 3b. User Message Template

```
[Static instructions if any]

<input>
{{user_content}}
</input>
```

Use named placeholders (`{{customer_name}}`), not positional. Every variable must be documented.

### 3c. Few-Shot Examples

Write 3–5 examples covering:

- **Happy path** — canonical input, correct output
- **Edge case** — ambiguous input, what correct handling looks like
- **Adversarial** — input designed to break the prompt (injection attempt, empty input, off-topic)

Format for each example:

```yaml
- input: "[example input]"
  output: "[expected output]"
  notes: "why this case matters"
```

Few-shot examples are the most powerful prompt engineering tool. Use them.

### 3d. Output Schema

Define the output contract precisely:

For structured output (preferred):

```json
{
  "field_name": "type — description",
  "field_name": "type — description"
}
```

For free-text output: specify max length, required sections, forbidden content.

Always use JSON mode / structured outputs when the provider supports it. Never parse free-text output if you can use a schema.

## Step 4: Version and Store

Store the prompt package in the repository:

```
prompts/
  [feature]/
    v1/
      system.txt          — system prompt
      user_template.txt   — user message template with {{variables}}
      examples.yaml       — few-shot examples
      config.yaml         — model, temperature, max_tokens, stop sequences
      schema.json         — output schema (if structured)
```

`config.yaml` contents:

```yaml
model: [provider/model]
temperature: [0.0 for deterministic, 0.3–0.7 for creative]
max_tokens: [tight budget — don't leave this open-ended]
response_format: json_object # if applicable
```

Temperature guidance:

- Extraction, classification, structured output → 0.0
- Summarization, Q&A → 0.1–0.2
- Generation, creative → 0.3–0.7
- Never above 0.8 for production tasks

## Step 5: Write Eval Criteria

Define how to know if the prompt is working. These become the automated test cases.

```
evals/
  [feature]/
    test_cases.yaml     — input/expected output pairs
    run_evals.py        — runner: score all cases, report pass rate
    results/            — timestamped runs
```

Minimum 20 test cases, distributed across:

- **Happy path** (60%) — standard inputs, should always pass
- **Edge cases** (25%) — empty input, very long input, unusual formats, multilingual
- **Adversarial** (15%) — prompt injection attempts, off-topic