designing-workflow-skills
This Claude Code skill teaches developers how to build reliable workflow-based skills by establishing structural patterns rather than prose instructions. Use this skill when designing new Claude skills that need consistent execution, proper phase sequencing, correct tool matching, and scalable tool-calling patterns. It emphasizes making descriptions trigger-focused, organizing work into numbered phases with clear entry and exit criteria, and separating detailed patterns into references and workflows to keep the main skill file concise and maintainable.
git clone --depth 1 https://github.com/trailofbits/skills /tmp/designing-workflow-skills && cp -r /tmp/designing-workflow-skills/plugins/workflow-skill-design/skills/designing-workflow-skills ~/.claude/skills/designing-workflow-skills
# Designing Workflow Skills
Build workflow-based skills that execute reliably by following structural patterns, not prose.
## Essential Principles
<essential_principles>
<principle name="description-is-the-trigger">
**The `description` field is the only thing that controls when a skill activates.**
Claude decides whether to load a skill based solely on its frontmatter `description`. The body of SKILL.md — including "When to Use" and "When NOT to Use" sections — is only read AFTER the skill is already active. Put your trigger keywords, use cases, and exclusions in the description. A bad description means wrong activations or missed activations regardless of what the body says.
"When to Use" and "When NOT to Use" sections still serve a purpose: they scope the LLM's behavior once the skill is active. "When NOT to Use" should name specific alternatives: write "use Semgrep for simple pattern matching", not "not for simple tasks."
</principle>
<principle name="numbered-phases">
**Phases must be numbered with entry and exit criteria.**
Unnumbered prose instructions produce unreliable execution order. Every phase needs:
- A number (Phase 1, Phase 2, ...)
- Entry criteria (what must be true before starting)
- Numbered actions (what to do)
- Exit criteria (how to know it's done)
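
The four requirements above can be checked mechanically. The following is a minimal sketch of such a lint, not part of this skill. It assumes phases are written as headings like `## Phase 1: Title` and that criteria appear as text containing "Entry criteria" and "Exit criteria"; both conventions and the function name `lint_phases` are assumptions made for illustration.

```python
import re

# Assumed conventions (not defined by this skill): phases are headings such as
# "## Phase 1: Title"; criteria are lines containing "Entry criteria" / "Exit criteria".
PHASE_HEADING = re.compile(r"^#{1,6}\s*Phase\s+(\d+)\b", re.MULTILINE)


def lint_phases(markdown: str) -> list[str]:
    """Return a list of problems found in the phase structure of a workflow."""
    problems = []
    matches = list(PHASE_HEADING.finditer(markdown))
    if not matches:
        return ["no numbered phases found"]

    for index, match in enumerate(matches):
        number = int(match.group(1))
        # Phases should be numbered 1, 2, 3, ... in document order.
        if number != index + 1:
            problems.append(f"phase {number} is out of sequence (expected {index + 1})")

        # The phase body runs until the next phase heading or the end of the text.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(markdown)
        body = markdown[match.end():end].lower()

        if "entry criteria" not in body:
            problems.append(f"phase {number} has no entry criteria")
        if "exit criteria" not in body:
            problems.append(f"phase {number} has no exit criteria")
        # Numbered actions look like "1. do something" at the start of a line.
        if not re.search(r"^\s*\d+\.\s+\S", body, re.MULTILINE):
            problems.append(f"phase {number} has no numbered actions")
    return problems
```

An empty list means every phase has a number in sequence, entry criteria, at least one numbered action, and exit criteria.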
</principle>
<principle name="tools-match-executor">
**Tools must match the executor.**
Skills use `allowed-tools:` in frontmatter. Agents use `tools:` in frontmatter. Subagents get tools from their `subagent_type`. Never list tools the component doesn't use. Never use Bash for operations that have dedicated tools (Glob, Grep, Read, Write, Edit).
Most skills and agents should include `TodoRead` and `TodoWrite` in their tool list — these enable progress tracking during multi-step execution and are useful even for skills that don't explicitly manage tasks.
</principle>
<principle name="progressive-disclosure">
**Progressive disclosure is structural, not optional.**
SKILL.md stays under 500 lines. It contains only what the LLM needs for every invocation: principles, routing, quick references, and links. Detailed patterns go in `references/`. Step-by-step processes go in `workflows/`. One level deep — no reference chains.
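
A rough way to enforce both limits is sketched below. It is illustrative only: the function name `check_disclosure` and its inputs are hypothetical, and it interprets "no reference chains" as "a linked file must not link onward to another local file", which is an assumption about how the rule is applied.

```python
import re

MAX_SKILL_LINES = 500  # limit stated in the principle above
MARKDOWN_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def check_disclosure(skill_text: str, linked_files: dict[str, str]) -> list[str]:
    """Check the size limit and the one-level-deep rule.

    `linked_files` maps a relative path such as "references/anti-patterns.md"
    to that file's text. Both arguments are hypothetical inputs for this sketch.
    """
    problems = []
    line_count = len(skill_text.splitlines())
    if line_count >= MAX_SKILL_LINES:
        problems.append(f"SKILL.md has {line_count} lines (limit is under {MAX_SKILL_LINES})")

    for path, text in linked_files.items():
        for target in MARKDOWN_LINK.findall(text):
            # Ignore external URLs and in-page anchors; flag links to other local files.
            if target.startswith(("http://", "https://", "#")):
                continue
            problems.append(f"{path} links onward to {target} (reference chain)")
    return problems
```

Passing file contents in as strings keeps the check independent of the filesystem, so it can run against a draft before anything is written to disk.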
</principle>
<principle name="scalable-tool-patterns">
**Instructions must produce tool-calling patterns that scale.**
Every workflow instruction becomes tool calls at runtime. If a workflow searches N files for M patterns, combine into one regex — not N×M calls. If a workflow spawns subagents per item, use batching — not one subagent per file. Apply the 10,000-file test: mentally run the workflow against a large repo and check that tool call count stays bounded. See [anti-patterns.md](references/anti-patterns.md) AP-18 and AP-19.
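
The arithmetic behind the 10,000-file test can be made concrete. The sketch below is illustrative, with hypothetical helper names. It assumes the search tool can scan the whole tree in a single call with one combined regex, and that a subagent can handle a batch of files at once.

```python
import re
from typing import Iterator


def combine_patterns(patterns: list[str]) -> str:
    """Join M regex patterns into one alternation so a single search covers them all."""
    # Wrap each pattern in a non-capturing group so alternation cannot change its meaning.
    return "|".join(f"(?:{p})" for p in patterns)


def batches(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield fixed-size batches so subagent count grows with N/size, not N."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def estimated_calls(file_count: int, pattern_count: int, batch_size: int) -> dict[str, int]:
    """Compare tool-call counts for the naive and the combined/batched approach."""
    return {
        "naive_searches": file_count * pattern_count,  # one search per file per pattern
        "combined_searches": 1,  # assumption: one tree-wide search with the combined regex
        "naive_subagents": file_count,  # one subagent per file
        "batched_subagents": -(-file_count // batch_size),  # ceiling division
    }
```

For 10,000 files, 5 patterns, and a batch size of 50, `estimated_calls(10000, 5, 50)` gives 50,000 naive searches against 1 combined search, and 10,000 naive subagents against 200 batched ones.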
</principle>
<principle name="degrees-of-freedom">
**Match instruction specificity to task fragility.**
Not every step needs the same level of prescription. Calibrate per step:
- **Low freedom** (exact commands, no variation): Fragile operations — database migrations, crypto, destructive actions. "Run exactly this script."
- **Medium freedom** (pseudocode with parameters): Preferred patterns where variation is acceptable. "Use this template and customize as needed."
- **High freedom** (heuristics and judgment): Variable tasks — code review, exploration, documentation. "Analyze the structure and suggest improvements."
A skill can mix freedom levels. A security audit skill might use high freedom for the discovery phase ("explore the codebase for auth patterns") and low freedom for the reporting phase ("use exactly this severity classification table").
</principle>
</essential_principles>
## When to Use
- Designing a new skill with multi-step workflows or phased execution
- Creating a skill that routes between multiple independent tasks
- Building a skill with safety gates (destructive actions requiring confirmation)
- Structuring a skill that uses subagents or task tracking
- Reviewing or refactoring an existing workflow skill for quality
- Deciding how to split content between SKILL.md, references/, and workflows/
## When NOT to Use
- Simple single-purpose skills with no workflow (just guidance) — write the SKILL.md directly
- Writing the actual domain content of a skill (this teaches structure, not domain expertise)
- Plugin configuration (plugin.json, hooks, commands) — use plugin development guides
- Non-skill Claude Code development — this is specifically for skill architecture
## Pattern Selection
Choose the right pattern for your skill's structure. Read the full pattern description in [workflow-patterns.md](references/workflow-patterns.md).
```
How many distinct paths does the skill have?
|
+-- One path, always the same
|   +-- Does it perform destructive actions?
|       +-- YES -> Safety Gate Pattern
|       +-- NO -> Linear Progression Pattern
|
+-- Multiple independent paths from shared setup
|   +-- Routing Pattern
|
+-- Multiple dependent steps in sequence
    +-- Do steps have complex dependencies?
        +-- YES -> Task-Driven Pattern
        +-- NO -> Sequential Pipeline Pattern
```
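
The decision tree above can also be read as a small function. This is only a sketch of the same branching: the function name `select_pattern` and the `path_shape` labels ("single", "independent", "dependent") are shorthand invented here, not terms used by the skill.

```python
def select_pattern(
    path_shape: str,
    destructive: bool = False,
    complex_dependencies: bool = False,
) -> str:
    """Map the decision tree's answers to a pattern name.

    `path_shape` is one of "single", "independent", or "dependent"; these labels
    are hypothetical shorthand for the three top-level branches of the tree.
    """
    if path_shape == "single":
        # One path, always the same: the only question is whether it is destructive.
        return "Safety Gate" if destructive else "Linear Progression"
    if path_shape == "independent":
        # Multiple independent paths from shared setup.
        return "Routing"
    if path_shape == "dependent":
        # Multiple dependent steps in sequence.
        return "Task-Driven" if complex_dependencies else "Sequential Pipeline"
    raise ValueError(f"unknown path shape: {path_shape!r}")
```

For example, `select_pattern("single", destructive=True)` returns `"Safety Gate"`, matching the first branch of the tree.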
### Pattern Summary
| Pattern | Use When | Key Feature |
|---------|----------|-------------|
| **Routing** | Multiple independent tasks from shared intake | Routing table maps intent to workflow files |
| **Sequential Pipeline** | Dependent steps, each feeding the next | Auto-detection may resume from partial progress |
| **Linear Progression** | Single path, same every time | Numbered phases with entry/exit criteria |
| **Safety Gate** | Destructive/irreversible actions | Two confirmation gates before execution |
| **Task-Driven** | Complex dependencies, partial failure tolerance | TaskCreate/TaskUpdate with dependency tracking |
## Structural Anatomy
Every workflow skill needs this skeleton, regardless of pattern:
```markdown
---
name: kebab-case-name
description: "Third-person description with trigger keywords — this is how Claude decides to activate the skill"
allowed-to
```