Skip to main content
ClaudeWave
Skill5.7k estrellas del repoactualizado yesterday

fuzzing-dictionary

# ClaudeWave Editor Notes **fuzzing-dictionary** This skill teaches how to create and apply fuzzing dictionaries, which are curated token files that guide software fuzzers toward semantically meaningful inputs like protocol keywords, magic numbers, and format-specific strings. Use this technique when fuzzing parsers, protocol handlers, and file format processors where coverage plateaus early, since dictionaries help fuzzers reach deeper code paths that purely random mutation cannot easily discover.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/trailofbits/skills /tmp/fuzzing-dictionary && cp -r /tmp/fuzzing-dictionary/plugins/testing-handbook-skills/skills/fuzzing-dictionary ~/.claude/skills/fuzzing-dictionary
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

# Fuzzing Dictionary

A fuzzing dictionary provides domain-specific tokens to guide the fuzzer toward interesting inputs. Instead of purely random mutations, the fuzzer incorporates known keywords, magic numbers, protocol commands, and format-specific strings that are more likely to reach deeper code paths in parsers, protocol handlers, and file format processors.

## Overview

Dictionaries are text files containing quoted strings that represent meaningful tokens for your target. They help fuzzers bypass early validation checks and explore code paths that would be difficult to reach through blind mutation alone.

### Key Concepts

| Concept | Description |
|---------|-------------|
| **Dictionary Entry** | A quoted string (e.g., `"keyword"`) or key-value pair (e.g., `kw="value"`) |
| **Hex Escapes** | Byte sequences like `"\xF7\xF8"` for non-printable characters |
| **Token Injection** | Fuzzer inserts dictionary entries into generated inputs |
| **Cross-Fuzzer Format** | Dictionary files work with libFuzzer, AFL++, and cargo-fuzz |

## When to Apply

**Apply this technique when:**
- Fuzzing parsers (JSON, XML, config files)
- Fuzzing protocol implementations (HTTP, DNS, custom protocols)
- Fuzzing file format handlers (PNG, PDF, media codecs)
- Coverage plateaus early without reaching deeper logic
- Target code checks for specific keywords or magic values

**Skip this technique when:**
- Fuzzing pure algorithms without format expectations
- Target has no keyword-based parsing
- Corpus already achieves high coverage

## Quick Reference

| Task | Command/Pattern |
|------|-----------------|
| Use with libFuzzer | `./fuzz -dict=./dictionary.dict ...` |
| Use with AFL++ | `afl-fuzz -x ./dictionary.dict ...` |
| Use with cargo-fuzz | `cargo fuzz run fuzz_target -- -dict=./dictionary.dict` |
| Extract from header | `grep -o '".*"' header.h > header.dict` |
| Generate from binary | `strings ./binary \| sed 's/^/"&/; s/$/&"/' > strings.dict` |

## Step-by-Step

### Step 1: Create Dictionary File

Create a text file with quoted strings on each line. Use comments (`#`) for documentation.

**Example dictionary format:**

```conf
# Lines starting with '#' and empty lines are ignored.

# Adds "blah" (w/o quotes) to the dictionary.
kw1="blah"
# Use \\ for backslash and \" for quotes.
kw2="\"ac\\dc\""
# Use \xAB for hex values
kw3="\xF7\xF8"
# the name of the keyword followed by '=' may be omitted:
"foo\x0Abar"
```

### Step 2: Generate Dictionary Content

Choose a generation method based on what's available:

**From LLM:** Prompt ChatGPT or Claude with:
```text
A dictionary can be used to guide the fuzzer. Write me a dictionary file for fuzzing a <PNG parser>. Each line should be a quoted string or key-value pair like kw="value". Include magic bytes, chunk types, and common header values. Use hex escapes like "\xF7\xF8" for binary values.
```

**From header files:**
```bash
grep -o '".*"' header.h > header.dict
```

**From man pages (for CLI tools):**
```bash
man curl | grep -oP '^\s*(--|-)\K\S+' | sed 's/[,.]$//' | sed 's/^/"&/; s/$/&"/' | sort -u > man.dict
```

**From binary strings:**
```bash
strings ./binary | sed 's/^/"&/; s/$/&"/' > strings.dict
```

### Step 3: Pass Dictionary to Fuzzer

Use the appropriate flag for your fuzzer (see Quick Reference above).

## Common Patterns

### Pattern: Protocol Keywords

**Use Case:** Fuzzing HTTP or custom protocol handlers

**Dictionary content:**
```conf
# HTTP methods
"GET"
"POST"
"PUT"
"DELETE"
"HEAD"

# Headers
"Content-Type"
"Authorization"
"Host"

# Protocol markers
"HTTP/1.1"
"HTTP/2.0"
```

### Pattern: Magic Bytes and File Format Headers

**Use Case:** Fuzzing image parsers, media decoders, archive handlers

**Dictionary content:**
```conf
# PNG magic bytes and chunks
png_magic="\x89PNG\r\n\x1a\n"
ihdr="IHDR"
plte="PLTE"
idat="IDAT"
iend="IEND"

# JPEG markers
jpeg_soi="\xFF\xD8"
jpeg_eoi="\xFF\xD9"
```

### Pattern: Configuration File Keywords

**Use Case:** Fuzzing config file parsers (YAML, TOML, INI)

**Dictionary content:**
```conf
# Common config keywords
"true"
"false"
"null"
"version"
"enabled"
"disabled"

# Section headers
"[general]"
"[network]"
"[security]"
```

## Advanced Usage

### Tips and Tricks

| Tip | Why It Helps |
|-----|--------------|
| Combine multiple generation methods | LLM-generated keywords + strings from binary covers broad surface |
| Include boundary values | `"0"`, `"-1"`, `"2147483647"` trigger edge cases |
| Add format delimiters | `:`, `=`, `{`, `}` help fuzzer construct valid structures |
| Keep dictionaries focused | 50-200 entries perform better than thousands |
| Test dictionary effectiveness | Run with and without dict, compare coverage |

### Auto-Generated Dictionaries (AFL++)

When using `afl-clang-lto` compiler, AFL++ automatically extracts dictionary entries from string comparisons in the binary. This happens at compile time via the AUTODICTIONARY feature.

**Enable auto-dictionary:**
```bash
export AFL_LLVM_DICT2FILE=auto.dict
afl-clang-lto++ target.cc -o target
# Dictionary saved to auto.dict
afl-fuzz -x auto.dict -i in -o out -- ./target
```

### Combining Multiple Dictionaries

Some fuzzers support multiple dictionary files:

```bash
# AFL++ with multiple dictionaries
afl-fuzz -x keywords.dict -x formats.dict -i in -o out -- ./target
```

## Anti-Patterns

| Anti-Pattern | Problem | Correct Approach |
|--------------|---------|------------------|
| Including full sentences | Fuzzer needs atomic tokens, not prose | Break into individual keywords |
| Duplicating entries | Wastes mutation budget | Use `sort -u` to deduplicate |
| Over-sized dictionaries | Slows fuzzer, dilutes useful tokens | Keep focused: 50-200 most relevant entries |
| Missing hex escapes | Non-printable bytes become mangled | Use `\xXX` for binary values |
| No comments | Hard to maintain and audit | Document sections with `#` comments |

## Tool-Specific Guidance

### libFuzzer

```bash
clang++ -fsanitize=fuzzer,address harne
agentic-actions-auditorSkill

Audits GitHub Actions workflows for security vulnerabilities in AI agent integrations including Claude Code Action, Gemini CLI, OpenAI Codex, and GitHub AI Inference. Detects attack vectors where attacker-controlled input reaches AI agents running in CI/CD pipelines, including env var intermediary patterns, direct expression injection, dangerous sandbox configurations, and wildcard user allowlists. Use when reviewing workflow files that invoke AI coding agents, auditing CI/CD pipeline security for prompt injection risks, or evaluating agentic action configurations.

ask-questions-if-underspecifiedSkill

Clarify requirements before implementing. Use when serious doubts arise.

audit-context-buildingSkill

Enables ultra-granular, line-by-line code analysis to build deep architectural context before vulnerability or bug finding.

algorand-vulnerability-scannerSkill

Scans Algorand smart contracts for 11 common vulnerabilities including rekeying attacks, unchecked transaction fees, missing field validations, and access control issues. Use when auditing Algorand projects (TEAL/PyTeal).

audit-prep-assistantSkill

Prepares codebases for security review using Trail of Bits' checklist. Helps set review goals, runs static analysis tools, increases test coverage, removes dead code, ensures accessibility, and generates documentation (flowcharts, user stories, inline comments).

cairo-vulnerability-scannerSkill

Scans Cairo/StarkNet smart contracts for 6 critical vulnerabilities including felt252 arithmetic overflow, L1-L2 messaging issues, address conversion problems, and signature replay. Use when auditing StarkNet projects.

code-maturity-assessorSkill

Systematic code maturity assessment using Trail of Bits' 9-category framework. Analyzes codebase for arithmetic safety, auditing practices, access controls, complexity, decentralization, documentation, MEV risks, low-level code, and testing. Produces professional scorecard with evidence-based ratings and actionable recommendations.

cosmos-vulnerability-scannerSkill

Scans Cosmos SDK blockchain modules and CosmWasm contracts for consensus-critical vulnerabilities — chain halts, fund loss, state divergence. 25 core + 16 IBC + 10 EVM + 3 CosmWasm patterns. Use when auditing custom x/ modules, reviewing IBC integrations, or assessing pre-launch chain security. Updated for SDK v0.53.x.