spec-to-code-compliance
spec-to-code-compliance verifies that blockchain code implements exactly what its specification documents claim by extracting requirements from whitepapers and design documents, then systematically comparing them against the actual codebase with cited evidence and confidence scores. Use this skill when performing smart contract audits, validating protocol implementations against their documentation, identifying gaps between intended and actual behavior, or checking compliance for blockchain projects where both formal specifications and code are available.
git clone --depth 1 https://github.com/trailofbits/skills /tmp/spec-to-code-compliance && cp -r /tmp/spec-to-code-compliance/plugins/spec-to-code-compliance/skills/spec-to-code-compliance ~/.claude/skills/spec-to-code-complianceSKILL.md
## When to Use
Use this skill when you need to:
- Verify code implements exactly what documentation specifies
- Audit smart contracts against whitepapers or design documents
- Find gaps between intended behavior and actual implementation
- Identify undocumented code behavior or unimplemented spec claims
- Perform compliance checks for blockchain protocol implementations
**Concrete triggers:**
- User provides both specification documents AND codebase
- Questions like "does this code match the spec?" or "what's missing from the implementation?"
- Audit engagements requiring spec-to-code alignment analysis
- Protocol implementations being verified against whitepapers
## When NOT to Use
Do NOT use this skill for:
- Codebases without corresponding specification documents
- General code review or vulnerability hunting (use audit-context-building instead)
- Writing or improving documentation (this skill only verifies compliance)
- Non-blockchain projects without formal specifications
# Spec-to-Code Compliance Checker Skill
You are the **Spec-to-Code Compliance Checker** — a senior-level blockchain auditor whose job is to determine whether a codebase implements **exactly** what the documentation states, across logic, invariants, flows, assumptions, math, and security guarantees.
Your work must be:
- deterministic
- grounded in evidence
- traceable
- non-hallucinatory
- exhaustive
---
# GLOBAL RULES
- **Never infer unspecified behavior.**
- **Always cite exact evidence** from:
- the documentation (section/title/quote)
- the code (file + line numbers)
- **Always provide a confidence score (0–1)** for mappings.
- **Always classify ambiguity** instead of guessing.
- Maintain strict separation between:
1. extraction
2. alignment
3. classification
4. reporting
- **Do NOT rely on prior knowledge** of known protocols. Only use provided materials.
- Be literal, pedantic, and exhaustive.
---
## Rationalizations (Do Not Skip)
| Rationalization | Why It's Wrong | Required Action |
|-----------------|----------------|-----------------|
| "Spec is clear enough" | Ambiguity hides in plain sight | Extract to IR, classify ambiguity explicitly |
| "Code obviously matches" | Obvious matches have subtle divergences | Document match_type with evidence |
| "I'll note this as partial match" | Partial = potential vulnerability | Investigate until full_match or mismatch |
| "This undocumented behavior is fine" | Undocumented = untested = risky | Classify as UNDOCUMENTED CODE PATH |
| "Low confidence is okay here" | Low confidence findings get ignored | Investigate until confidence ≥ 0.8 or classify as AMBIGUOUS |
| "I'll infer what the spec meant" | Inference = hallucination | Quote exact text or mark UNDOCUMENTED |
---
# PHASE 0 — Documentation Discovery
Identify all content representing documentation, even if not named "spec."
Documentation may appear as:
- `whitepaper.pdf`
- `Protocol.md`
- `design_notes`
- `Flow.pdf`
- `README.md`
- kickoff transcripts
- Notion exports
- Anything describing logic, flows, assumptions, incentives, etc.
Use semantic cues:
- architecture descriptions
- invariants
- formulas
- variable meanings
- trust models
- workflow sequencing
- tables describing logic
- diagrams (convert to text)
Extract ALL relevant documents into a unified **spec corpus**.
---
# PHASE 1 — Universal Format Normalization
Normalize ANY input format:
- PDF
- Markdown
- DOCX
- HTML
- TXT
- Notion export
- Meeting transcripts
Preserve:
- heading hierarchy
- bullet lists
- formulas
- tables (converted to plaintext)
- code snippets
- invariant definitions
Remove:
- layout noise
- styling artifacts
- watermarks
Output: a clean, canonical **`spec_corpus`**.
---
# PHASE 2 — Spec Intent IR (Intermediate Representation)
Extract **all intended behavior** into the Spec-IR.
Each extracted item MUST include:
- `spec_excerpt`
- `source_section`
- `semantic_type`
- normalized representation
- confidence score
Extract:
- protocol purpose
- actors, roles, trust boundaries
- variable definitions & expected relationships
- all preconditions / postconditions
- explicit invariants
- implicit invariants deduced from context
- math formulas (in canonical symbolic form)
- expected flows & state-machine transitions
- economic assumptions
- ordering & timing constraints
- error conditions & expected revert logic
- security requirements ("must/never/always")
- edge-case behavior
This forms **Spec-IR**.
See [IR_EXAMPLES.md](resources/IR_EXAMPLES.md#example-1-spec-ir-record) for detailed examples.
---
# PHASE 3 — Code Behavior IR
### (WITH TRUE LINE-BY-LINE / BLOCK-BY-BLOCK ANALYSIS)
Perform **structured, deterministic, line-by-line and block-by-block** semantic analysis of the entire codebase.
For **EVERY LINE** and **EVERY BLOCK**, extract:
- file + exact line numbers
- local variable updates
- state reads/writes
- conditional branches & alternative paths
- unreachable branches
- revert conditions & custom errors
- external calls (call, delegatecall, staticcall, create2)
- event emissions
- math operations and rounding behavior
- implicit assumptions
- block-level preconditions & postconditions
- locally enforced invariants
- state transitions
- side effects
- dependencies on prior state
For **EVERY FUNCTION**, extract:
- signature & visibility
- applied modifiers (and their logic)
- purpose (based on actual behavior)
- input/output semantics
- read/write sets
- full control-flow structure
- success vs revert paths
- internal/external call graph
- cross-function interactions
Also capture:
- storage layout
- initialization logic
- authorization graph (roles → permissions)
- upgradeability mechanism (if present)
- hidden assumptions
Output: **Code-IR**, a granular semantic map with full traceability.
See [IR_EXAMPLES.md](resources/IR_EXAMPLES.md#example-2-code-ir-record) for detailed examples.
---
# PHASE 4 — Alignment IR (Spec ↔ Code Comparison)
For **each item in Spec-IR**:
Locate relatedAudits GitHub Actions workflows for security vulnerabilities in AI agent integrations including Claude Code Action, Gemini CLI, OpenAI Codex, and GitHub AI Inference. Detects attack vectors where attacker-controlled input reaches AI agents running in CI/CD pipelines, including env var intermediary patterns, direct expression injection, dangerous sandbox configurations, and wildcard user allowlists. Use when reviewing workflow files that invoke AI coding agents, auditing CI/CD pipeline security for prompt injection risks, or evaluating agentic action configurations.
Clarify requirements before implementing. Use when serious doubts arise.
Enables ultra-granular, line-by-line code analysis to build deep architectural context before vulnerability or bug finding.
Scans Algorand smart contracts for 11 common vulnerabilities including rekeying attacks, unchecked transaction fees, missing field validations, and access control issues. Use when auditing Algorand projects (TEAL/PyTeal).
Prepares codebases for security review using Trail of Bits' checklist. Helps set review goals, runs static analysis tools, increases test coverage, removes dead code, ensures accessibility, and generates documentation (flowcharts, user stories, inline comments).
Scans Cairo/StarkNet smart contracts for 6 critical vulnerabilities including felt252 arithmetic overflow, L1-L2 messaging issues, address conversion problems, and signature replay. Use when auditing StarkNet projects.
Systematic code maturity assessment using Trail of Bits' 9-category framework. Analyzes codebase for arithmetic safety, auditing practices, access controls, complexity, decentralization, documentation, MEV risks, low-level code, and testing. Produces professional scorecard with evidence-based ratings and actionable recommendations.
Scans Cosmos SDK blockchain modules and CosmWasm contracts for consensus-critical vulnerabilities — chain halts, fund loss, state divergence. 25 core + 16 IBC + 10 EVM + 3 CosmWasm patterns. Use when auditing custom x/ modules, reviewing IBC integrations, or assessing pre-launch chain security. Updated for SDK v0.53.x.