audit-augmentation
Audit Augmentation imports security findings from external static analysis tools like Semgrep and CodeQL (in SARIF format) or human audit annotations (weAudit format) into Trailmark code graphs as annotations and subgraphs. Use this skill to overlay vulnerability findings onto code structure, cross-reference findings with taint analysis and blast radius data, and query which functions contain high-severity issues.
git clone --depth 1 https://github.com/trailofbits/skills /tmp/audit-augmentation && cp -r /tmp/audit-augmentation/plugins/trailmark/skills/audit-augmentation ~/.claude/skills/audit-augmentationSKILL.md
# Audit Augmentation
Projects findings from external tools (SARIF) and human auditors (weAudit)
onto Trailmark code graphs as annotations and subgraphs.
## When to Use
- Importing Semgrep, CodeQL, or other SARIF-producing tool results into a graph
- Importing weAudit audit annotations into a graph
- Cross-referencing static analysis findings with blast radius or taint data
- Querying which functions have high-severity findings
- Visualizing audit coverage alongside code structure
## When NOT to Use
- Running static analysis tools (use semgrep/codeql directly, then import)
- Building the code graph itself (use the `trailmark` skill)
- Generating diagrams (use the `diagramming-code` skill after augmenting)
## Rationalizations to Reject
| Rationalization | Why It's Wrong | Required Action |
|-----------------|----------------|-----------------|
| "The user only asked about SARIF, skip pre-analysis" | Without pre-analysis, you can't cross-reference findings with blast radius or taint | Always run `engine.preanalysis()` before augmenting |
| "Unmatched findings don't matter" | Unmatched findings may indicate parsing gaps or out-of-scope files | Report unmatched count and investigate if high |
| "One severity subgraph is enough" | Different severities need different triage workflows | Query all severity subgraphs, not just `error` |
| "SARIF results speak for themselves" | Findings without graph context lack blast radius and taint reachability | Cross-reference with pre-analysis subgraphs |
| "weAudit and SARIF overlap, pick one" | Human auditors and tools find different things | Import both when available |
| "Tool isn't installed, I'll do it manually" | Manual analysis misses what tooling catches | Install trailmark first |
---
## Installation
**MANDATORY:** If `uv run trailmark` fails, install trailmark first:
```bash
uv pip install trailmark
```
## Quick Start
### CLI
```bash
# Augment with SARIF
uv run trailmark augment {targetDir} --sarif results.sarif
# Augment with weAudit
uv run trailmark augment {targetDir} --weaudit .vscode/alice.weaudit
# Both at once, output JSON
uv run trailmark augment {targetDir} \
--sarif results.sarif \
--weaudit .vscode/alice.weaudit \
--json
```
### Programmatic API
```python
from trailmark.query.api import QueryEngine
engine = QueryEngine.from_directory("{targetDir}", language="auto")
# Run pre-analysis first for cross-referencing
engine.preanalysis()
# Augment with SARIF
result = engine.augment_sarif("results.sarif")
# result: {matched_findings: 12, unmatched_findings: 3, subgraphs_created: [...]}
# Augment with weAudit
result = engine.augment_weaudit(".vscode/alice.weaudit")
# Query findings
engine.findings() # All findings
engine.subgraph("sarif:error") # High-severity SARIF
engine.subgraph("weaudit:high") # High-severity weAudit
engine.subgraph("sarif:semgrep") # By tool name
engine.annotations_of("function_name") # Per-node lookup
```
If auto-detection is wrong for the target, rerun with an explicit language or
comma-separated list such as `python,rust`.
## Workflow
```
Augmentation Progress:
- [ ] Step 1: Build graph and run pre-analysis
- [ ] Step 2: Locate SARIF/weAudit files
- [ ] Step 3: Run augmentation
- [ ] Step 4: Inspect results and subgraphs
- [ ] Step 5: Cross-reference with pre-analysis
```
**Step 1:** Build the graph and run pre-analysis for blast radius and taint
context:
```python
engine = QueryEngine.from_directory("{targetDir}", language="auto")
engine.preanalysis()
```
If auto-detection is wrong for the target, rerun with an explicit language or
comma-separated list such as `python,rust`.
**Step 2:** Locate input files:
- **SARIF**: Usually output by tools like `semgrep --sarif -o results.sarif`
or `codeql database analyze --format=sarif-latest`
- **weAudit**: Stored in `.vscode/<username>.weaudit` within the workspace
**Step 3:** Run augmentation via `engine.augment_sarif()` or
`engine.augment_weaudit()`. Check `unmatched_findings` in the result — these
are findings whose file/line locations didn't overlap any parsed code unit.
**Step 4:** Query findings and subgraphs. Use `engine.findings()` to list all
annotated nodes. Use `engine.subgraph_names()` to see available subgraphs.
**Step 5:** Cross-reference with pre-analysis data to prioritize:
- Findings on tainted nodes: overlap `sarif:error` with `tainted` subgraph
- Findings on high blast radius nodes: overlap with `high_blast_radius`
- Findings on privilege boundaries: overlap with `privilege_boundary`
## Annotation Format
Findings are stored as standard Trailmark annotations:
- **Kind**: `finding` (tool-generated) or `audit_note` (human notes)
- **Source**: `sarif:<tool_name>` or `weaudit:<author>`
- **Description**: Compact single-line:
`[SEVERITY] rule-id: message (tool)`
## Subgraphs Created
| Subgraph | Contents |
|----------|----------|
| `sarif:error` | Nodes with SARIF error-level findings |
| `sarif:warning` | Nodes with SARIF warning-level findings |
| `sarif:note` | Nodes with SARIF note-level findings |
| `sarif:<tool>` | Nodes flagged by a specific tool |
| `weaudit:high` | Nodes with high-severity weAudit findings |
| `weaudit:medium` | Nodes with medium-severity weAudit findings |
| `weaudit:low` | Nodes with low-severity weAudit findings |
| `weaudit:findings` | All weAudit findings (entryType=0) |
| `weaudit:notes` | All weAudit notes (entryType=1) |
## How Matching Works
Findings are matched to graph nodes by file path and line range overlap:
1. Finding file path is normalized relative to the graph's `root_path`
2. Nodes whose `location.file_path` matches AND whose line range overlaps are
selected
3. The tightest match (smallest span) is preferred
4. If a finding's location doesn't overlap any node, it counts as unmatched
SARIF paths may be relative, absolute, or `file://` URAudits GitHub Actions workflows for security vulnerabilities in AI agent integrations including Claude Code Action, Gemini CLI, OpenAI Codex, and GitHub AI Inference. Detects attack vectors where attacker-controlled input reaches AI agents running in CI/CD pipelines, including env var intermediary patterns, direct expression injection, dangerous sandbox configurations, and wildcard user allowlists. Use when reviewing workflow files that invoke AI coding agents, auditing CI/CD pipeline security for prompt injection risks, or evaluating agentic action configurations.
Clarify requirements before implementing. Use when serious doubts arise.
Enables ultra-granular, line-by-line code analysis to build deep architectural context before vulnerability or bug finding.
Scans Algorand smart contracts for 11 common vulnerabilities including rekeying attacks, unchecked transaction fees, missing field validations, and access control issues. Use when auditing Algorand projects (TEAL/PyTeal).
Prepares codebases for security review using Trail of Bits' checklist. Helps set review goals, runs static analysis tools, increases test coverage, removes dead code, ensures accessibility, and generates documentation (flowcharts, user stories, inline comments).
Scans Cairo/StarkNet smart contracts for 6 critical vulnerabilities including felt252 arithmetic overflow, L1-L2 messaging issues, address conversion problems, and signature replay. Use when auditing StarkNet projects.
Systematic code maturity assessment using Trail of Bits' 9-category framework. Analyzes codebase for arithmetic safety, auditing practices, access controls, complexity, decentralization, documentation, MEV risks, low-level code, and testing. Produces professional scorecard with evidence-based ratings and actionable recommendations.
Scans Cosmos SDK blockchain modules and CosmWasm contracts for consensus-critical vulnerabilities — chain halts, fund loss, state divergence. 25 core + 16 IBC + 10 EVM + 3 CosmWasm patterns. Use when auditing custom x/ modules, reviewing IBC integrations, or assessing pre-launch chain security. Updated for SDK v0.53.x.