ClaudeWave
Skill · 5.7k repo stars · updated yesterday

audit-augmentation

Audit Augmentation imports security findings from external static analysis tools like Semgrep and CodeQL (in SARIF format) or human audit annotations (weAudit format) into Trailmark code graphs as annotations and subgraphs. Use this skill to overlay vulnerability findings onto code structure, cross-reference findings with taint analysis and blast radius data, and query which functions contain high-severity issues.

Install in Claude Code
git clone --depth 1 https://github.com/trailofbits/skills /tmp/audit-augmentation && cp -r /tmp/audit-augmentation/plugins/trailmark/skills/audit-augmentation ~/.claude/skills/audit-augmentation
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Audit Augmentation

Projects findings from external tools (SARIF) and human auditors (weAudit)
onto Trailmark code graphs as annotations and subgraphs.

## When to Use

- Importing Semgrep, CodeQL, or other SARIF-producing tool results into a graph
- Importing weAudit audit annotations into a graph
- Cross-referencing static analysis findings with blast radius or taint data
- Querying which functions have high-severity findings
- Visualizing audit coverage alongside code structure

## When NOT to Use

- Running static analysis tools (use semgrep/codeql directly, then import)
- Building the code graph itself (use the `trailmark` skill)
- Generating diagrams (use the `diagramming-code` skill after augmenting)

## Rationalizations to Reject

| Rationalization | Why It's Wrong | Required Action |
|-----------------|----------------|-----------------|
| "The user only asked about SARIF, skip pre-analysis" | Without pre-analysis, you can't cross-reference findings with blast radius or taint | Always run `engine.preanalysis()` before augmenting |
| "Unmatched findings don't matter" | Unmatched findings may indicate parsing gaps or out-of-scope files | Report unmatched count and investigate if high |
| "One severity subgraph is enough" | Different severities need different triage workflows | Query all severity subgraphs, not just `error` |
| "SARIF results speak for themselves" | Findings without graph context lack blast radius and taint reachability | Cross-reference with pre-analysis subgraphs |
| "weAudit and SARIF overlap, pick one" | Human auditors and tools find different things | Import both when available |
| "Tool isn't installed, I'll do it manually" | Manual analysis misses what tooling catches | Install trailmark first |

---

## Installation

**MANDATORY:** If `uv run trailmark` fails, install trailmark first:

```bash
uv pip install trailmark
```

## Quick Start

### CLI

```bash
# Augment with SARIF
uv run trailmark augment {targetDir} --sarif results.sarif

# Augment with weAudit
uv run trailmark augment {targetDir} --weaudit .vscode/alice.weaudit

# Both at once, output JSON
uv run trailmark augment {targetDir} \
    --sarif results.sarif \
    --weaudit .vscode/alice.weaudit \
    --json
```

### Programmatic API

```python
from trailmark.query.api import QueryEngine

engine = QueryEngine.from_directory("{targetDir}", language="auto")

# Run pre-analysis first for cross-referencing
engine.preanalysis()

# Augment with SARIF
result = engine.augment_sarif("results.sarif")
# result: {matched_findings: 12, unmatched_findings: 3, subgraphs_created: [...]}

# Augment with weAudit
result = engine.augment_weaudit(".vscode/alice.weaudit")

# Query findings
engine.findings()                                       # All findings
engine.subgraph("sarif:error")                          # High-severity SARIF
engine.subgraph("weaudit:high")                         # High-severity weAudit
engine.subgraph("sarif:semgrep")                        # By tool name
engine.annotations_of("function_name")                  # Per-node lookup
```

If auto-detection picks the wrong language for the target, rerun with an
explicit `language` value: either a single language or a comma-separated list
such as `python,rust`.

## Workflow

```
Augmentation Progress:
- [ ] Step 1: Build graph and run pre-analysis
- [ ] Step 2: Locate SARIF/weAudit files
- [ ] Step 3: Run augmentation
- [ ] Step 4: Inspect results and subgraphs
- [ ] Step 5: Cross-reference with pre-analysis
```

**Step 1:** Build the graph and run pre-analysis for blast radius and taint
context:

```python
from trailmark.query.api import QueryEngine

engine = QueryEngine.from_directory("{targetDir}", language="auto")
engine.preanalysis()
```

If auto-detection picks the wrong language for the target, rerun with an
explicit `language` value: either a single language or a comma-separated list
such as `python,rust`.

**Step 2:** Locate input files:
- **SARIF**: Usually output by tools like `semgrep --sarif -o results.sarif`
  or `codeql database analyze --format=sarif-latest`
- **weAudit**: Stored in `.vscode/<username>.weaudit` within the workspace
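A minimal sketch of locating both kinds of input files with the standard library. The helper name `locate_audit_inputs` is hypothetical and not part of Trailmark, and it assumes SARIF files use the conventional `.sarif` suffix, which tools do not enforce:

```python
from pathlib import Path


def locate_audit_inputs(workspace: str) -> dict[str, list[Path]]:
    """Collect candidate SARIF and weAudit files under a workspace.

    Hypothetical helper: assumes SARIF output was saved with a .sarif suffix.
    """
    root = Path(workspace)
    # SARIF output can be written anywhere, so search the whole tree.
    sarif_files = sorted(root.rglob("*.sarif"))
    # weAudit keeps one file per user at .vscode/<username>.weaudit.
    vscode_dir = root / ".vscode"
    weaudit_files = sorted(vscode_dir.glob("*.weaudit")) if vscode_dir.is_dir() else []
    return {"sarif": sarif_files, "weaudit": weaudit_files}
```

Each returned path can then be passed to `engine.augment_sarif()` or `engine.augment_weaudit()` as appropriate.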

**Step 3:** Run augmentation via `engine.augment_sarif()` or
`engine.augment_weaudit()`. Check `unmatched_findings` in the result — these
are findings whose file/line locations didn't overlap any parsed code unit.
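One way to act on the unmatched count is to turn it into a ratio and flag runs that exceed a threshold. This sketch assumes the result behaves like a dict with the `matched_findings` and `unmatched_findings` keys shown in the Quick Start comment. The function names and the 20% threshold are illustrative assumptions, not Trailmark defaults:

```python
def unmatched_ratio(result: dict) -> float:
    """Fraction of imported findings that did not land on any graph node."""
    matched = result.get("matched_findings", 0)
    unmatched = result.get("unmatched_findings", 0)
    total = matched + unmatched
    # An empty import has nothing unmatched, so report 0.0 rather than divide by zero.
    return unmatched / total if total else 0.0


def needs_investigation(result: dict, threshold: float = 0.2) -> bool:
    """Flag a result whose unmatched share is above the (assumed) threshold."""
    return unmatched_ratio(result) > threshold
```

A high ratio usually points at a path-normalization problem or at findings in files outside the parsed target, so it is worth checking before trusting the subgraphs.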

**Step 4:** Query findings and subgraphs. Use `engine.findings()` to list all
annotated nodes. Use `engine.subgraph_names()` to see available subgraphs.

**Step 5:** Cross-reference with pre-analysis data to prioritize:
- Findings on tainted nodes: overlap `sarif:error` with `tainted` subgraph
- Findings on high blast radius nodes: overlap with `high_blast_radius`
- Findings on privilege boundaries: overlap with `privilege_boundary`
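The overlaps above amount to set intersections over node identifiers. The sketch below assumes each subgraph has already been reduced to a set of node ID strings; how to extract IDs from the value returned by `engine.subgraph()` is not shown here, and `rank_by_overlap` is a hypothetical helper:

```python
def rank_by_overlap(
    finding_nodes: set[str],
    context_subgraphs: dict[str, set[str]],
) -> list[tuple[str, list[str]]]:
    """Order nodes with findings by how many pre-analysis subgraphs contain them.

    finding_nodes: node IDs from a findings subgraph such as `sarif:error`.
    context_subgraphs: subgraph name -> node IDs, e.g. `tainted`.
    """
    ranked = []
    for node in finding_nodes:
        # Record which context subgraphs this node also belongs to.
        hits = sorted(name for name, nodes in context_subgraphs.items() if node in nodes)
        ranked.append((node, hits))
    # Most overlaps first; break ties alphabetically for stable output.
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked
```

Nodes that appear in `tainted`, `high_blast_radius`, and `privilege_boundary` at once sort to the top, which gives a simple triage order.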

## Annotation Format

Findings are stored as standard Trailmark annotations:

- **Kind**: `finding` (tool-generated) or `audit_note` (human notes)
- **Source**: `sarif:<tool_name>` or `weaudit:<author>`
- **Description**: Compact single-line:
  `[SEVERITY] rule-id: message (tool)`
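As an illustration of the description layout, the following sketch builds the single-line string from its four parts. Upper-casing the severity and collapsing whitespace are assumptions about how the line is kept compact, and `describe_finding` is a hypothetical name rather than a Trailmark function:

```python
def describe_finding(severity: str, rule_id: str, message: str, tool: str) -> str:
    """Render a finding as `[SEVERITY] rule-id: message (tool)` on one line."""
    # Collapse newlines and repeated spaces so the description stays single-line.
    one_line = " ".join(message.split())
    return f"[{severity.upper()}] {rule_id}: {one_line} ({tool})"
```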

## Subgraphs Created

| Subgraph | Contents |
|----------|----------|
| `sarif:error` | Nodes with SARIF error-level findings |
| `sarif:warning` | Nodes with SARIF warning-level findings |
| `sarif:note` | Nodes with SARIF note-level findings |
| `sarif:<tool>` | Nodes flagged by a specific tool |
| `weaudit:high` | Nodes with high-severity weAudit findings |
| `weaudit:medium` | Nodes with medium-severity weAudit findings |
| `weaudit:low` | Nodes with low-severity weAudit findings |
| `weaudit:findings` | All weAudit findings (entryType=0) |
| `weaudit:notes` | All weAudit notes (entryType=1) |
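To predict which subgraphs an import is likely to create, the names can be derived from the input itself. This sketch reads the standard SARIF fields `runs[].tool.driver.name` and `results[].level`. Treating a missing level as `warning` and lower-casing the tool name are assumptions about Trailmark's behavior, and both helper names are hypothetical:

```python
SARIF_LEVELS = {"error", "warning", "note"}
WEAUDIT_ENTRY_KINDS = {0: "weaudit:findings", 1: "weaudit:notes"}


def sarif_subgraph_names(sarif_log: dict) -> set[str]:
    """Derive the `sarif:*` subgraph names a SARIF log would populate."""
    names: set[str] = set()
    for run in sarif_log.get("runs") or []:
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results") or []:
            # Every result contributes to its tool's subgraph.
            names.add(f"sarif:{tool.lower()}")
            # Assumption: a result without an explicit level counts as a warning.
            level = result.get("level", "warning")
            if level in SARIF_LEVELS:
                names.add(f"sarif:{level}")
    return names


def weaudit_kind_subgraph(entry_type: int) -> str:
    """Map a weAudit entryType (0 = finding, 1 = note) to its subgraph name."""
    return WEAUDIT_ENTRY_KINDS[entry_type]
```

Comparing this prediction with `engine.subgraph_names()` after augmenting is a quick sanity check that the import covered every severity present in the file.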

## How Matching Works

Findings are matched to graph nodes by file path and line range overlap:

1. Finding file path is normalized relative to the graph's `root_path`
2. Nodes whose `location.file_path` matches AND whose line range overlaps are
   selected
3. The tightest match (smallest span) is preferred
4. If a finding's location doesn't overlap any node, it counts as unmatched
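The four steps above can be sketched in plain Python. This is an illustration of the matching idea under stated assumptions, not Trailmark's implementation: the `Node` dataclass, `normalize_path`, and `match_finding` are hypothetical names, paths are treated as POSIX-style, and line ranges are inclusive:

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import unquote, urlparse


@dataclass(frozen=True)
class Node:
    node_id: str
    file_path: str  # stored relative to the graph's root_path
    start_line: int
    end_line: int


def normalize_path(raw: str, root_path: str) -> str:
    """Step 1: reduce a finding's path to a form relative to root_path."""
    if raw.startswith("file://"):
        raw = unquote(urlparse(raw).path)
    path = PurePosixPath(raw)
    if path.is_absolute():
        try:
            path = path.relative_to(PurePosixPath(root_path))
        except ValueError:
            pass  # outside the root: leave as-is so it simply fails to match
    return str(path)


def match_finding(
    nodes: list[Node], file_path: str, start_line: int, end_line: int, root_path: str
) -> Optional[Node]:
    """Steps 2-4: pick the smallest node overlapping the finding, or None."""
    target = normalize_path(file_path, root_path)
    candidates = [
        n for n in nodes
        if n.file_path == target
        and n.start_line <= end_line   # inclusive range overlap test
        and start_line <= n.end_line
    ]
    if not candidates:
        return None  # counted as unmatched
    # Tightest match: the node spanning the fewest lines.
    return min(candidates, key=lambda n: n.end_line - n.start_line)
```

Preferring the smallest span means a finding inside a method attaches to the method rather than to its enclosing class.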

SARIF paths may be relative, absolute, or `file://` UR