ClaudeWave
Skill · 5.7k repo stars · updated yesterday

graph-evolution

Graph Evolution builds Trailmark code graphs at two source snapshots and computes a structural diff to surface security-relevant changes that text-level diffs miss, including new attack paths, complexity shifts, blast radius growth, taint propagation changes, and privilege boundary modifications. Use this skill when comparing git refs to understand structural code changes, auditing commit ranges for security evolution, detecting newly created attack paths, identifying functions with silently growing blast radius or complexity, and tracking taint propagation changes across refactors.

Install in Claude Code
git clone --depth 1 https://github.com/trailofbits/skills /tmp/graph-evolution && cp -r /tmp/graph-evolution/plugins/trailmark/skills/graph-evolution ~/.claude/skills/graph-evolution
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Graph Evolution

Builds Trailmark code graphs at two source snapshots and computes a
structural diff. Surfaces security-relevant changes that text-level
diffs miss: new attack paths, complexity shifts, blast radius growth,
taint propagation changes, and privilege boundary modifications.

## When to Use

- Comparing two git refs to understand what structurally changed
- Auditing a range of commits for security-relevant evolution
- Detecting new attack paths created by code changes
- Finding functions whose blast radius or complexity grew silently
- Identifying taint propagation changes across refactors
- Pre-release structural comparison (tag-to-tag or branch-to-branch)

## When NOT to Use

- Line-level code review (use `differential-review` for text-diff analysis)
- Single-snapshot analysis (use the `trailmark` skill directly)
- Diagram generation from a single snapshot (use the `diagramming-code` skill)
- Mutation testing triage (use the `genotoxic` skill)

## Rationalizations to Reject

| Rationalization | Why It's Wrong | Required Action |
|-----------------|----------------|-----------------|
| "We just need the structural diff, skip pre-analysis" | Without pre-analysis, you miss taint changes, blast radius growth, and privilege boundary shifts | Run `engine.preanalysis()` on both snapshots |
| "Text diff covers what changed" | Text diffs miss new attack paths, transitive complexity shifts, and subgraph membership changes | Use structural diff to complement text diff |
| "Only added nodes matter" | Removed security functions and shifted privilege boundaries are equally dangerous | Review removals and modifications, not just additions |
| "Low-severity structural changes can be ignored" | INFO-level changes (dead code removal) can mask removed security checks | Classify every change, review removals for replaced functionality |
| "One snapshot's graph is enough for comparison" | Single-snapshot analysis can't detect evolution — you need both before and after | Always build and export both graphs |
| "Tool isn't installed, I'll compare manually" | Manual comparison misses what graph analysis catches | Install trailmark first |

---

## Prerequisites

**trailmark** must be installed. If `uv run trailmark` fails, run:

```bash
uv pip install trailmark
```

**DO NOT** fall back to "manual comparison" or reading source files as a
substitute for running trailmark. The tool must be installed and used
programmatically. If installation fails, report the error.

---

## Quick Start

```bash
# Compare two git refs (e.g., tags, branches, commits)
# 1. Build graphs at each snapshot
# 2. Run pre-analysis on both
# 3. Compute structural diff
# 4. Generate report

# Step-by-step: see Workflow below
```

---

## Decision Tree

```
├─ Need to understand what each metric means?
│  └─ Read: references/evolution-metrics.md
│
├─ Need the report output format?
│  └─ Read: references/report-format.md
│
├─ Already have two graph JSON exports?
│  └─ Jump to Phase 3 (run native diff + graph_diff.py)
│
└─ Starting from two git refs?
   └─ Start at Phase 1
```

---

## Workflow

```
Graph Evolution Progress:
- [ ] Phase 1: Create snapshots (git worktrees)
- [ ] Phase 2: Build graphs + pre-analysis on both snapshots
- [ ] Phase 3: Compute structural diff
- [ ] Phase 4: Interpret diff and generate report
- [ ] Phase 5: Clean up worktrees
```

### Phase 1: Create Snapshots

Use git worktrees to get clean copies of each ref without disturbing
the working tree.

```bash
# Create temp directories for worktrees
BEFORE_DIR=$(mktemp -d)
AFTER_DIR=$(mktemp -d)

# Create worktrees (run from repo root)
git worktree add "$BEFORE_DIR" {before_ref}
git worktree add "$AFTER_DIR" {after_ref}
```

If comparing two directories instead of git refs, skip this phase and
use the directory paths directly in Phase 2.
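
The same snapshot step can also be driven from Python, which makes the cleanup listed in the workflow checklist harder to forget. The following is a minimal sketch, not part of the skill itself: the helper names `worktree_add_cmd`, `worktree_remove_cmd`, and `snapshot` are hypothetical, and the `--detach` flag is an added assumption so that a branch already checked out elsewhere can still be used as a ref.

```python
import subprocess
import tempfile
from contextlib import contextmanager


def worktree_add_cmd(path, ref):
    """Command that checks out `ref` into `path` as a detached worktree."""
    # --detach is an assumption of this sketch, not something the skill requires.
    return ["git", "worktree", "add", "--detach", path, ref]


def worktree_remove_cmd(path):
    """Command that removes the worktree at `path`."""
    return ["git", "worktree", "remove", "--force", path]


@contextmanager
def snapshot(repo_root, ref, run=subprocess.run):
    """Yield a temporary worktree directory for `ref`, removing it on exit.

    `run` is injectable so the helper can be exercised without a real repo.
    """
    path = tempfile.mkdtemp(prefix="trailmark_snapshot_")
    run(worktree_add_cmd(path, ref), cwd=repo_root, check=True)
    try:
        yield path
    finally:
        # Best-effort cleanup: do not mask an error raised inside the block.
        run(worktree_remove_cmd(path), cwd=repo_root, check=False)
```

Used as `with snapshot(".", "{before_ref}") as before_dir: ...`, the worktree is removed even if a later phase raises.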

### Phase 2: Build Graphs and Run Pre-Analysis

Build Trailmark graphs for both snapshots and run pre-analysis on each.
Pre-analysis computes blast radius, taint propagation, privilege
boundaries, and entrypoint enumeration.

```python
import os
import tempfile

from trailmark.query.api import QueryEngine


def build_and_export(target_dir, output_path, language="auto"):
    """Build graph, run pre-analysis, export JSON."""
    engine = QueryEngine.from_directory(target_dir, language=language)
    # Run pre-analysis before exporting the graph.
    engine.preanalysis()
    json_str = engine.to_json()
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(json_str)
    return engine.summary()


# Scratch directory shared with Phase 3.
work_dir = tempfile.mkdtemp(prefix="trailmark_evolution_")
before_json = os.path.join(work_dir, "before_graph.json")
after_json = os.path.join(work_dir, "after_graph.json")

# Replace the placeholders with the snapshot directories from Phase 1.
before_summary = build_and_export("{before_dir}", before_json)
after_summary = build_and_export("{after_dir}", after_json)
```

Verify both graphs built successfully by checking the summary output.
If either fails, rerun with an explicit language or comma-separated list
instead of `auto`.
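
The retry described above can be sketched as a small wrapper. This is an illustration under assumptions: `build_with_fallback` is a hypothetical helper, `build` is expected to behave like `build_and_export` from the block above, an exception or an empty summary is treated as failure, and the language strings in the usage line are placeholders rather than values confirmed by Trailmark.

```python
def build_with_fallback(build, target_dir, output_path, languages=("auto",)):
    """Try each language setting in order and return (language, summary).

    `build` is any callable with the signature
    build(target_dir, output_path, language=...), such as build_and_export.
    """
    errors = []
    for language in languages:
        try:
            summary = build(target_dir, output_path, language=language)
        except Exception as exc:  # broad on purpose: failure modes are not specified here
            errors.append(f"{language}: {exc}")
            continue
        if summary:
            return language, summary
        errors.append(f"{language}: empty summary")
    # Nothing worked: surface every attempt so the error can be reported.
    raise RuntimeError("graph build failed: " + "; ".join(errors))
```

A call might look like `build_with_fallback(build_and_export, "{before_dir}", before_json, languages=("auto", "python"))`. If every attempt fails, the raised error lists each one, which fits the instruction to report failures rather than fall back to manual comparison.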

### Phase 3: Compute Structural Diff

Run **both**:

1. Trailmark's native structural diff for nodes, edges, and entrypoints
2. The plugin's `graph_diff.py` helper for subgraph membership changes

Using the same `work_dir` from Phase 2:

```bash
trailmark diff --json "{before_dir}" "{after_dir}" > "{work_dir}/trailmark_diff.json" || \
  uv run trailmark diff --json "{before_dir}" "{after_dir}" > "{work_dir}/trailmark_diff.json"

uv run {baseDir}/scripts/graph_diff.py \
    --before "{before_json}" \
    --after "{after_json}" > "{work_dir}/subgraph_diff.json"
```

If either diff command fails or writes an empty JSON file, stop and report the
error instead of continuing to Phase 4.
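
One way to enforce that stop condition is to validate each output file before moving on. The sketch below uses only the standard library; `load_nonempty_json` is a hypothetical helper name, and it checks only that the file exists, is non-empty, and parses as JSON, not that it matches any particular Trailmark schema.

```python
import json
import os


def load_nonempty_json(path):
    """Return the parsed JSON at `path`, or raise ValueError describing the problem."""
    if not os.path.isfile(path):
        raise ValueError(f"{path}: diff output is missing")
    if os.path.getsize(path) == 0:
        raise ValueError(f"{path}: diff output is empty")
    with open(path, encoding="utf-8") as f:
        try:
            return json.load(f)
        except json.JSONDecodeError as exc:
            # Covers whitespace-only files and truncated or non-JSON output.
            raise ValueError(f"{path}: not valid JSON ({exc})") from exc
```

Calling it on both `trailmark_diff.json` and `subgraph_diff.json` before Phase 4 turns a silent empty redirect into an explicit error that can be reported.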

The native Trailmark diff contains:

| Key | Contents |
|-----|----------|
| `summary_delta` | Changes in node/edge/entrypoint counts |
| `nodes.added` | New functions, classes, methods |
| `nodes.removed` | Deleted functions, classes, methods |
| `nodes.modified` | Functions with changed CC, params, line span |
| `edges.added` | New call/inheritan