review-llm-artifacts
Detects common LLM coding agent artifacts across four categories (tests, dead code, abstraction, style) over the project or changed files — using parallel subagents when the agent supports them, otherwise four sequential passes. Scans files changed since main by default; use --all for full-project scan. Triggers on LLM cruft cleanup, agent-generated code review, dead code sweeps, test-quality passes, or when the user asks to scan the whole repo.
git clone --depth 1 https://github.com/existential-birds/beagle /tmp/review-llm-artifacts && cp -r /tmp/review-llm-artifacts/plugins/beagle-core/skills/review-llm-artifacts ~/.claude/skills/review-llm-artifacts
# LLM Artifacts Review
Detect common artifacts left behind by LLM coding agents: over-abstraction, dead code, DRY violations in tests, verbose comments, and defensive overkill.
## Hard gates (sequence)
Advance only when each **pass condition** is objectively true. This prevents declaring “review complete” before the report outputs exist:
| Gate | Pass condition |
|------|----------------|
| **G1 — Scope** | File list is non-empty *or* you exit with exactly the Step 1 message; `scope` is set to `all` or `changed`. |
| **G2 — Four categories** | Tests, dead code, abstraction, and style are each reviewed (four parallel subagent runs when supported, or four sequential passes covering the same categories). **Stop** if any category did not complete; do not write JSON or a summary that implies a full pass. |
| **G3 — JSON before summary** | `.beagle/llm-artifacts-review.json` exists and is valid JSON **before** Step 6 markdown. |
| **G4 — Integrity** | Step 7 checks pass before treating the run as complete. |
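As a rough illustration of gate G3, the check below succeeds only when the report file exists and parses as JSON. The function name `report_is_valid_json` is hypothetical, and the use of `python3 -m json.tool` as the validator is an assumption; the skill itself does not prescribe a tool.

```bash
# Hypothetical helper: succeed only if the report exists and is valid JSON.
# Assumes python3 is on PATH (an assumption, not a requirement of the skill).
report_is_valid_json() {
  report=${1:-.beagle/llm-artifacts-review.json}
  if [ ! -s "$report" ]; then
    echo "gate G3 failed: $report is missing or empty" >&2
    return 1
  fi
  # json.tool exits non-zero when the input does not parse.
  python3 -m json.tool "$report" >/dev/null 2>&1
}
```

Calling `report_is_valid_json` with no argument checks the default report path; a non-zero status means the Step 6 markdown should not be written yet.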
## Arguments
Parse `$ARGUMENTS` for flags and optional path:
| Flag | Effect |
|------|--------|
| *(default)* | **Changed-files scope** — only files changed since `git merge-base HEAD main` (PR-style scope) |
| `--all` | Full project scan — all matching source files under the target path |
| `--parallel` | Force parallel execution where subagents are supported (default when 4+ files in scope) |
| Path | Root directory to scan (default: current working directory) |
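One possible way to parse these arguments is sketched below. The function name `parse_args` and the variables `SCOPE`, `PARALLEL`, and `TARGET` are hypothetical, and rejecting unknown flags is an assumption; the table above does not say how they should be handled.

```bash
# Hypothetical helper: parse flags and an optional path into three variables.
parse_args() {
  SCOPE="changed"   # default: changed-files scope
  PARALLEL="auto"   # auto: parallel only when 4+ files are in scope
  TARGET="."        # default: current working directory
  for arg in "$@"; do
    case "$arg" in
      --all)      SCOPE="all" ;;
      --parallel) PARALLEL="force" ;;
      -*)         echo "error: unknown flag: $arg" >&2; return 2 ;;  # assumption
      *)          TARGET="$arg" ;;
    esac
  done
}

# Word splitting on $ARGUMENTS is deliberate; paths containing spaces
# would need different handling.
parse_args ${ARGUMENTS:-}
echo "scope=$SCOPE parallel=$PARALLEL target=$TARGET"
```

With `ARGUMENTS="--all src"` this sets `SCOPE=all` and `TARGET=src` and leaves `PARALLEL=auto`.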
## Step 1: Determine Scope
**A. Changed files only (default):**
Resolve the base ref explicitly and fail loudly if none exists — **do not** wrap the `git merge-base` call in `|| true`, which would silently swallow a missing `main`/`master` ref and report "no files to scan" on repos that only have `origin/main` or use `master`. If no base ref is found, suggest the user pass `--all` instead of silently falling back.
```bash
# Pick the first base ref that exists, locally or as a remote-tracking branch.
BASE=$(for ref in main origin/main master origin/master; do
  git rev-parse --verify "$ref" >/dev/null 2>&1 && { echo "$ref"; break; }
done)
if [ -z "$BASE" ]; then
  echo "error: no main/master ref found (checked main, origin/main, master, origin/master). Pass --all for a full-project scan." >&2
  exit 1
fi
# Diff against the merge base so only this branch's changes are in scope.
MERGE_BASE=$(git merge-base HEAD "$BASE") || {
  echo "error: git merge-base HEAD $BASE failed." >&2
  exit 1
}
git diff --name-only "$MERGE_BASE..HEAD" | grep -E '\.(py|ts|tsx|js|jsx|go|rs|java|rb|swift|kt)$' || true
```
(The trailing `|| true` on the `grep` is intentional — zero source-file matches is a legitimate empty-scope result, distinct from a failed base-ref resolution.)
**B. Full project (`--all`):**
From `TARGET` (default `.`), list source files and **prune** excluded dependency/build trees so `find` never descends into them. `! -path "*/foo/*"` only filters the output; `find` still walks the tree (minutes of wasted I/O on large `node_modules`, `target`, etc.). Use `-prune` instead:
```bash
find "$TARGET" \
  \( -type d \( \
    -name node_modules -o -name .git -o -name vendor -o -name __pycache__ \
    -o -name .venv -o -name venv -o -name dist -o -name build \
    -o -name target -o -name .next -o -name coverage -o -name .turbo \
  \) -prune \) -o \
  \( -type f \( \
    -name "*.py" -o -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \
    -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" \
    -o -name "*.swift" -o -name "*.kt" \
  \) -print \)
```
**Large repos:** The `--all` path can produce huge file lists. If file count exceeds **400**, warn and suggest narrowing: pass a subdirectory as `TARGET`, or drop `--all` to fall back to the default changed-files scope. Still proceed unless the user explicitly cancels. (This warning does **not** fire on the default changed-files scope, which is already bounded by the PR diff.)
If no files are found, exit with:
`No files to scan. Check the path, branch, or pass --all for a full-project scan.`
Set `scope` in the report: `"all"` for `--all`, `"changed"` for the default changed-files scope.
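The empty-scope exit and the large-repo warning could be combined roughly as follows. The function name `check_scope` is hypothetical, and returning status 1 on an empty list is an assumption, since the step only specifies the message.

```bash
# Hypothetical helper: apply the Step 1 exit and large-repo rules.
# $1 = scope ("all" or "changed"), $2 = newline-separated file list.
check_scope() {
  scope=$1
  files=$2
  if [ -z "$files" ]; then
    echo "No files to scan. Check the path, branch, or pass --all for a full-project scan."
    return 1  # assumption: a non-zero status signals the early exit
  fi
  count=$(printf '%s\n' "$files" | wc -l | tr -d ' ')
  # The warning applies to --all only; the changed-files scope is already bounded.
  if [ "$scope" = "all" ] && [ "$count" -gt 400 ]; then
    echo "warning: $count files in scope; consider a subdirectory TARGET or dropping --all." >&2
  fi
  echo "scope=$scope files=$count"
}
```

The warning goes to stderr and does not change the return status, which matches the rule that the scan still proceeds unless the user cancels.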
## Step 2: Detect Languages
Extract unique file extensions from the file list:
```bash
echo "$FILES" | sed 's/.*\.//' | sort -u
```
Map extensions to language names for the report:
- `.py` -> Python
- `.ts`, `.tsx` -> TypeScript
- `.js`, `.jsx` -> JavaScript
- `.go` -> Go
- `.rs` -> Rust
- `.java` -> Java
- `.rb` -> Ruby
- `.swift` -> Swift
- `.kt` -> Kotlin
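The mapping above can be expressed as a small `case` lookup. Both function names (`lang_for_ext`, `detect_languages`) are hypothetical; note that the `sed` expression from this step yields extensions without the leading dot, so the lookup matches on `py` rather than `.py`.

```bash
# Hypothetical helper: map one extension (no leading dot) to a language name.
lang_for_ext() {
  case "$1" in
    py)      echo "Python" ;;
    ts|tsx)  echo "TypeScript" ;;
    js|jsx)  echo "JavaScript" ;;
    go)      echo "Go" ;;
    rs)      echo "Rust" ;;
    java)    echo "Java" ;;
    rb)      echo "Ruby" ;;
    swift)   echo "Swift" ;;
    kt)      echo "Kotlin" ;;
    *)       return 1 ;;  # not a scanned source type
  esac
}

# Hypothetical helper: unique language names for a newline-separated file list.
detect_languages() {
  printf '%s\n' "$1" | sed 's/.*\.//' | sort -u | while read -r ext; do
    lang_for_ext "$ext" || true  # skip unknown extensions
  done | sort -u
}
```

The second `sort -u` collapses pairs such as `ts` and `tsx` into a single `TypeScript` entry.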
## Step 3: Review the Four Categories
Cover all four categories below. **If the agent supports subagents** and file count >= 4 (or `--parallel` is set), dispatch one subagent per category in parallel. **Otherwise**, run the four category reviews sequentially yourself, producing the same findings. Either way:
1. Load the [llm-artifacts-detection](../llm-artifacts-detection/SKILL.md) skill
2. Review each category (one per subagent when parallel, one pass at a time when sequential)
3. Collect findings in the structured format below
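The choice between parallel and sequential review described above reduces to a small predicate. In this sketch the function name `review_mode` is hypothetical, and whether the agent supports subagents is assumed to be known by the caller and passed in as `yes` or `no`.

```bash
# Hypothetical helper: choose the Step 3 execution mode.
# $1 = file count, $2 = "force" if --parallel was passed,
# $3 = "yes" if the agent supports subagents (assumed known to the caller).
review_mode() {
  count=$1
  parallel_flag=$2
  subagents=$3
  if [ "$subagents" = "yes" ] && { [ "$count" -ge 4 ] || [ "$parallel_flag" = "force" ]; }; then
    echo "parallel"
  else
    echo "sequential"
  fi
}
```

Either mode must still cover all four categories; the mode only changes how the work is scheduled.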
### Category 1: Tests
**Focus:** Testing anti-patterns from LLM generation
- DRY violations (repeated setup code, duplicate assertions)
- Testing library/framework code instead of application logic
- Wrong mock boundaries (mocking too much or too little)
- Overly verbose test names that describe implementation
- Tests that just mirror the implementation
### Category 2: Dead Code
**Focus:** Unused or obsolete code
- Unused imports, variables, functions, classes
- TODO/FIXME comments that should have been resolved
- Backwards compatibility code for removed features
- Orphaned test files for deleted code
- Commented-out code blocks
- Feature flags that are always on/off
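A cheap first pass for one item on this list, lingering TODO/FIXME markers, might look like the sketch below. It is only a text search and not the detection logic of the llm-artifacts-detection skill; the function name `find_todo_markers` is hypothetical.

```bash
# Hypothetical helper: print file:line:text for TODO/FIXME markers.
# Appending /dev/null forces grep to print the file name even for one file.
# grep's exit status is passed through: 1 means no markers were found.
find_todo_markers() {
  grep -nwE 'TODO|FIXME' -- "$@" /dev/null
}
```

Each hit still needs a human or agent judgment on whether the marker should have been resolved, so the output is a list of candidates rather than findings.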
### Category 3: Abstraction
**Focus:** Over-engineering patterns
- Unnecessary abstraction layers (interfaces for single implementations)
- Copy-paste drift (similar code that diverged slightly)
- Over-configuration (configurable things that never change)
- Premature generalization
- Factory/Builder patterns for simple object creation
- Deep inheritance