agenttrace-session-audit
agenttrace-session-audit inspects local AI coding-agent sessions to analyze token consumption, cost spikes, tool failures, latency patterns, health scores, and anomalies across multiple agent platforms including Claude Code and Aider. Use this skill when investigating why an AI coding run was slow, expensive, or unreliable, or when building CI health gates and comparing session attempts to detect semantic drift and implementation divergence.
git clone --depth 1 https://github.com/sickn33/antigravity-awesome-skills /tmp/agenttrace-session-audit && cp -r /tmp/agenttrace-session-audit/plugins/antigravity-awesome-skills-claude/skills/agenttrace-session-audit ~/.claude/skills/agenttrace-session-audit

SKILL.md
# agenttrace Session Audit

## Overview

Use this skill to inspect local AI coding-agent sessions with [agenttrace](https://github.com/luoyuctl/agenttrace). It focuses on the process behind a run: token and cost spikes, tool failures, retry loops, latency gaps, anomalies, health scores, and session-to-session diffs. agenttrace is local-first and reads session logs from tools such as Claude Code, Codex CLI, Gemini CLI, Aider, Cursor exports, OpenCode, Qwen Code, Kimi, and generic JSON or JSONL traces.

## When to Use This Skill

- Use when a user asks why an AI coding run was slow, expensive, shallow, or unreliable.
- Use when reviewing local agent logs before retrying a failed or suspicious task.
- Use when building a lightweight CI health gate for AI-assisted coding sessions.
- Use when comparing two attempts and looking for changed tool paths, retries, or cost patterns.

## How It Works

### Step 1: Discover Available Sessions

Prefer an installed `agenttrace` binary when it is available on `PATH`. If the current repository is `luoyuctl/agenttrace`, use `go run ./cmd/agenttrace` instead.

```bash
agenttrace --doctor
agenttrace --overview
```

If no sessions are detected, report the directories checked by `--doctor` and ask for the exported session file or log directory.

### Step 2: Produce a Human-Readable Audit

Use Markdown when the user wants a concise report they can inspect or share.

```bash
agenttrace --overview -f markdown -o agenttrace-overview.md
```

In the report, lead with the highest-risk sessions and explain why they matter: critical anomalies, repeated tool failures, token or cost waste, long latency gaps, low health scores, and suspiciously shallow sessions.

### Step 3: Inspect One Session or Directory

Use the latest session for a quick check, or pass an explicit export path when the user provides one.

```bash
agenttrace --latest
agenttrace --latest -f json
agenttrace path/to/session-or-export.json
agenttrace --overview -d path/to/session-dir
```

### Step 4: Compare Attempts When Semantics Matter

Token and latency metrics can look healthy even when an agent confidently takes the wrong implementation path. When the risk is semantic drift, pair the trace audit with a diff against a previous or known-good attempt.

Look for:

- changed files or commands that diverge from the intended task
- missing tests or verification steps compared with the reference attempt
- repeated edits around the same files without a clear reason
- lower cost that came from skipping necessary exploration

### Step 5: Add Automation Gates

For CI or repeatable team workflows, use JSON output or health thresholds.

```bash
agenttrace --overview -f json -o agenttrace-overview.json
agenttrace --overview --fail-under-health 80 --fail-on-critical --max-tool-fail-rate 15
```

Tune thresholds to the project. A strict gate is useful for critical workflows; a reporting-only command is better while the team is learning its baseline.

## Examples

### Quick Local Review

```bash
agenttrace --overview
agenttrace --latest
```

Use this after a long coding-agent run to decide whether the next prompt should split the task, avoid a failing tool path, add missing tests, or reset context.

### CI Health Check

```bash
agenttrace --overview --fail-under-health 80 --fail-on-critical
```

Use this when agent session logs are available in CI and the team wants a simple guard against critical anomalies or unhealthy runs.

## Best Practices

- Start with `--doctor` when session discovery is uncertain.
- Report missing fields plainly; do not invent cost, model, latency, or health data.
- Treat prompts, code, and session contents as private local data.
- Prefer JSON output for automation and Markdown output for human review.
- Use trace metrics for process failures and diff/reference review for semantic drift.

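
The advisory-first rollout described in Step 5 could be wrapped in a small shell function. This is a minimal sketch, not part of agenttrace: it assumes that agenttrace exits with a non-zero status when a threshold is violated, and `run_health_gate` and `GATE_MODE` are hypothetical names introduced here for illustration. The flags are the ones shown in the CI Health Check example.

```bash
#!/usr/bin/env bash
# Sketch of an advisory-first health gate (hypothetical wrapper).
# Assumption: agenttrace returns a non-zero exit status when a gate fails.

run_health_gate() {
  # GATE_MODE is a hypothetical variable: "advisory" (default) or "strict".
  local mode="${GATE_MODE:-advisory}"
  local status=0

  # Thresholds copied from the CI Health Check example; tune per project.
  agenttrace --overview --fail-under-health 80 --fail-on-critical || status=$?

  if [ "$status" -ne 0 ] && [ "$mode" = "advisory" ]; then
    # Report the failure but do not block the pipeline.
    echo "agenttrace gate would have failed (exit $status); advisory mode, not blocking" >&2
    return 0
  fi

  # Strict mode (or a passing gate): propagate the original exit status.
  return "$status"
}
```

A team might call `run_health_gate` with the default advisory mode while it collects baselines, then set `GATE_MODE=strict` once the thresholds are tuned.
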
## Limitations

- agenttrace can only analyze logs that are present locally or provided as exports.
- Some agents do not expose enough fields to infer cost, model, cache use, or latency.
- Healthy trace metrics do not prove the final code is correct; still run tests and review diffs.
- CI gates should start as advisory until the team understands normal baseline behavior.

## Security & Safety Notes

- Do not upload private session logs to external services unless the user explicitly approves it.
- Do not overwrite user reports unless they requested that exact output path.
- Avoid printing secrets found in prompts, tool output, environment variables, or logs.

## Common Pitfalls

- **Problem:** No sessions are found.
  **Solution:** Run `agenttrace --doctor`, then point agenttrace at the exported file or log directory.
- **Problem:** A run looks cheap and fast but produced the wrong refactor.
  **Solution:** Compare the session against a prior attempt or known-good diff; cost metrics alone will miss semantic drift.
- **Problem:** CI fails too often after adding a health gate.
  **Solution:** Start with JSON or Markdown reporting, inspect normal baselines, then tighten thresholds gradually.

## Related Skills

- `@langfuse` - Use for production LLM application tracing and evaluation.
- `@observability-engineer` - Use for broader service monitoring, SLOs, and incident workflows.