vc-autoresearch

The vc:autoresearch skill runs an autonomous iterative loop that repeatedly modifies code within a defined scope and evaluates progress using a single numerical metric, rolling back changes if performance regresses. Use it when you need to improve a measurable outcome like test coverage, bundle size, or performance score through multiple automated experiments without manual intervention between iterations.

Repository: vibecode-pro-max-kit

Install in Claude Code

git clone --depth 1 https://github.com/withkynam/vibecode-pro-max-kit /tmp/vc-autoresearch && cp -r /tmp/vc-autoresearch/.claude/skills/vc-autoresearch ~/.claude/skills/vc-autoresearch

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# vc-autoresearch

> **Output style:** Follow `process/development-protocols/communication-standards.md` — answer-first, plain language, no unexplained jargon, TL;DR on long responses.

Reusable loop primitive. Runs: find gaps → write report → fix → check → repeat.

Used directly for spec/doc/UX hardening. Wired into PVL (plan-validate-fix loop — the fix cycle between writing a plan and approving EXECUTE) and EVL (execute-validate-fix loop — the confirmation run after EXECUTE) as the shared bookkeeping layer.

---

## When To Invoke

- **Standalone:** user says "harden this spec", "fix all lint errors", "improve test coverage"
- **PVL:** the ORCHESTRATOR invokes it when vc-validate-agent returns a first-pass CONDITIONAL/BLOCKED verdict (validate-fix loops are needed)
- **EVL:** the ORCHESTRATOR invokes it at the EVL confirmation run — unconditionally after every EXECUTE DONE, before UPDATE PROCESS

Do NOT invoke during RESEARCH or INNOVATE phases.

## Who Runs This (Loop Driver)

The **ORCHESTRATOR is the loop driver**. It executes every bookkeeping step itself:

- **Step 0 setup** — creates the task folder and the `results.tsv` tracking file.
- **Cycle counter + per-cycle iteration report** — writes one report file per loop iteration.
- **TSV row** — appends a row to `results.tsv` after each cycle.
- **Plateau/cap/regression checks** — stops the loop when no progress is made for 3 cycles (HALT_PLATEAU), when the hard 10-cycle limit is hit (HALT_CAP), or when a previously passing test now fails (HALT_REGRESSION).
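
The three halt conditions above can be sketched as one check the loop driver runs after each cycle. This is a minimal illustration, not the skill's actual code: the function name `check_halt`, the use of per-cycle gap counts as the progress signal, and the order in which the conditions are tested are all assumptions.

```python
from typing import Optional

PLATEAU_CYCLES = 3  # from the text: no progress for 3 cycles
HARD_CAP = 10       # from the text: hard 10-cycle limit


def check_halt(gaps_per_cycle: list[int], regressed: bool) -> Optional[str]:
    """Return a halt code, or None if the loop may continue.

    gaps_per_cycle: gap counts of completed cycles, oldest first
                    (assumed progress signal; lower is better).
    regressed:      True if a previously passing test now fails.
    """
    # Assumed priority: regression first, then cap, then plateau.
    if regressed:
        return "HALT_REGRESSION"
    if len(gaps_per_cycle) >= HARD_CAP:
        return "HALT_CAP"
    if len(gaps_per_cycle) > PLATEAU_CYCLES:
        best_before = min(gaps_per_cycle[:-PLATEAU_CYCLES])
        best_recent = min(gaps_per_cycle[-PLATEAU_CYCLES:])
        # Plateau: none of the last 3 cycles beat the earlier best.
        if best_recent >= best_before:
            return "HALT_PLATEAU"
    return None
```

For example, `check_halt([5, 3, 3, 3, 3], regressed=False)` reports a plateau because the last three cycles never dropped below the earlier best of 3.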

Subagents (vc-validate-agent, vc-tester, vc-plan-agent, vc-execute-agent) are fire-and-forget: they emit a verdict and terminate. They cannot invoke this skill on the orchestrator's behalf, cannot loop themselves, and cannot spawn each other.

If no one runs Step 0, the loop never exists and verdicts silently become "proceed" — that failure mode is exactly what this section forbids.

Per-verdict routing tables: `process/development-protocols/orchestration.md` §PVL/EVL Loop Routing.

---

## Subcommands

| Subcommand | Does | Stops when |
|---|---|---|
| `vc-autoresearch` (core) | find gaps → fix → repeat | agents find no gaps OR metric goal hit |
| `vc-autoresearch:probe` | 8 personas interrogate the corpus until saturation | no new constraints for 3 rounds |
| `vc-autoresearch:reason` | adversarial debate with blind judges until convergence | judges converge or iteration cap |
| `vc-autoresearch:evals` | analyze TSV results — trends, plateaus, recommendations | N/A (analysis only) |

Not ported (already covered by existing vc-system skills): debug → `vc-debugger`, security → `vc-security`, scenario → `vc-scenario`, predict → `vc-predict`.

---

## Parameters

| Parameter | Required | Default | Notes |
|---|---|---|---|
| `domain:` | yes | — | `spec` / `tests` / `ux` / `docs` / `plan` / `errors` / `harness` |
| `corpus:` | yes | — | file glob(s) or path list to investigate |
| `verify:` | no | — | shell command that outputs a number; required for "hit the metric goal" mode |
| `target:` | no | `0` | the number `verify:` must reach (lower-is-better assumed; use `target_direction: higher` to flip) |
| `guard:` | no | — | safety shell command that must pass after every fix batch |
| `frozen_files:` | no | — | glob pattern(s); any file matching is excluded from the fix corpus and must never be modified by a fix agent |
| `max_iterations:` | no | per domain | hard cap on loop cycles |
| `severity_escalation_at:` | no | `7` | after this many iterations, stop fixing CONCERN findings (move to backlog) |
| `consecutive_all_clear:` | no | `2` | how many consecutive zero-gap iterations before SUCCESS |
| `research_agents:` | no | per domain | number of parallel research agents |
| `fix_agents:` | no | per domain | number of parallel fix agents |
| `feature:` | no | inferred | feature folder name for report output paths |
| `task_slug:` | no | auto | task folder slug; auto-generated as `autoresearch-{domain}-{YYMMDD}` |
| `auto_run:` | no | prompt | `true` = no pauses; `false` = confirm before each fix batch; under `/goal` always `true` |
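
To make the `target:` / `target_direction:` and `frozen_files:` rows concrete, here is a small sketch of how they might be interpreted. The helper names `goal_met` and `editable_files` are hypothetical. Using `fnmatch` for glob matching and treating "reach" as an inclusive comparison are assumptions; the skill does not state its exact glob dialect or comparison.

```python
from fnmatch import fnmatch


def goal_met(value: float, target: float = 0, target_direction: str = "lower") -> bool:
    """True when the number printed by `verify:` has reached `target:`.

    Lower-is-better is the documented default; "higher" flips the comparison.
    """
    if target_direction == "higher":
        return value >= target
    return value <= target


def editable_files(corpus: list[str], frozen_files: list[str]) -> list[str]:
    """Drop every corpus path that matches a frozen glob pattern."""
    return [
        path
        for path in corpus
        if not any(fnmatch(path, pattern) for pattern in frozen_files)
    ]
```

With the default target of `0`, `goal_met(0)` is true and `goal_met(3)` is false, which fits a "remaining lint errors" style metric.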

---

## Canonical Domain Defaults

Full configs in `process/development-protocols/vc-autoresearch-spec.md` §Canonical domain configs.

| Domain | Research agents | Fix agents | Max iterations | Escalation at | Guard |
|---|---|---|---|---|---|
| `spec` | 2 | 3 | 15 | 7 | none |
| `tests` | 2 | 2 | 20 | — | `pnpm test` |
| `ux` | 2 | 2 | 10 | 5 | `pnpm typecheck` |
| `docs` | 1 | 2 | 8 | — | node validator script |
| `plan` | 1 | 1 | 3 | — | none |
| `errors` | 1 | 2 | 20 | — | none |
| `harness` | 2 | 2 | 10 | — | `pnpm test:runtime-harness:unit` |

_`harness` full config: `.claude/skills/vc-autoresearch/domains/harness.md`_
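
The table above can be read as a lookup that fills in whatever the caller left out. The sketch below transcribes the rows into a dictionary and merges them under explicit parameters. `DOMAIN_DEFAULTS` and `apply_defaults` are illustrative names, and `None` is simply how this sketch represents a dash or `none` cell; the authoritative configs live in the spec file referenced above.

```python
from typing import Any, Optional


def _row(research: int, fix: int, max_iter: int,
         escalation: Optional[int], guard: Optional[str]) -> dict[str, Any]:
    """Build one defaults row using the parameter names from the Parameters table."""
    return {
        "research_agents": research,
        "fix_agents": fix,
        "max_iterations": max_iter,
        "severity_escalation_at": escalation,
        "guard": guard,
    }


# Transcribed from the Canonical Domain Defaults table.
DOMAIN_DEFAULTS: dict[str, dict[str, Any]] = {
    "spec": _row(2, 3, 15, 7, None),
    "tests": _row(2, 2, 20, None, "pnpm test"),
    "ux": _row(2, 2, 10, 5, "pnpm typecheck"),
    # The table only says "node validator script"; the real command is not given.
    "docs": _row(1, 2, 8, None, "node validator script"),
    "plan": _row(1, 1, 3, None, None),
    "errors": _row(1, 2, 20, None, None),
    "harness": _row(2, 2, 10, None, "pnpm test:runtime-harness:unit"),
}


def apply_defaults(params: dict[str, Any]) -> dict[str, Any]:
    """Explicit parameters win; domain defaults fill the gaps."""
    return {**DOMAIN_DEFAULTS[params["domain"]], **params}
```

Calling `apply_defaults({"domain": "tests", "corpus": ["src/**"], "max_iterations": 5})` keeps the caller's cap of 5 while picking up `pnpm test` as the guard.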

---

## Loop Execution

### Step 0 — Setup

1. Parse parameters, apply domain defaults for any missing values
2. If `auto_run:` not set and NOT under `/goal`: prompt once — "Auto-run (no pauses) or confirm before each fix batch?" Choice is sticky for the full loop.
3. Create task folder: `process/features/{feature}/active/{task_slug}_{dd-mm-yy}/`
4. Initialize TSV at `{task_folder}/results.tsv` with header row and baseline row (iteration 0, gaps_found: TBD, loop_status: baseline)
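
As an illustration of items 3 and 4, the sketch below builds the task folder path and writes the initial `results.tsv`. The function name `setup_task`, the `root` argument, and the exact column set are assumptions: the source names `iteration`, `gaps_found`, and `loop_status` for the baseline row, and the remaining columns are borrowed from the counts collected in Step 1.

```python
import csv
from datetime import date
from pathlib import Path

# Assumed column set; the skill does not list the full TSV header here.
TSV_COLUMNS = ["iteration", "gaps_found", "fail_count", "concern_count", "loop_status"]


def setup_task(root: Path, feature: str, domain: str, today: date) -> Path:
    """Create the task folder and a results.tsv with header and baseline rows."""
    # Auto-generated slug: autoresearch-{domain}-{YYMMDD}
    task_slug = f"autoresearch-{domain}-{today:%y%m%d}"
    # Folder: process/features/{feature}/active/{task_slug}_{dd-mm-yy}/
    task_folder = (
        root / "process" / "features" / feature / "active"
        / f"{task_slug}_{today:%d-%m-%y}"
    )
    task_folder.mkdir(parents=True, exist_ok=True)

    tsv_path = task_folder / "results.tsv"
    with tsv_path.open("w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(TSV_COLUMNS)
        # Baseline row: iteration 0, gaps not yet measured.
        writer.writerow([0, "TBD", "", "", "baseline"])
    return tsv_path
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the folder name reproducible in tests.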

### Step 1 — Research

Spawn `research_agents:` parallel agents. Each agent:
- Reads the corpus files assigned to it
- Investigates its thread list (cross-file consistency, missing cases, contradictions, undefined behaviors, etc.)
- Returns a structured gap list: **SEVERITY:** FAIL | CONCERN | OBSERVATION per finding

Collect all findings. Count: `gaps_found`, `fail_count`, `concern_count`.

Apply the severity floor: if `iteration > severity_escalation_at`, do not fix CONCERN findings; move them to the backlog section of the report instead.
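
A minimal sketch of the severity floor, assuming findings are simple records with a `severity` field. The `Finding` type and `apply_severity_floor` are hypothetical names, and leaving OBSERVATION findings in the active list is an assumption, since the text only describes what happens to CONCERN findings.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str  # "FAIL" | "CONCERN" | "OBSERVATION"
    summary: str


def apply_severity_floor(
    findings: list[Finding],
    iteration: int,
    severity_escalation_at: int = 7,
) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (active, backlog) for the current iteration."""
    if iteration <= severity_escalation_at:
        # Floor not yet in effect: everything stays active.
        return list(findings), []
    active = [f for f in findings if f.severity != "CONCERN"]
    backlog = [f for f in findings if f.severity == "CONCERN"]
    return active, backlog
```

At iteration 8 with the default threshold of 7, a CONCERN finding lands in the backlog while a FAIL finding stays active.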

### Step 2 — Convergence check

**"Until agents find no gaps"** (no `verify:` param):
- If all agents returned zero findings above the severity floor: increment `consecutive_all_clear` counter
- If counter >= `consecu
