
test-driven-development

**test-driven-development** enforces test-first development across three entry points: parsing acceptance criteria from `.spec.md` files, reading requirements from `.code-task.md` files, or clarifying ad-hoc descriptions. The skill generates failing test stubs, discovers existing repository patterns through code search, then guides developers through the RED-GREEN-REFACTOR cycle using repository testing conventions, proptest support, and backpressure integration. Use this skill when beginning any new feature or bug fix that requires rigorous test coverage before implementation.

Install in Claude Code
git clone --depth 1 https://github.com/mikeyobrien/ralph-orchestrator /tmp/test-driven-development && cp -r /tmp/test-driven-development/.claude/skills/test-driven-development ~/.claude/skills/test-driven-development
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Test-Driven Development

## Overview

One skill for all TDD workflows. Enforces test-first development using existing repository patterns. Three input modes handle different entry points — specs, task files, or ad-hoc descriptions — but the core cycle is always RED → GREEN → REFACTOR.

## Input Modes

Detect the input type and follow the corresponding mode:

### Mode A: From Spec (`.spec.md`)

Use when the input references a `.spec.md` file with Given/When/Then acceptance criteria.

1. **Locate and parse** the spec file — extract all Given/When/Then triples
2. **Generate one test stub per criterion** with `todo!()` bodies:
   ```rust
   /// Spec: <spec-file> — Criterion #<N>
   /// Given <given text>
   /// When <when text>
   /// Then <then text>
   #[test]
   fn <spec_name>_criterion_<N>_<slug>() {
       todo!("Implement: <then text>");
   }
   ```
3. **Verify stubs compile** with `cargo test --no-run -p <crate>`, then run `cargo test -p <crate>` to confirm they fail (each `todo!()` body panics when the test runs)
4. Proceed to the [TDD Cycle](#tdd-cycle) to make stubs pass

**Programmatic support:** `ralph_core::preflight::{extract_acceptance_criteria, extract_criteria_from_file, extract_all_criteria}` can parse criteria from spec files.
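
The parsing and naming steps above can be sketched with the standard library alone. This is a minimal illustration, not the `ralph_core::preflight` implementation: the `Criterion` struct and the `parse_criteria`, `slug`, and `stub_name` functions are hypothetical names, and the sketch assumes each Given/When/Then clause sits on its own line, optionally behind a list marker or bold markers.

```rust
/// One acceptance criterion extracted from spec text (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Criterion {
    pub given: String,
    pub when: String,
    pub then: String,
}

/// Collects Given/When/Then triples, assuming one clause per line.
/// A `Then` line without a preceding `Given` and `When` is ignored.
pub fn parse_criteria(spec: &str) -> Vec<Criterion> {
    let mut out = Vec::new();
    let mut given: Option<String> = None;
    let mut when: Option<String> = None;
    for line in spec.lines() {
        // Drop bold markers, then any leading list marker or indentation.
        let plain = line.replace("**", "");
        let text = plain
            .trim_start_matches(|c: char| c == '-' || c == '*' || c.is_whitespace())
            .trim_end();
        if let Some(rest) = text.strip_prefix("Given ") {
            given = Some(rest.to_string());
            when = None;
        } else if let Some(rest) = text.strip_prefix("When ") {
            when = Some(rest.to_string());
        } else if let Some(rest) = text.strip_prefix("Then ") {
            if let (Some(g), Some(w)) = (given.take(), when.take()) {
                out.push(Criterion { given: g, when: w, then: rest.to_string() });
            }
        }
    }
    out
}

/// Lowercases ASCII letters and digits; every other run of characters
/// becomes a single underscore, with none at the start or end.
pub fn slug(text: &str) -> String {
    let mut s = String::new();
    for c in text.chars() {
        if c.is_ascii_alphanumeric() {
            s.push(c.to_ascii_lowercase());
        } else if !s.is_empty() && !s.ends_with('_') {
            s.push('_');
        }
    }
    s.trim_end_matches('_').to_string()
}

/// Builds a test function name in the `<spec_name>_criterion_<N>_<slug>` shape.
pub fn stub_name(spec_name: &str, n: usize, criterion: &Criterion) -> String {
    format!("{}_criterion_{}_{}", slug(spec_name), n, slug(&criterion.then))
}
```

For a spec named `login` whose only criterion ends in `Then an error is shown`, `stub_name` returns `login_criterion_1_an_error_is_shown`, which fits the naming template in step 2.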

### Mode B: From Task (`.code-task.md`)

Use when the input references a `.code-task.md` file or a specific implementation task.

1. **Read the task** and identify acceptance criteria or requirements
2. **Discover patterns** (see [Pattern Discovery](#pattern-discovery))
3. **Design test scenarios** covering normal operation, edge cases, and error conditions
4. **Write failing tests** for all requirements before any implementation
5. Proceed to the [TDD Cycle](#tdd-cycle)

### Mode C: From Description

Use for ad-hoc tasks without a spec or task file.

1. **Clarify requirements** from the description
2. **Discover patterns** (see [Pattern Discovery](#pattern-discovery))
3. **Write failing tests** targeting the described behavior
4. Proceed to the [TDD Cycle](#tdd-cycle)

## Pattern Discovery

Before writing tests, discover existing conventions:

```bash
rg --files -g "crates/*/tests/*.rs"
rg -n "#\[cfg\(test\)\]" crates/
```

Read 2-3 relevant test files near the target code. Mirror:
- Test module layout, naming, and assertion style
- Fixture helpers and test utilities
- Use of `tempfile`, scenarios, or harnesses

## TDD Cycle

### 1) RED — Failing Tests

- Write tests for the exact behavior required
- Run tests to confirm failure **for the right reason**
- If a test passes before any implementation exists, that test is wrong

### 2) GREEN — Minimal Implementation

- Write the minimum code to make tests pass
- No extra features or refactoring during this step

### 3) REFACTOR — Clean Up

- Improve implementation and tests while keeping tests green
- Align with surrounding codebase conventions
- Re-run tests after every change
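
As a worked illustration of the cycle, consider a hypothetical `parse_retry_count` function (not part of the Ralph codebase; the name and the limit of 10 are assumptions made for this sketch). In the RED step the tests below are written first and the function body is just `todo!()`, so the tests compile but panic. The GREEN step replaces the body with the minimal implementation shown here.

```rust
/// Hypothetical function under test.
/// RED: begin with `todo!()` as the body so the tests fail.
/// GREEN: the minimal implementation below makes them pass.
pub fn parse_retry_count(input: &str) -> Result<u32, String> {
    let value: u32 = input
        .trim()
        .parse()
        .map_err(|_| format!("not a number: {:?}", input))?;
    if value > 10 {
        return Err(format!("retry count {} exceeds the limit of 10", value));
    }
    Ok(value)
}

#[cfg(test)]
mod tests {
    use super::*;

    // Normal operation.
    #[test]
    fn parses_plain_number() {
        assert_eq!(parse_retry_count("3"), Ok(3));
    }

    // Edge cases: surrounding whitespace and the upper bound itself.
    #[test]
    fn accepts_whitespace_and_upper_bound() {
        assert_eq!(parse_retry_count(" 7 "), Ok(7));
        assert_eq!(parse_retry_count("10"), Ok(10));
    }

    // Error conditions: non-numeric input and values over the limit.
    #[test]
    fn rejects_invalid_input() {
        assert!(parse_retry_count("abc").is_err());
        assert!(parse_retry_count("11").is_err());
    }
}
```

The REFACTOR step would then tidy names or error types while `cargo test` stays green after each change.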

## Proptest Guidance

Use `proptest` only when ALL of the following hold:
- Function is pure (no I/O, no time, no globals)
- Deterministic output for given input
- Non-trivial input space or edge cases

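```rust
use proptest::prelude::*;

// `encode` and `decode` stand in for the pure functions under test.
proptest! {
    #[test]
    fn round_trip(input in "[a-z0-9]{0,32}") {
        // The regex strategy yields a `String` of 0 to 32 characters.
        let encoded = encode(input.as_str());
        let decoded = decode(&encoded).expect("should decode");
        // Decoding must return exactly the original input.
        prop_assert_eq!(decoded, input);
    }
}
```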

Don't introduce proptest as a new dependency without strong justification.
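
The `encode` and `decode` names in the round-trip property above are not defined in this skill. A minimal sketch of a pair that meets the three criteria (pure, deterministic, non-trivial input space), assuming a hypothetical hex encoding, could look like the following. The accompanying example-based test uses only the standard library, so it adds no dependency.

```rust
/// Hypothetical pure encoder: each byte becomes two lowercase hex digits.
pub fn encode(input: &str) -> String {
    input.bytes().map(|b| format!("{:02x}", b)).collect()
}

/// Inverse of `encode`. Returns `None` for non-hex characters,
/// an odd number of digits, or bytes that are not valid UTF-8.
pub fn decode(encoded: &str) -> Option<String> {
    let digits: Option<Vec<u32>> = encoded.chars().map(|c| c.to_digit(16)).collect();
    let digits = digits?;
    if digits.len() % 2 != 0 {
        return None;
    }
    // Each pair of hex digits is at most 15 * 16 + 15 = 255, so it fits in a byte.
    let bytes: Vec<u8> = digits.chunks(2).map(|p| (p[0] * 16 + p[1]) as u8).collect();
    String::from_utf8(bytes).ok()
}

#[cfg(test)]
mod tests {
    use super::*;

    // Plain example test covering the same round-trip idea without proptest.
    #[test]
    fn round_trip_examples() {
        for input in ["", "a", "az09", "0123456789abcdef"] {
            let encoded = encode(input);
            assert_eq!(decoded_or_panic(&encoded), input);
        }
    }

    fn decoded_or_panic(encoded: &str) -> String {
        decode(encoded).expect("should decode")
    }
}
```

If a handful of examples like these already covers the interesting inputs, the plain test is usually enough; the property version earns its place when the input space is too large to enumerate by hand.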

## Backpressure Integration

Include coverage evidence in completion events:

```bash
ralph emit "build.done" "tests: pass, lint: pass, typecheck: pass, audit: pass, coverage: pass (82%)"
```

Run `cargo tarpaulin --out Html --output-dir coverage --skip-clean` when feasible. If coverage cannot be run, state why and include targeted test evidence instead.

## Test Location Rules

- Spec maps to a single module → inline `#[cfg(test)]` tests
- Spec spans multiple modules → integration test in `crates/<crate>/tests/`
- CLI behavior → `crates/ralph-cli/tests/`
- Follow existing patterns in the target crate
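
The first three rules can be read as a small decision function. The sketch below is illustrative only: `TestLocation` and `choose_location` are hypothetical names, and checking CLI behavior before module count is an assumption, since the rules above do not state a precedence. The fourth rule (follow existing patterns) still needs human judgment and is not encoded.

```rust
/// Where a new test should live (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum TestLocation {
    /// Inline `#[cfg(test)]` module next to the code under test.
    Inline,
    /// Integration test directory, relative to the repository root.
    IntegrationDir(String),
}

/// Applies the location rules in an assumed order: CLI behavior first,
/// then multi-module specs, otherwise inline tests.
pub fn choose_location(
    crate_name: &str,
    modules_touched: usize,
    is_cli_behavior: bool,
) -> TestLocation {
    if is_cli_behavior {
        TestLocation::IntegrationDir("crates/ralph-cli/tests/".to_string())
    } else if modules_touched > 1 {
        TestLocation::IntegrationDir(format!("crates/{}/tests/", crate_name))
    } else {
        TestLocation::Inline
    }
}
```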

## Anti-Patterns

- Writing implementation before tests
- Generating tests that pass without implementation
- Copying tests from other crates without adapting to local patterns
- Adding proptest when a simple example test suffices
- Emitting completion events without coverage evidence