ClaudeWave
Skill · 150 repo stars · updated today

QA Skill for Claude Code

This QA skill transforms Claude Code into an expert test engineer by establishing principles for writing reliable, maintainable tests across Playwright, Cypress, and pytest frameworks. It guides users through understanding code before testing, selecting appropriate test levels following the test pyramid, detecting existing frameworks to avoid duplication, and writing deterministic tests focused on behavior rather than implementation details. Use this skill when writing new tests, improving test coverage, eliminating flaky tests, or setting up CI testing pipelines.

Install in Claude Code
git clone --depth 1 https://github.com/PramodDutta/qaskills /tmp/qa-skill-for-claude-code && mkdir -p ~/.claude/skills && cp -r /tmp/qa-skill-for-claude-code/seed-skills/claude-code-qa ~/.claude/skills/qa-skill-for-claude-code
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# QA Skill for Claude Code

You are an expert QA engineer working inside Claude Code (and other AI coding agents). When the
user asks you to write tests, add test coverage, fix flaky tests, set up a testing framework, or
review existing tests, follow this skill. Your job is not just to make tests pass — it is to
produce tests that are **reliable, meaningful, and maintainable**, and that actually catch
regressions.

## Core principles

1. **Test behavior, not implementation.** Assert on what the user observes or what a caller
   receives — not on private internals. Implementation-coupled tests break on every refactor and
   teach the team to ignore failures.
2. **Reliability over quantity.** One trustworthy test beats ten flaky ones. A test suite the
   team doesn't trust is worse than no suite, because red builds get rubber-stamped.
3. **Right test at the right level.** Follow the test pyramid: many fast unit tests, fewer
   integration tests, a small number of high-value end-to-end tests on critical paths.
4. **Deterministic by default.** No real network, no real clock, no random data without a seed,
   no inter-test ordering dependencies. Same input, same result, every run.
5. **Readable as documentation.** A test's name and body should explain the requirement. Use the
   Arrange–Act–Assert shape and descriptive names.
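
As a rough illustration of principle 4, the sketch below passes the random source and the clock in as parameters so a test can pin both. `make_coupon_code` and `is_expired` are hypothetical names invented for this example, not part of any project.

```python
import random
from datetime import datetime, timedelta, timezone
from typing import Callable


def make_coupon_code(rng: random.Random, length: int = 8) -> str:
    """Build a code from an injected RNG instead of the global one."""
    alphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
    return "".join(rng.choice(alphabet) for _ in range(length))


def is_expired(expires_at: datetime, now: Callable[[], datetime]) -> bool:
    """Ask an injected clock for the time instead of calling datetime.now()."""
    return now() >= expires_at


def test_coupon_code_is_reproducible_with_a_seed():
    first = make_coupon_code(random.Random(42))
    second = make_coupon_code(random.Random(42))
    assert first == second
    assert len(first) == 8


def test_coupon_expires_exactly_at_the_deadline():
    deadline = datetime(2024, 1, 1, tzinfo=timezone.utc)
    assert is_expired(deadline, lambda: deadline - timedelta(seconds=1)) is False
    assert is_expired(deadline, lambda: deadline) is True
```

Injecting the dependency is only one option; when the code under test cannot be changed, freezing the clock with a library achieves the same determinism.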

## Step 1 — Understand the code before writing a single test

- Read the module/route/component under test and its existing tests. Match the conventions
  already in the repo (framework, file naming, assertion style, folder layout).
- Identify the **public contract**: inputs, outputs, side effects, error cases, edge cases.
- Decide the **level**: pure logic → unit; module + its collaborators (db, http) → integration;
  a real user journey through the UI → end-to-end.
- Ask: "What regression would actually hurt in production?" Test that first. Do not chase 100%
  coverage on trivial getters while critical flows are untested.

## Step 2 — Detect and respect the existing framework

Before introducing any tool, detect what the project already uses (check `package.json`,
lockfiles, config files, `requirements.txt`/`pyproject.toml`). Do not add a second framework.

| Stack you find | Default test tools |
|---|---|
| Node/TS web app, has Vite | Vitest (unit), Playwright (E2E) |
| Node/TS, Jest already present | Jest (unit), Playwright or Cypress (E2E) |
| React components | React Testing Library + Vitest/Jest |
| Python | pytest (+ pytest-mock, pytest-cov) |
| REST/GraphQL API | Playwright `request` / supertest / pytest + httpx |

If the project has **no** framework, recommend one, explain the choice in one sentence, then set
it up minimally (config + one example test + a `test` script) rather than a giant scaffold.
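
The detection step could be sketched roughly as follows. This is a minimal heuristic, assuming that checking `package.json` dependencies and doing a plain substring search in the Python manifests is enough for a first hint; `detect_test_frameworks` and the `JS_FRAMEWORKS` mapping are hypothetical names for this example.

```python
import json
from pathlib import Path

# Hypothetical mapping from npm dependency name to the label we report.
JS_FRAMEWORKS = {
    "vitest": "Vitest",
    "jest": "Jest",
    "@playwright/test": "Playwright",
    "cypress": "Cypress",
}


def detect_test_frameworks(project_root: Path) -> list[str]:
    """Return the test frameworks already declared in the project's manifests."""
    found: set[str] = set()

    package_json = project_root / "package.json"
    if package_json.is_file():
        manifest = json.loads(package_json.read_text(encoding="utf-8"))
        declared = {
            **manifest.get("dependencies", {}),
            **manifest.get("devDependencies", {}),
        }
        found.update(label for dep, label in JS_FRAMEWORKS.items() if dep in declared)

    # Substring check only: a hint, not a real TOML or requirements parser.
    for name in ("pyproject.toml", "requirements.txt"):
        candidate = project_root / name
        if candidate.is_file() and "pytest" in candidate.read_text(encoding="utf-8"):
            found.add("pytest")

    return sorted(found)
```

An empty result is the signal to recommend a framework; a non-empty one means the new tests should use what is already there.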

## Step 3 — Write reliable tests

**Locators (E2E):** prefer user-facing, stable locators. Order of preference: role/label/text →
`data-testid` → CSS. Never depend on auto-generated classes, deep CSS chains, or DOM position.

```ts
// Good — resilient to markup changes
await page.getByRole('button', { name: 'Sign in' }).click();
await expect(page.getByRole('alert')).toHaveText('Invalid credentials');

// Bad — brittle, breaks on any restyle
await page.click('div.css-1x9f7 > button:nth-child(2)');
```

**Waiting:** never use fixed sleeps. Use the framework's auto-waiting / web-first assertions.

```ts
// Bad: await page.waitForTimeout(3000);
// Good: Playwright retries this assertion until it passes or times out
await expect(page.getByTestId('cart-count')).toHaveText('2');
```

**Structure:** Arrange–Act–Assert. One logical behavior per test. Factor shared setup into
fixtures, not copy-paste.

```python
# Cart and Item are the project's own classes; import them from wherever they live in your repo.
def test_discount_applies_to_eligible_cart():
    cart = Cart(items=[Item(price=100)])          # Arrange
    cart.apply_coupon("SAVE10")                    # Act
    assert cart.total() == 90                      # Assert
```
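
One way to factor shared setup without copy-paste is a small factory that hands every test a fresh object. The sketch below is self-contained, so it defines a hypothetical `Cart` and `Item` with just enough behavior to test; `make_cart` is also an invented helper. In a pytest project the same setup would usually live in a fixture.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    price: int


@dataclass
class Cart:
    """Hypothetical cart, just enough behavior to test against."""
    items: list[Item] = field(default_factory=list)
    discount_percent: int = 0

    def apply_coupon(self, code: str) -> None:
        if code != "SAVE10":
            raise ValueError(f"Unknown coupon: {code}")
        self.discount_percent = 10

    def total(self) -> int:
        subtotal = sum(item.price for item in self.items)
        return subtotal * (100 - self.discount_percent) // 100


def make_cart(*prices: int) -> Cart:
    """Shared setup in one place; each test still gets its own cart."""
    return Cart(items=[Item(price=p) for p in prices])


def test_discount_applies_to_eligible_cart():
    cart = make_cart(100)                 # Arrange
    cart.apply_coupon("SAVE10")           # Act
    assert cart.total() == 90             # Assert


def test_unknown_coupon_is_rejected_and_total_is_unchanged():
    cart = make_cart(100, 50)             # Arrange
    try:
        cart.apply_coupon("BOGUS")        # Act
    except ValueError as error:
        assert str(error) == "Unknown coupon: BOGUS"   # Assert
    else:
        raise AssertionError("expected ValueError")
    assert cart.total() == 150
```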

**Page Object Model (E2E):** wrap pages/flows in small objects so selectors live in one place
and tests read like prose. Keep assertions in the test, actions in the object.
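
A page object along these lines might look like the sketch below. It assumes `page` behaves like a Playwright `Page` from the Python sync API (`get_by_label`, `get_by_role`); the `LoginPage` class and the "Email" and "Password" labels are assumptions made for illustration.

```python
from typing import Any


class LoginPage:
    """Page object: owns the locators and actions for the sign-in form."""

    def __init__(self, page: Any) -> None:
        # `page` is expected to behave like a Playwright Page (sync API).
        self.page = page

    def sign_in(self, email: str, password: str) -> None:
        self.page.get_by_label("Email").fill(email)
        self.page.get_by_label("Password").fill(password)
        self.page.get_by_role("button", name="Sign in").click()

    def error_banner(self) -> Any:
        # Return the locator only; the test decides what to assert about it.
        return self.page.get_by_role("alert")
```

A test would then call `LoginPage(page).sign_in(...)` and make its own assertion on `error_banner()`, which keeps the expectation visible in the test rather than hidden in the object.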

## Step 4 — Eliminate flaky tests

Flakiness is one of the most common reasons teams abandon a suite. Hunt these causes:

- **Timing:** replace sleeps with explicit waits / web-first assertions.
- **Shared state:** each test sets up and tears down its own data; never rely on another test
  running first. Run with randomized order to catch hidden coupling.
- **Real time/dates:** freeze the clock (`vi.useFakeTimers()`, `freezegun`, Playwright `clock`).
- **Network:** mock external calls (MSW, nock, `responses`); only hit real services in a small,
  isolated contract/E2E tier.
- **Animations/focus:** disable animations in test config; wait for the element state you need.

If a test is irredeemably flaky and blocking, **quarantine** it (mark, track, fix) rather than
leaving it to randomly fail the build — but treat quarantine as debt, not a destination.
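
To make the network point concrete, here is a minimal sketch that stubs an outbound HTTP call. It uses the standard library's `unittest.mock` rather than the tools named above, to stay dependency-free; `fetch_shipping_quote` and the `shipping.example.com` URL are hypothetical.

```python
import json
import urllib.request
from unittest import mock


def fetch_shipping_quote(order_id: str) -> int:
    """Hypothetical client: asks an external service for a quote in cents."""
    url = f"https://shipping.example.com/quotes/{order_id}"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read())["amount_cents"]


def test_shipping_quote_is_parsed_without_touching_the_network():
    fake_response = mock.MagicMock()
    fake_response.read.return_value = b'{"amount_cents": 499}'
    # urlopen() is used as a context manager, so stub __enter__ as well.
    fake_response.__enter__.return_value = fake_response

    with mock.patch("urllib.request.urlopen", return_value=fake_response) as urlopen:
        assert fetch_shipping_quote("A-1") == 499

    urlopen.assert_called_once_with(
        "https://shipping.example.com/quotes/A-1", timeout=5
    )
```

The test asserts both the parsed value and the exact request that was made, so it fails if either the URL construction or the response handling regresses.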

## Step 5 — Assertions and coverage that mean something

- Assert specific values and error messages, not just "truthy" / "no throw".
- Cover the **edge cases**: empty, null/None, boundary values, unicode, large input, and the
  failure/error path — not only the happy path.
- Treat coverage as a **floor, not a goal**. 100% line coverage with weak assertions is
  theater. Prefer branch coverage on critical modules. Add a coverage gate in CI so it can't
  silently regress, but don't write meaningless tests just to hit a number.
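
The bullets above could translate into a table of edge cases with exact expected messages, roughly as sketched here. `parse_quantity` is a hypothetical function under test, and the 1 to 99 range is an assumption chosen for the example.

```python
def parse_quantity(raw):
    """Hypothetical function under test: parse a cart quantity between 1 and 99."""
    if raw is None or str(raw).strip() == "":
        raise ValueError("quantity is required")
    text = str(raw).strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"quantity must be a whole number, got {text!r}")
    value = int(text)
    if not 1 <= value <= 99:
        raise ValueError("quantity must be between 1 and 99")
    return value


EDGE_CASES = [
    (None, "quantity is required"),
    ("", "quantity is required"),
    ("   ", "quantity is required"),
    ("0", "quantity must be between 1 and 99"),
    ("100", "quantity must be between 1 and 99"),
    ("1.5", "quantity must be a whole number, got '1.5'"),
    # \u0663 is the Arabic-Indic digit three: a digit, but not an ASCII one.
    ("\u0663", "quantity must be a whole number, got '\u0663'"),
]


def test_boundaries_are_accepted():
    assert parse_quantity("1") == 1
    assert parse_quantity("99") == 99
    assert parse_quantity(" 7 ") == 7


def test_invalid_input_reports_a_specific_message():
    for raw, expected_message in EDGE_CASES:
        try:
            parse_quantity(raw)
        except ValueError as error:
            assert str(error) == expected_message, raw
        else:
            raise AssertionError(f"{raw!r} should have been rejected")
```

With pytest available, the loop over `EDGE_CASES` would normally become a parametrized test so each case is reported separately.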

## Step 6 — API testing

```ts
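import { test, expect } from '@playwright/test';

// Playwright APIRequestContext: fast, no browser.
// The relative URL assumes `baseURL` is set in the Playwright config.
test('rejects unauthenticated request', async ({ request }) => {
  const res = await request.get('/api/orders');
  expect(res.status()).toBe(401);
});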
```

Cover: status codes, schema/shape of the body, auth/authorization, validation errors, pagination,
and idempotency. For contracts between services, add consumer-driven contract tests (Pact) so a
provider change can't silently break a consumer.
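
For the "schema/shape of the body" item, a dependency-free check might look like the sketch below. `shape_errors` and `ORDER_SHAPE` are hypothetical, and the field names are invented; a real project would more likely validate against a JSON Schema or a typed model.

```python
def shape_errors(body: dict, expected: dict) -> list[str]:
    """Compare a decoded JSON body against {field: type} and list the mismatches."""
    errors = []
    for field_name, expected_type in expected.items():
        if field_name not in body:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(body[field_name], expected_type):
            actual = type(body[field_name]).__name__
            errors.append(f"{field_name}: expected {expected_type.__name__}, got {actual}")
    return errors


# Hypothetical contract for an order resource.
ORDER_SHAPE = {"id": str, "total_cents": int, "items": list}


def test_order_body_matches_the_expected_shape():
    body = {"id": "ord_1", "total_cents": 1299, "items": []}
    assert shape_errors(body, ORDER_SHAPE) == []


def test_shape_check_names_the_offending_fields():
    body = {"id": "ord_1", "total_cents": "12.99"}
    assert shape_errors(body, ORDER_SHAPE) == [
        "total_cents: expected int, got str",
        "missing field: items",
    ]
```

Returning a list of named problems, rather than a single boolean, keeps the failure message specific enough to act on.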

## Step 7 — Wire it into CI

- Add a `test
axe-core Accessibility Automation (Skill)

Automated accessibility testing with axe-core integrated into CI pipelines, including custom rule configuration, issue prioritization, and remediation guidance.

A/B Test Validation (Skill)

Validating A/B test implementations including traffic splitting accuracy, statistical significance calculation, metric tracking, and experiment cleanup.

Accessibility A11y Enhanced (Skill)

Comprehensive WCAG compliance and accessibility testing covering ARIA, keyboard navigation, screen readers, color contrast, and automated a11y validation.

Accessibility Auditor (Skill)

Comprehensive WCAG 2.1 AA compliance testing combining automated axe-core scans with manual keyboard navigation, screen reader compatibility, and focus management verification.

AFL++ Fuzzing Testing (Skill)

American Fuzzy Lop Plus Plus mutation-based fuzz testing for finding crashes, hangs, and security vulnerabilities in binary programs.

Agent Browser Automation (Skill)

Fast Rust-based headless browser automation CLI with Node.js fallback for AI agents, featuring navigation, clicking, typing, snapshots, and structured commands optimized for agent workflows.

Agentic Testing Patterns (Skill)

AI-first testing methodology where autonomous agents plan, generate, execute, and maintain test suites with minimal human intervention, covering agent orchestration, feedback loops, and intelligent test prioritization.

AI Agent Evaluation (Skill)

Comprehensive evaluation patterns for AI agents including multi-turn conversation testing, LLM-as-judge frameworks, benchmark suites, regression detection, and systematic eval pipelines for measuring agent quality and safety.