counterexample-explainer
Explain why counterexamples violate specifications by analyzing formal specifications (temporal logic, invariants, pre/postconditions, code contracts), informal requirements (user stories, acceptance criteria), test specifications (assertions, property-based tests), and providing step-by-step traces showing state changes, comparing expected vs actual behavior, identifying root causes, and assessing violation impact. Use when debugging test failures, understanding model checker output, explaining runtime assertion violations, analyzing static analysis warnings, or teaching specification concepts. Produces structured markdown explanations with traces, comparisons, state diagrams, and cause chains. Triggers when users ask why something failed, explain a violation, understand a counterexample, debug a specification, or analyze why a test fails.
git clone --depth 1 https://github.com/ArabelaTso/Skills-4-SE /tmp/counterexample-explainer && cp -r /tmp/counterexample-explainer/skills/counterexample-explainer ~/.claude/skills/counterexample-explainerSKILL.md
# Counterexample Explainer
## Overview
Analyze counterexamples that violate specifications and produce clear, structured explanations showing step-by-step how and why the violation occurs, with root cause analysis and impact assessment.
## Workflow
### 1. Understand the Specification
Identify what property is being checked.
**Questions to ask:**
- What is the specification or requirement?
- Is it formal (invariant, temporal logic) or informal (requirement doc)?
- What should happen vs what actually happened?
- Is this from a test failure, model checker, or runtime error?
See [specification-types.md](references/specification-types.md) for comprehensive specification catalog.
**Common specification types:**
**Formal specifications:**
- Invariants: `balance >= 0`
- Temporal logic: `G(request → F grant)`
- Pre/postconditions: `@requires(x > 0)`, `@ensures(result >= 0)`
- State machines: Valid state transitions
- Concurrency: Atomicity, deadlock freedom
**Informal requirements:**
- User stories with acceptance criteria
- Functional requirements
- API contracts
- Expected behavior descriptions
**Test specifications:**
- Assertions: `assert result == expected`
- Property-based tests
- Integration test expectations
### 2. Collect Counterexample Information
Gather all relevant data about the violation.
**From test failures:**
```bash
# Run test to get failure details
pytest test_file.py::test_name -v
# Get stack trace
pytest test_file.py::test_name -v --tb=long
# Get variable values at failure
pytest test_file.py::test_name -v -l
```
**Information to extract:**
- Input values that triggered failure
- Expected output/behavior
- Actual output/behavior
- Intermediate state changes
- Stack trace or execution path
- Error messages
**From runtime violations:**
```python
# Assertion failure
AssertionError: balance must be non-negative
File "account.py", line 45
assert self.balance >= 0
# Extract:
# - Variable: self.balance
# - Expected: >= 0
# - Actual: (check value)
```
**From model checkers:**
```
Counterexample trace:
State 0: request=false, grant=false
State 1: request=true, grant=false
State 2: request=true, grant=false
...
State 50: request=true, grant=false (VIOLATION)
Property violated: G(request → F grant)
```
### 3. Identify Violation Point
Pinpoint exactly where and when specification is broken.
**For invariants:**
- Find the operation that breaks the invariant
- Identify the state transition that causes violation
- Note variable values before and after
**For temporal properties:**
- Identify the state where property fails
- Trace back to find root cause
- Show path through states
**For assertions:**
- Locate the assertion that fails
- Check values of variables in assertion
- Find operation that produced wrong value
**Example analysis:**
```python
# Specification
assert balance >= 0 # Invariant
# Counterexample
initial_balance = 100
withdraw(150)
# balance is now -50
# Violation point: withdraw operation
# Before: balance = 100 (satisfies invariant)
# After: balance = -50 (violates invariant)
```
### 4. Generate Step-by-Step Trace
Show execution path leading to violation.
See [explanation-patterns.md](references/explanation-patterns.md) for detailed patterns.
**Trace structure:**
```markdown
## Counterexample Trace
**Specification:** [What property should hold]
### Initial State
- [variable1]: [value]
- [variable2]: [value]
- Status: ✅ Satisfies specification
### Step 1: [Operation/Event]
**Action:** [What happened]
**State Changes:**
- [variable]: [old_value] → [new_value]
**Status:** ✅ Still satisfies specification
**Why this matters:** [Explanation of significance]
### Step 2: [Operation/Event]
**Action:** [What happened]
**State Changes:**
- [variable]: [old_value] → [new_value]
**Status:** ❌ VIOLATES specification
**Violation Details:**
- Expected: [what should be true]
- Actual: [what is actually true]
- Violated property: [specific clause/condition]
**Why it violates:** [Clear explanation]
### Final State
- [variable1]: [value]
- [variable2]: [value]
- Status: ❌ Specification violated
```
**Example:**
```markdown
## Balance Invariant Violation
**Specification:** Account balance must remain non-negative (`balance >= 0`)
### Initial State
- account.balance: 100
- Status: ✅ Satisfies invariant (100 >= 0)
### Step 1: User initiates withdrawal
**Action:** `withdraw(150)` called
**Validation Check:**
- Amount to withdraw: 150
- Current balance: 100
- Sufficient funds? NO (150 > 100)
**Expected behavior:** Reject withdrawal, raise InsufficientFundsError
**Actual behavior:** Withdrawal proceeds (bug: no validation)
### Step 2: System processes withdrawal
**Action:** Balance updated
**State Changes:**
- account.balance: 100 → -50
**Status:** ❌ VIOLATES INVARIANT
**Violation Details:**
- Expected: balance >= 0
- Actual: balance = -50
- Violated property: balance >= 0 (non-negativity)
**Why it violates:**
-50 is NOT >= 0. The balance has gone negative, which violates the core
invariant that account balances must never be negative.
### Final State
- account.balance: -50
- Status: ❌ Overdraft occurred
### Root Cause
Missing validation in `withdraw` method:
```python
def withdraw(self, amount):
# BUG: No check if amount > balance
self.balance -= amount # This can go negative!
# Should be:
def withdraw(self, amount):
if amount > self.balance:
raise InsufficientFundsError(f"Cannot withdraw {amount}, balance is {self.balance}")
self.balance -= amount
```
```
### 5. Compare Expected vs Actual
Show side-by-side what should happen vs what happened.
```markdown
## Expected vs Actual Behavior
| Aspect | Expected (Specification) | Actual (Counterexample) | Match? |
|--------|-------------------------|-------------------------|--------|
| [Property 1] | [Expected value] | [Actual value] | ✅/❌ |
| [Property 2] | [Expected value] | [Actual value] | ✅/❌ |
| [Property 3] | [Expected value] | [Actual value] | ✅/❌ |Applies abstract interpretation using different abstract domains (intervals, octagons, polyhedra, sign, congruence) to statically analyze program variables and infer invariants, value ranges, and relationships. Use when analyzing program properties, inferring loop invariants, detecting potential errors, or understanding variable relationships through static analysis.
Uses abstract interpretation to automatically infer loop invariants, function preconditions, and postconditions for formal verification. Generates invariants that capture program behavior and support correctness proofs in Dafny, Isabelle, Coq, and other verification systems. Use when adding formal specifications to code, generating verification conditions, inferring contracts for functions, or discovering loop invariants for proofs.
Performs abstract interpretation over source code to infer possible program states, variable ranges, and data properties without executing the program. Reports potential runtime errors including out-of-bounds accesses, null dereferences, type inconsistencies, division by zero, and integer overflows. Use when analyzing code for potential runtime errors, performing static analysis, checking safety properties, or verifying program behavior without execution.
Performs abstract interpretation to produce summarized execution traces and high-level program behavior representations. Highlights key control flow paths, variable relationships, loop invariants, function summaries, and potential runtime states using abstract domains (intervals, signs, nullness, etc.). Use when analyzing program behavior, understanding execution paths, computing loop invariants, tracking variable ranges, detecting potential runtime errors, or generating program summaries without concrete execution.
Create ACSL (ANSI/ISO C Specification Language) formal annotations for C/C++ programs. Use this skill when working with formal verification, adding function contracts (requires/ensures), loop invariants, assertions, memory safety annotations, or any ACSL specifications. Supports Frama-C verification and generates comprehensive formal specifications for C/C++ code.
CLI-based browser automation with persistent page state using ref-based element interaction. Use when users ask to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
Detects and analyzes ambiguous language in software requirements and user stories. Use when reviewing requirements documents, user stories, specifications, or any software requirement text to identify vague quantifiers, unclear scope, undefined terms, missing edge cases, subjective language, and incomplete specifications. Provides detailed analysis with clarifying questions and suggested improvements.
Design and review APIs with suggestions for endpoints, parameters, return types, and best practices. Use when designing new APIs from requirements, reviewing existing API designs, generating API documentation, or getting implementation guidance. Supports REST APIs with focus on endpoint structure, request/response schemas, authentication, pagination, filtering, versioning, and OpenAPI specifications. Triggers when users ask to design, review, document, or improve APIs.