Skill139 estrellas del repoactualizado 5mo ago

counterexample-explainer

This Claude Code skill analyzes counterexamples that violate formal or informal specifications by examining temporal logic, invariants, preconditions, postconditions, and test assertions. Use it to debug test failures, understand model checker output, explain runtime assertion violations, or teach specification concepts by producing step-by-step execution traces showing state changes, root causes, and impact assessment.

Ver fuente Repositorio: Skills-4-SE

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/ArabelaTso/Skills-4-SE /tmp/counterexample-explainer && cp -r /tmp/counterexample-explainer/skills/counterexample-explainer ~/.claude/skills/counterexample-explainer

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# Counterexample Explainer

## Overview

Analyze counterexamples that violate specifications and produce clear, structured explanations showing step-by-step how and why the violation occurs, with root cause analysis and impact assessment.

## Workflow

### 1. Understand the Specification

Identify what property is being checked.

**Questions to ask:**
- What is the specification or requirement?
- Is it formal (invariant, temporal logic) or informal (requirement doc)?
- What should happen vs what actually happened?
- Is this from a test failure, model checker, or runtime error?

See [specification-types.md](references/specification-types.md) for comprehensive specification catalog.

**Common specification types:**

**Formal specifications:**
- Invariants: `balance >= 0`
- Temporal logic: `G(request → F grant)`
- Pre/postconditions: `@requires(x > 0)`, `@ensures(result >= 0)`
- State machines: Valid state transitions
- Concurrency: Atomicity, deadlock freedom

**Informal requirements:**
- User stories with acceptance criteria
- Functional requirements
- API contracts
- Expected behavior descriptions

**Test specifications:**
- Assertions: `assert result == expected`
- Property-based tests
- Integration test expectations

### 2. Collect Counterexample Information

Gather all relevant data about the violation.

**From test failures:**
```bash
# Run test to get failure details
pytest test_file.py::test_name -v

# Get stack trace
pytest test_file.py::test_name -v --tb=long

# Get variable values at failure
pytest test_file.py::test_name -v -l
```

**Information to extract:**
- Input values that triggered failure
- Expected output/behavior
- Actual output/behavior
- Intermediate state changes
- Stack trace or execution path
- Error messages

**From runtime violations:**
```python
# Assertion failure
AssertionError: balance must be non-negative
  File "account.py", line 45
  assert self.balance >= 0

# Extract:
# - Variable: self.balance
# - Expected: >= 0
# - Actual: (check value)
```

**From model checkers:**
```
Counterexample trace:
State 0: request=false, grant=false
State 1: request=true, grant=false
State 2: request=true, grant=false
...
State 50: request=true, grant=false (VIOLATION)

Property violated: G(request → F grant)
```

### 3. Identify Violation Point

Pinpoint exactly where and when specification is broken.

**For invariants:**
- Find the operation that breaks the invariant
- Identify the state transition that causes violation
- Note variable values before and after

**For temporal properties:**
- Identify the state where property fails
- Trace back to find root cause
- Show path through states

**For assertions:**
- Locate the assertion that fails
- Check values of variables in assertion
- Find operation that produced wrong value

**Example analysis:**

```python
# Specification
assert balance >= 0  # Invariant

# Counterexample
initial_balance = 100
withdraw(150)
# balance is now -50

# Violation point: withdraw operation
# Before: balance = 100 (satisfies invariant)
# After: balance = -50 (violates invariant)
```

### 4. Generate Step-by-Step Trace

Show execution path leading to violation.

See [explanation-patterns.md](references/explanation-patterns.md) for detailed patterns.

**Trace structure:**

```markdown
## Counterexample Trace

**Specification:** [What property should hold]

### Initial State
- [variable1]: [value]
- [variable2]: [value]
- Status: ✅ Satisfies specification

### Step 1: [Operation/Event]
**Action:** [What happened]

**State Changes:**
- [variable]: [old_value] → [new_value]

**Status:** ✅ Still satisfies specification

**Why this matters:** [Explanation of significance]

### Step 2: [Operation/Event]
**Action:** [What happened]

**State Changes:**
- [variable]: [old_value] → [new_value]

**Status:** ❌ VIOLATES specification

**Violation Details:**
- Expected: [what should be true]
- Actual: [what is actually true]
- Violated property: [specific clause/condition]

**Why it violates:** [Clear explanation]

### Final State
- [variable1]: [value]
- [variable2]: [value]
- Status: ❌ Specification violated
```

**Example:**

```markdown
## Balance Invariant Violation

**Specification:** Account balance must remain non-negative (`balance >= 0`)

### Initial State
- account.balance: 100
- Status: ✅ Satisfies invariant (100 >= 0)

### Step 1: User initiates withdrawal
**Action:** `withdraw(150)` called

**Validation Check:**
- Amount to withdraw: 150
- Current balance: 100
- Sufficient funds? NO (150 > 100)

**Expected behavior:** Reject withdrawal, raise InsufficientFundsError

**Actual behavior:** Withdrawal proceeds (bug: no validation)

### Step 2: System processes withdrawal
**Action:** Balance updated

**State Changes:**
- account.balance: 100 → -50

**Status:** ❌ VIOLATES INVARIANT

**Violation Details:**
- Expected: balance >= 0
- Actual: balance = -50
- Violated property: balance >= 0 (non-negativity)

**Why it violates:**
-50 is NOT >= 0. The balance has gone negative, which violates the core
invariant that account balances must never be negative.

### Final State
- account.balance: -50
- Status: ❌ Overdraft occurred

### Root Cause
Missing validation in `withdraw` method:

```python
def withdraw(self, amount):
    # BUG: No check if amount > balance
    self.balance -= amount  # This can go negative!

# Should be:
def withdraw(self, amount):
    if amount > self.balance:
        raise InsufficientFundsError(f"Cannot withdraw {amount}, balance is {self.balance}")
    self.balance -= amount
```
```

### 5. Compare Expected vs Actual

Show side-by-side what should happen vs what happened.

```markdown
## Expected vs Actual Behavior

| Aspect | Expected (Specification) | Actual (Counterexample) | Match? |
|--------|-------------------------|-------------------------|--------|
| [Property 1] | [Expected value] | [Actual value] | ✅/❌ |
| [Property 2] | [Expected value] | [Actual value] | ✅/❌ |
| [Property 3] | [Expected value] | [Actual value] | ✅/❌ |

Del mismo repositorio

abstract-domain-explorerSkill

Applies abstract interpretation using different abstract domains (intervals, octagons, polyhedra, sign, congruence) to statically analyze program variables and infer invariants, value ranges, and relationships. Use when analyzing program properties, inferring loop invariants, detecting potential errors, or understanding variable relationships through static analysis.

abstract-invariant-generatorSkill

Uses abstract interpretation to automatically infer loop invariants, function preconditions, and postconditions for formal verification. Generates invariants that capture program behavior and support correctness proofs in Dafny, Isabelle, Coq, and other verification systems. Use when adding formal specifications to code, generating verification conditions, inferring contracts for functions, or discovering loop invariants for proofs.

abstract-state-analyzerSkill

Performs abstract interpretation over source code to infer possible program states, variable ranges, and data properties without executing the program. Reports potential runtime errors including out-of-bounds accesses, null dereferences, type inconsistencies, division by zero, and integer overflows. Use when analyzing code for potential runtime errors, performing static analysis, checking safety properties, or verifying program behavior without execution.

abstract-trace-summarizerSkill

Performs abstract interpretation to produce summarized execution traces and high-level program behavior representations. Highlights key control flow paths, variable relationships, loop invariants, function summaries, and potential runtime states using abstract domains (intervals, signs, nullness, etc.). Use when analyzing program behavior, understanding execution paths, computing loop invariants, tracking variable ranges, detecting potential runtime errors, or generating program summaries without concrete execution.

acsl-annotation-assistantSkill

Create ACSL (ANSI/ISO C Specification Language) formal annotations for C/C++ programs. Use this skill when working with formal verification, adding function contracts (requires/ensures), loop invariants, assertions, memory safety annotations, or any ACSL specifications. Supports Frama-C verification and generates comprehensive formal specifications for C/C++ code.

agent-browserSkill

CLI-based browser automation with persistent page state using ref-based element interaction. Use when users ask to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.

ambiguity-detectorSkill

Detects and analyzes ambiguous language in software requirements and user stories. Use when reviewing requirements documents, user stories, specifications, or any software requirement text to identify vague quantifiers, unclear scope, undefined terms, missing edge cases, subjective language, and incomplete specifications. Provides detailed analysis with clarifying questions and suggested improvements.

api-design-assistantSkill

Design and review APIs with suggestions for endpoints, parameters, return types, and best practices. Use when designing new APIs from requirements, reviewing existing API designs, generating API documentation, or getting implementation guidance. Supports REST APIs with focus on endpoint structure, request/response schemas, authentication, pagination, filtering, versioning, and OpenAPI specifications. Triggers when users ask to design, review, document, or improve APIs.