Skill125 estrellas del repoactualizado 1mo ago
strict-enforcement
Use when running claudikins-kernel:verify, checking implementation quality, deciding pass/fail verdicts, or enforcing cross-command gates — requires actual evidence of code working, not just passing tests
Instalar en Claude Code
Copiargit clone --depth 1 https://github.com/elb-pr/claudikins-kernel /tmp/strict-enforcement && cp -r /tmp/strict-enforcement/skills/strict-enforcement ~/.claude/skills/strict-enforcementDespués abre una sesión nueva de Claude Code; el skill carga automáticamente.
Definición
SKILL.md
# Strict Enforcement Verification Methodology
## When to use this skill
Use this skill when you need to:
- Run the `claudikins-kernel:verify` command
- Validate implementation before shipping
- Decide pass/fail verdicts
- Check code integrity after changes
- Enforce cross-command gates
## Core Philosophy
> "Evidence before assertions. Always." - Verification philosophy
Never claim code works without seeing it work. Tests passing is not enough. Claude must SEE the output.
### The Three Laws
1. **See it working** - Screenshots, curl responses, CLI output. Actual evidence.
2. **Human checkpoint** - No auto-shipping. Human reviews evidence and decides.
3. **Exit code 2 gates** - Verification failures block claudikins-kernel:ship. No exceptions.
## Verification Phases
### Phase 1: Automated Quality Checks
Run the automated checks first. Fast feedback.
| Check | Command Pattern | What It Catches |
| ----- | ------------------------------------ | ----------------------------------- |
| Tests | `npm test` / `pytest` / `cargo test` | Logic errors, regressions |
| Lint | `npm run lint` / `ruff` / `clippy` | Style issues, common bugs |
| Types | `tsc` / `mypy` / `cargo check` | Type mismatches, interface drift |
| Build | `npm run build` / `cargo build` | Compilation errors, bundling issues |
**Flaky Test Detection (C-12):**
```
Test fails?
├── Re-run failed tests
├── Pass 2nd time?
│ └── Yes → STOP: [Accept flakiness] [Fix tests] [Abort]
└── Fail 2nd time?
├── Run isolated
└── Still fail? → STOP: [Fix] [Skip] [Abort]
```
### Phase 2: Output Verification (catastrophiser)
This is the feedback loop that makes Claude's code actually work.
| Project Type | Verification Method | Evidence |
| ------------ | ------------------------------------ | ------------------------------ |
| Web app | Start server, screenshot, test flows | Screenshots, console logs |
| API | Curl endpoints, check responses | Status codes, response bodies |
| CLI | Run commands, capture output | stdout, stderr, exit codes |
| Library | Run examples, check results | Output values, test coverage |
| Service | Check logs, verify health endpoint | Log patterns, health responses |
**Fallback Hierarchy (A-3):**
If primary method unavailable, fall back:
1. Start server + screenshot (preferred for web)
2. Curl endpoints (preferred for API)
3. Run CLI commands (preferred for CLI)
4. Run tests only (fallback)
5. Code review only (last resort)
**Timeout:** 30 seconds per verification method (CMD-30).
### Phase 3: Code Simplification (Optional)
After verification passes, optionally run cynic for polish.
**Prerequisites:**
- Phase 2 (catastrophiser) must PASS
- Human approves: "Run cynic for polish pass?"
**cynic Rules:**
- Preserve exact behaviour (tests MUST still pass)
- Remove unnecessary abstraction
- Improve naming clarity
- Delete dead code
- Flatten nested conditionals
**If tests fail after simplification:**
- Log failure reasons
- Show human
- Proceed anyway (A-5) with caveat
See [cynic-rollback.md](references/cynic-rollback.md) for recovery patterns.
### Phase 4: Klaus Escalation
If stuck during verification:
```
Is mcp__claudikins-klaus available? (E-16)
├── No →
│ Offer: [Manual review] [Ask Claude differently] (E-17)
│ Fallback: [Accept with uncertainty] [Max retries, abort] (E-18)
└── Yes →
Spawn klaus via SubagentStop hook
```
### Phase 5: Human Checkpoint
The final gate. Present comprehensive evidence.
```
Verification Report
-------------------
Tests: ✓ 47/47 passed
Lint: ✓ 0 issues
Types: ✓ 0 errors
Build: ✓ success
Evidence:
- Screenshot: .claude/evidence/login-flow.png
- API test: POST /api/auth → 200 OK
- CLI test: mycli --help → exit 0
[Ready to Ship] [Needs Work] [Accept with Caveats]
```
Human decides. If approved, set `unlock_ship = true`.
## Rationalizations to Resist
Agents under pressure find excuses. These are all violations:
| Excuse | Reality |
| ------------------------------------------ | --------------------------------------------------------------------- |
| "Tests pass, that's good enough" | Tests aren't enough. SEE it working. Screenshots, curl, output. |
| "I'll verify after shipping" | Verify BEFORE ship. That's the whole point. |
| "The type checker caught everything" | Types don't catch runtime issues. Get evidence. |
| "Screenshot failed but it probably works" | "Probably" isn't evidence. Fix the screenshot or use fallback. |
| "Human checkpoint is just a formality" | Human checkpoint is the gate. No auto-shipping. |
| "Code review is enough for this change" | Code review is last resort fallback. Try harder. |
| "Tests are flaky, I'll ignore the failure" | Flaky tests hide real failures. Fix or explicitly accept with caveat. |
| "Exit code 2 is too strict" | Exit code 2 exists to block bad ships. Pass properly. |
**All of these mean: Get evidence. Human decides. No shortcuts.**
## Red Flags — STOP and Reassess
If you're thinking any of these, you're about to violate the methodology:
- "It should work because..."
- "The tests pass so..."
- "I'm confident that..."
- "It worked before..."
- "The types check so..."
- "I'll just skip verification this once"
- "Human will approve anyway"
- "Evidence isn't necessary for this change"
**All of these mean: STOP. Get evidence. Present to human. Let them decide.**
## Exit Code 2 Pattern (CRITICAL)
The verify-gate.sh hook enforces the gate:
```bash
# Both conditions MUST be true
ALL_PASSED=$(jq -r '.all_checks_passed' "$STATE")
HUMAN_APPROVED=$(j