Skill24.5k repo starsupdated 24d ago

intended-vs-implemented

This skill provides a structured method for auditing the gap between documented system behavior and actual code implementation, focusing on security and correctness bugs that automated linters miss. Use it when reviewing AI-generated code, validating access controls against documented permissions, or verifying that codebases match their own documentation by establishing intent from written specifications, gathering implementation evidence through code citation, comparing claims to enforcement points, and classifying mismatches by real-world impact rather than cosmetic drift.

View source Repository: pm-skills

Install in Claude Code

Copy

git clone --depth 1 https://github.com/phuryn/pm-skills /tmp/intended-vs-implemented && cp -r /tmp/intended-vs-implemented/pm-ai-shipping/skills/intended-vs-implemented ~/.claude/skills/intended-vs-implemented

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Intended vs. Implemented: Auditing the Gap

## Purpose

A linter scans code in a vacuum. It can tell you the code is *internally* consistent; it cannot tell you the code does what you *meant*, because it has no model of your intent. The highest-value security and correctness bugs live in that gap — a permission documented but never enforced, a "cron-only" endpoint anyone can call, a field marked public-only that leaks private data.

This skill is the method for finding that gap. It is the differentiator: it only works when intent has been written down first (see the **shipping-artifacts** skill), and that's exactly why commodity tools can't replicate it.

## Context

Use this when documented intent exists — `permissions.md`, `architecture.md`, `variables.md`, etc. If those docs are absent or stale, that absence is itself the first finding: you cannot audit intent you never recorded. Recommend documenting first, then auditing.

## Method

1. **Establish intent.** Read the `/documentation/*.md` set as the source of truth for what *should* be true: who may access what, which boundaries are trusted, which data is public. Treat the docs as claims to verify, not as proof.

2. **Gather implementation evidence.** Read the code that enforces (or fails to enforce) each claim. Evidence is a cited file and line — the actual authorization check, the actual query filter, the actual sanitizer. "It's probably handled upstream" is not evidence; the code path is.

3. **Compare claim to code, one boundary at a time.** For each documented rule, ask: does an enforcement point actually implement it, on the server, on every path? Distrust comments like "internal only," "admin only," or "validated elsewhere" — verify them in code.

4. **Classify each mismatch by whether it matters.** A mismatch matters when crossing it lets a real actor reach data, money, infrastructure, or another tenant they shouldn't. It does not matter when the only person affected is the actor themselves on their own data. Drop cosmetic drift; keep boundary-crossing drift.

5. **Avoid hand-wavy findings.** Every finding names: the **documented intent** (quote the doc), the **implemented reality** (cite the code), the **attacker and victim**, and the **concrete fix**. If you cannot cite both sides of the gap, it is a question to investigate, not a finding to report.

## What counts

- **Intent:** a documented rule, boundary, scope, or public/private classification.
- **Implementation evidence:** a cited enforcement point (or its provable absence) in the code.
- **A mismatch that matters:** doc says one thing, code does another, and the difference crosses a trust, cost, data, or tenant boundary.

## Notes

- Documented-but-unenforced is a finding on its own — rank it by what crossing the gap exposes.
- Undocumented-but-enforced is usually fine, but flag it: the docs are now stale, which weakens the next audit.
- This method feeds the security and performance audits; it does not replace their sink-level analysis — it adds the intent axis they lack.
- Never fabricate intent to manufacture a gap. If the docs are silent, say the docs are silent.

More from this repository

shipping-artifactsSkill

The durable documentation set that makes an AI-built (vibe-coded) app reviewable before shipping. A small core every app needs — architecture, user/permission flows, permissions, variables/secrets, and a test-coverage map — plus conditional docs added only when they apply: emails, scheduled work, SEO, and embedded agents/automation. Defines what each doc must capture and how a reviewer or auditor uses it. Use when documenting a codebase for handoff, mapping user journeys and trust-boundary crossings, planning test coverage, or preparing for a security or performance audit.

ab-test-analysisSkill

Analyze A/B test results with statistical significance, sample size validation, confidence intervals, and ship/extend/stop recommendations. Use when evaluating experiment results, checking if a test reached significance, interpreting split test data, or deciding whether to ship a variant.

cohort-analysisSkill

Perform cohort analysis on user engagement data — retention curves, feature adoption trends, and segment-level insights. Use when analyzing user retention by cohort, studying feature adoption over time, investigating churn patterns, or identifying engagement trends.

sql-queriesSkill

Generate SQL queries from natural language descriptions. Supports BigQuery, PostgreSQL, MySQL, and other dialects. Reads database schemas from uploaded diagrams or documentation. Use when writing SQL, building data reports, exploring databases, or translating business questions into queries.

brainstorm-okrsSkill

Brainstorm team-level OKRs aligned with company objectives — qualitative objectives with measurable key results. Use when setting quarterly OKRs, aligning team goals with company strategy, drafting objectives, or learning how to write effective OKRs.

create-prdSkill

Create a Product Requirements Document using a comprehensive 8-section template covering problem, objectives, segments, value propositions, solution, and release planning. Use when writing a PRD, documenting product requirements, preparing a feature spec, or reviewing an existing PRD.

dummy-datasetSkill

Generate realistic dummy datasets for testing with customizable columns, constraints, and output formats (CSV, JSON, SQL, Python script). Use when creating test data, building mock datasets, or generating sample data for development and demos.

job-storiesSkill

Create job stories using the 'When [situation], I want to [motivation], so I can [outcome]' format with detailed acceptance criteria. Use when writing job stories, creating JTBD-style backlog items, or expressing user situations and motivations.