ClaudeWave
Skill · 282 repo stars · updated yesterday

pentest-whitebox-code-review

Pentest Whitebox Code Review performs systematic white-box source code security audits using backward taint analysis to trace dangerous code paths from sinks to user-controlled sources. It classifies injection contexts by slot type, verifies XSS render contexts, and produces a prioritized exploitation queue. Use this skill when conducting authorized security assessments of source code to identify injection vulnerabilities, authentication weaknesses, and other exploitable security flaws.

Install in Claude Code
git clone --depth 1 https://github.com/jd-opensource/JoySafeter /tmp/pentest-whitebox-code-review && cp -r /tmp/pentest-whitebox-code-review/skills/pentest-whitebox-code-review ~/.claude/skills/pentest-whitebox-code-review
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Pentest Whitebox Code Review

## Purpose
Perform a systematic white-box source code security audit using Shannon's backward taint analysis methodology. The audit traces from dangerous sinks back to user-controlled sources, classifies injection contexts by slot type, verifies XSS render contexts, and produces a prioritized exploitation queue for downstream proof-driven exploitation.

## Prerequisites

### Authorization Requirements
- **Written authorization** with explicit scope for source code review
- **Source code access** — full repository with version control history
- **Architecture documentation** if available (data flow diagrams, API specs)
- **Deployment configuration** access (environment variables, secrets management)

### Environment Setup
- semgrep with custom rules for taint analysis
- CodeQL database built for target language
- ripgrep for fast pattern searching
- jadx for Android APK decompilation (if applicable)
- Source map extraction tools for minified JavaScript
- AST parsing tools for target language (tree-sitter, babel, etc.)

## Core Workflow

### Phase 1: Discovery
1. **Architecture Mapping**: Identify application layers (routing, controllers, services, data access, templates). Map data flow from HTTP entry points through business logic to database/file/external sinks.
2. **Entry Point Enumeration**: Catalog all user-controlled input sources — HTTP parameters, headers, cookies, file uploads, WebSocket messages, environment variables, database reads of user-stored data.
3. **Security Pattern Inventory**: Identify existing security controls — input validation functions, output encoding helpers, parameterized query patterns, CSRF protections, authentication middleware, rate limiters.
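As a rough illustration of entry point enumeration, the sketch below scans source text for input accessors and records where they occur. The patterns assume Flask-style `request.*` accessors, and `EntryPoint` and `enumerate_entry_points` are hypothetical names introduced for this example; a real audit would extend the patterns per framework or use semgrep/ripgrep rules instead.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumed Flask-style accessors); extend per framework.
SOURCE_PATTERNS = {
    "http_param": re.compile(r"request\.(args|form|values)\b"),
    "header": re.compile(r"request\.headers\b"),
    "cookie": re.compile(r"request\.cookies\b"),
    "file_upload": re.compile(r"request\.files\b"),
}


@dataclass(frozen=True)
class EntryPoint:
    kind: str      # category of user-controlled input
    line_no: int   # 1-based line number in the scanned text
    text: str      # the matching source line, stripped


def enumerate_entry_points(source: str) -> list[EntryPoint]:
    """Return every line that reads from a known user-controlled source."""
    found = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in SOURCE_PATTERNS.items():
            if pattern.search(line):
                found.append(EntryPoint(kind, line_no, line.strip()))
    return found


# Hypothetical handler used only to exercise the scanner.
SAMPLE = '''
@app.route("/search")
def search():
    q = request.args.get("q")
    token = request.headers.get("X-Token")
    return run_query(q)
'''

for entry in enumerate_entry_points(SAMPLE):
    print(entry.kind, entry.line_no, entry.text)
```

A regex pass like this only seeds the catalog; sources such as WebSocket messages or stored user data usually need manual review or AST-level queries.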

### Phase 2: Vulnerability Analysis (5 Parallel Tracks)
4. **Injection Sink Hunting**: Trace taint backward from SQL, command, file, and template sinks to their sources. Classify each sink by slot type: SQL-val, SQL-ident, CMD-argument, FILE-path, TEMPLATE-expr. Verify whether parameterization or sanitization breaks the taint chain.
5. **XSS Render Context Analysis**: Identify all dynamic output points in templates/responses. Classify each by render context: HTML_BODY, HTML_ATTRIBUTE, JAVASCRIPT_STRING, URL_PARAM, CSS_VALUE. Verify context-appropriate encoding is applied at each output point.
6. **Authentication Checklist (9-point)**: Transport security, rate limiting, session management, token properties, session fixation resistance, password policy enforcement, login response uniformity, account recovery security, SSO/OAuth implementation.
7. **Authorization Model Review (3-type)**: Horizontal (same-role cross-user access), vertical (privilege escalation across roles), context-workflow (state-dependent authorization bypass).
8. **SSRF Sink Hunting**: Identify all outbound request sinks. Classify by type: classic (direct URL), blind (no response), semi-blind (partial response), stored (deferred execution). Trace URL construction from user input to request dispatch.
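The backward direction of the taint analysis used in these tracks can be sketched on a toy data-flow graph. Everything here is an assumption made for illustration: the `FLOWS`, `SOURCES`, and `SANITIZERS` tables are hand-written stand-ins for facts that would normally come from an AST or from semgrep/CodeQL output, and `backward_taint` is a hypothetical helper, not part of any tool.

```python
from collections import deque

# Toy data-flow facts: each node maps to the nodes it was derived from.
FLOWS = {
    "query": ["order_col", "user_id_param"],
    "order_col": ["request.args.sort"],
    "user_id_param": ["int(request.args.id)"],
    "int(request.args.id)": ["request.args.id"],
}
SOURCES = {"request.args.sort", "request.args.id"}
# Nodes assumed to break the taint chain (here, an integer cast).
SANITIZERS = {"int(request.args.id)"}


def backward_taint(sink_var):
    """Return every unsanitized path from the sink back to a source."""
    paths = []
    queue = deque([[sink_var]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in SANITIZERS:
            continue  # taint chain is broken; stop following this path
        if node in SOURCES:
            paths.append(path)  # complete sink-to-source chain
            continue
        for parent in FLOWS.get(node, []):
            if parent not in path:  # guard against cycles
                queue.append(path + [parent])
    return paths


for chain in backward_taint("query"):
    print(" <- ".join(chain))
```

In this toy graph only the `order_col` chain survives, because the `int()` cast is treated as a sanitizer for the ID parameter. Whether a given function really breaks the chain depends on the slot type of the sink, so that judgment stays with the reviewer.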

### Phase 3: Synthesis
9. **Confidence Scoring & Exploitation Queue**: Score each finding by taint chain completeness, sanitization bypass likelihood, and impact severity. Generate exploitation queue JSON for downstream exploit validation.
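One possible shape for the scoring and queue generation is sketched below. The field names, the impact weights, and the multiplicative formula are all assumptions chosen for illustration; the skill does not specify a schema or formula here, so treat this as a starting point rather than the expected output format.

```python
import json

# Assumed weights; tune to the engagement's risk model.
IMPACT_WEIGHT = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def score(finding):
    """Combine chain completeness, bypass likelihood, and impact into one number."""
    chain = 1.0 if finding["chain_complete"] else 0.5
    return round(chain * finding["bypass_likelihood"] * IMPACT_WEIGHT[finding["impact"]], 2)


def build_queue(findings):
    """Return the findings as JSON, highest score first, with rank and score attached."""
    ranked = sorted(findings, key=score, reverse=True)
    queue = [{**f, "score": score(f), "rank": i} for i, f in enumerate(ranked, start=1)]
    return json.dumps(queue, indent=2)


# Hypothetical findings used only to exercise the ranking.
FINDINGS = [
    {"id": "xss-profile-bio", "slot_type": "HTML_BODY",
     "chain_complete": True, "bypass_likelihood": 0.2, "impact": "medium"},
    {"id": "sqli-orders-sort", "slot_type": "SQL-ident",
     "chain_complete": True, "bypass_likelihood": 0.9, "impact": "high"},
    {"id": "ssrf-webhook-url", "slot_type": "SSRF-blind",
     "chain_complete": False, "bypass_likelihood": 0.8, "impact": "critical"},
]

print(build_queue(FINDINGS))
```

Halving the score for an incomplete chain keeps partially traced findings in the queue while ranking them below fully traced ones of similar impact.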

## Slot Type Classification

| Slot Type | Sink Pattern | Sanitization Required |
|-----------|-------------|----------------------|
| SQL-val | Query parameter value position | Parameterized query / prepared statement |
| SQL-ident | Table name, column name, ORDER BY | Allowlist validation |
| CMD-argument | Shell command argument | Shell-free argument-vector execution (or argument escaping) + allowlist |
| FILE-path | File read/write path construction | Path canonicalization + allowlist |
| TEMPLATE-expr | Template engine expression | Pass input as template data, never as template source; context-aware auto-escaping on output |
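The difference between the SQL-val and SQL-ident slots can be shown with the standard library's `sqlite3` module. This is a minimal sketch: the `users` table, the `list_users` function, and the `ALLOWED_SORT_COLUMNS` allowlist are hypothetical and exist only to demonstrate the two defenses side by side.

```python
import sqlite3

ALLOWED_SORT_COLUMNS = {"name", "created_at"}  # hypothetical allowlist


def list_users(conn, role, sort_by):
    # SQL-ident slot: identifiers cannot be bound as parameters,
    # so the column name is validated against an allowlist.
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_by!r}")
    # SQL-val slot: the value is bound through a placeholder, never concatenated.
    sql = f"SELECT name FROM users WHERE role = ? ORDER BY {sort_by}"
    return [row[0] for row in conn.execute(sql, (role,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("bob", "admin", "2024-01-02"),
     ("alice", "admin", "2024-01-03"),
     ("eve", "guest", "2024-01-01")],
)

print(list_users(conn, "admin", "name"))
print(list_users(conn, "admin' OR '1'='1", "name"))  # payload is treated as a literal value
```

When reviewing a sink, a parameterized value with a concatenated `ORDER BY` column is still an open SQL-ident slot, which is why the two are classified separately.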

## Render Context Classification

| Context | Output Location | Encoding Required |
|---------|----------------|-------------------|
| HTML_BODY | Between HTML tags | HTML entity encoding |
| HTML_ATTRIBUTE | Inside attribute values | Attribute encoding + quoting |
| JAVASCRIPT_STRING | Inside JS string literals | JavaScript Unicode escaping |
| URL_PARAM | URL query parameter values | URL percent encoding |
| CSS_VALUE | Inside CSS property values | CSS hex encoding |
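To make the context dependence concrete, the sketch below encodes a value for four of the contexts in the table using only the Python standard library. The function names are hypothetical, CSS_VALUE is omitted because the standard library has no CSS encoder, and production code would normally rely on the template engine's own context-aware escaping rather than hand-rolled helpers.

```python
import html
import json
from urllib.parse import quote


def encode_html_body(value):
    # Escapes &, <, > for text placed between tags.
    return html.escape(value, quote=False)


def encode_html_attribute(value):
    # Also escapes " and '; only safe when the attribute value is quoted.
    return html.escape(value, quote=True)


def encode_url_param(value):
    # Percent-encodes everything outside the unreserved set, including / & =.
    return quote(value, safe="")


def encode_js_string(value):
    # json.dumps yields a quoted literal; <, >, & are additionally escaped so the
    # value cannot close an enclosing <script> block or open an HTML comment.
    literal = json.dumps(value)
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))


payload = '"><script>alert(1)</script>'
print(encode_html_body(payload))
print(encode_html_attribute(payload))
print(encode_url_param(payload))
print(encode_js_string(payload))
```

The same payload yields four different encodings, which is the reason each output point has to be classified before its encoding can be judged adequate.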

## Tool Categories

| Category | Tools | Purpose |
|----------|-------|---------|
| Taint Analysis | semgrep, CodeQL | Automated sink-to-source taint tracing |
| Pattern Search | ripgrep, ast-grep | Fast code pattern matching |
| Decompilation | jadx, sourcemap-extract | Recover source from compiled artifacts |
| AST Parsing | tree-sitter, babel | Language-aware code structure analysis |
| Dependency Audit | npm audit, pip-audit, snyk | Known vulnerability detection |

## References
- `references/tools.md` - Tool function signatures and parameters
- `references/workflows.md` - Taint analysis workflows and vulnerability patterns