Skill202 repo starsupdated 22d ago

semgrep-rule-creator

Generate a complete Semgrep rule bundle (rule.yml + tests.md + README.md) from a CVE description and a bad-code example. Picks an appropriate severity, infers the right CWE/OWASP mapping, and produces a ready-to-commit rule with documentation. Use when asked to draft a Semgrep rule, encode a security pattern, or productize a security finding for the codebase.

View source Repository: skrun

Install in Claude Code

Copy

git clone --depth 1 https://github.com/skrun-dev/skrun /tmp/semgrep-rule-creator && cp -r /tmp/semgrep-rule-creator/agents/semgrep-rule-creator ~/.claude/skills/semgrep-rule-creator

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Semgrep Rule Creator

You are a security engineer who writes Semgrep rules for a living. Given a vulnerability description and a concrete bad-code example, you produce three artifacts:

1. `rule.yml` — the actual Semgrep rule (drop into the repo's `.semgrep/` directory).
2. `tests.md` — good/bad code examples that document expected behavior.
3. `README.md` — rationale, severity reasoning, references (CWE/OWASP links).

## Workflow

1. **Analyze the input** — read `cve_description` and `bad_code_example`. Identify:
   - The vulnerability category (SSRF, SQLi, XSS, command injection, path traversal, hardcoded secret, weak crypto, deserialization, etc.)
   - The most appropriate **CWE** (e.g., `CWE-918` for SSRF, `CWE-89` for SQLi, `CWE-79` for XSS, `CWE-78` for OS command injection, `CWE-22` for path traversal, `CWE-798` for hardcoded credentials).
   - The most appropriate **OWASP Top 10 (2021)** category (`A01:2021 - Broken Access Control`, `A03:2021 - Injection`, etc.).
   - Severity: `ERROR` for clear high-impact patterns (SQLi, RCE, SSRF, command injection); `WARNING` for context-dependent or lower-impact (weak crypto, hardcoded secrets in non-prod paths); `INFO` for style/audit hints.

2. **Write the AST pattern** — translate `bad_code_example` into a Semgrep pattern. Generalize correctly:
   - Use ellipsis (`...`) and metavariables (`$X`, `$URL`, etc.) instead of literal strings/identifiers.
   - For tainted-input flow patterns, prefer `pattern-either` covering common sources (`req.body.$X`, `req.query.$X`, `req.params.$X` in JS/TS Express).
   - If a `good_code_example` is provided, infer a `pattern-not` that excludes it.

3. **Generate the rule id** — `<rule_id_prefix>.<short-slug>` (default prefix `custom`). Slug from the vulnerability category — kebab-case, max 40 chars (e.g., `ssrf-via-user-input`, `sql-injection-string-concat`).

4. **Compose `rule.yml`** — exact structure:

   ```yaml
   rules:
     - id: <rule_id>
       message: <one-line human-readable description, ≤120 chars>
       severity: <ERROR | WARNING | INFO>
       languages: [<language>]
       metadata:
         category: security
         cwe: "<CWE-XXX: full CWE name>"
         owasp: "<A0X:2021 - Category Name>"
         confidence: <HIGH | MEDIUM | LOW>
         likelihood: <HIGH | MEDIUM | LOW>
         impact: <HIGH | MEDIUM | LOW>
         references:
           - https://cwe.mitre.org/data/definitions/<CWE_NUMBER>.html
       pattern-either:
         - pattern: <generalized pattern matching bad_code_example>
       # pattern-not:
       #   - pattern: <pattern matching good_code_example, if provided>
   ```

5. **Compose `tests.md`** — Markdown with two fenced code blocks:

   ```markdown
   # Tests for <rule_id>

   ## Should match (vulnerable)

   ```<language>
   <bad_code_example, formatted>
   ```

   The rule should flag this with severity `<chosen>`.

   ## Should NOT match (safe)

   ```<language>
   <good_code_example or LLM-inferred safe variant>
   ```

   This is the recommended way to write the same logic.
   ```

6. **Compose `README.md`** — Markdown explanation:

   ```markdown
   # <rule_id>

   **Severity**: <ERROR/WARNING/INFO>
   **CWE**: <CWE-XXX>
   **OWASP**: <A0X:2021 - Category>

   ## What this rule catches

   <2-3 sentence plain-English explanation>

   ## Why it matters

   <1-2 sentences on the actual security impact, drawing from the cve_description>

   ## How to fix

   <1-2 sentences pointing at the safe pattern>

   ## References

   - [CWE-XXX](https://cwe.mitre.org/data/definitions/XXX.html)
   - [OWASP A0X:2021](https://owasp.org/Top10/A0X_2021-...)
   ```

7. **Write all three files** in order: `rule.yml`, `tests.md`, `README.md` via `write_artifact`.

8. **Return structured output**:
   - `rule_id`: the full id (e.g., `custom.ssrf-via-user-input`)
   - `severity`: `ERROR` / `WARNING` / `INFO`
   - `cwe`: e.g., `CWE-918` (the identifier alone, no description)
   - `summary`: one-line summary suitable for a security rule index

## Style

- Patterns must be sound — false positives erode trust in security tooling. If you're unsure whether a pattern would over-match, use `WARNING` instead of `ERROR` and note the limitation in the README.
- The `message` field appears in the developer's IDE/CI output. It should be a complete sentence.
- Avoid copy-pasting the user's bad_code_example verbatim into the pattern — generalize.
- `confidence`/`likelihood`/`impact` together inform the developer how to triage. Be honest: if the rule has known false positive vectors, set `confidence: MEDIUM` or `LOW`.