git clone --depth 1 https://github.com/oaustegard/claude-skills /tmp/featuring && cp -r /tmp/featuring/featuring ~/.claude/skills/featuring
# Featuring
Generate `_FEATURES.md` files — top-down documentation of what a codebase **does**,
organized by feature/capability, anchored to specific source symbols.
**tree-sitting** tells you WHAT symbols exist.
**_FEATURES.md** tells you WHY they exist and what they accomplish together.
For large codebases, the root `_FEATURES.md` decomposes into sub-feature files
linked by capability area — not by folder structure. An agent starts at the root
and is drawn into sub-files only when working on a relevant area.
## Dependency
Requires the **tree-sitting** skill; this skill uses its engine for AST scanning.
```bash
uv venv /home/claude/.venv 2>/dev/null
uv pip install tree-sitter-language-pack --python /home/claude/.venv/bin/python
```
For quick structural orientation before running gather.py, use tree-sitting's CLI:
```bash
TREESIT=/mnt/skills/user/tree-sitting/scripts/treesit.py
# Complete tree, sparse detail — see the full shape
/home/claude/.venv/bin/python $TREESIT /path/to/repo --depth=-1 --detail=sparse
```
## Workflow: Multi-Pass Synthesis
Feature documentation is built in three passes. The overview is written LAST,
after all features are understood — not first.
### Pass 1: Orientation (quick scan)
```bash
/home/claude/.venv/bin/python /mnt/skills/user/featuring/scripts/gather.py /path/to/repo \
--skip tests,.github,node_modules --source-budget 8000
```
Read the gather output. Before writing anything, form a hypothesis:
> "This codebase appears to be a **[what it is]** that provides **[capability A]**,
> **[capability B]**, and **[capability C]**."
Write this down as a DRAFT overview. It will be wrong or incomplete — that's fine.
The point is to orient before diving into detail.
**How to identify capability areas:**
1. **What can a user/consumer DO with this?** (commands, API endpoints, UI actions)
2. **What problems does it solve?** (the WHY behind the code)
3. **What are the main workflows?** (how features compose)
4. **What are the constraints/invariants?** (rules the code enforces)
### Pass 2: Detailed feature extraction
For each capability area identified in Pass 1:
1. Gather the symbols that implement it (from gather output + targeted `get_source()`)
2. Understand how they collaborate — the workflow
3. Identify constraints and invariants
4. Write the feature section
During this pass, you'll discover:
- Capabilities you missed in Pass 1
- Features that are more complex than expected (decomposition candidates)
- Features that are simpler than expected (merge candidates)
- Cross-cutting concerns that span multiple capability areas
**Hierarchy decision** (per feature, during this pass):
| Signal | Action |
|--------|--------|
| ≤6 key symbols, self-contained | Inline in root `_FEATURES.md` |
| >6 key symbols OR clear sub-capabilities | Own `_FEATURES.md` sub-file |
| Spans many files but is ONE capability | Inline (breadth ≠ complexity) |
| Has sub-features that are independently useful | Own sub-file |
| Is infrastructure (logging, DB layer) | Inline briefly, unless it IS the product |
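The table's signals can be read as a small decision function. The sketch below is one possible encoding, not part of the skill: `FeatureSignals` and `placement` are hypothetical names, and checking the infrastructure row first is an assumption about precedence that the table itself does not state.

```python
from dataclasses import dataclass


@dataclass
class FeatureSignals:
    """Hypothetical summary of one capability area, collected during Pass 2."""
    key_symbols: int                     # number of key symbols implementing the feature
    has_sub_capabilities: bool = False   # clear or independently useful sub-features
    is_infrastructure: bool = False      # e.g. logging, DB layer
    is_the_product: bool = False         # infrastructure that IS the product


def placement(signals: FeatureSignals) -> str:
    """Return 'inline' or 'sub-file' following the hierarchy table."""
    # Assumed precedence: infrastructure stays inline unless it is the product.
    if signals.is_infrastructure and not signals.is_the_product:
        return "inline"
    # More than six key symbols, or clear sub-capabilities, earns a sub-file.
    if signals.key_symbols > 6 or signals.has_sub_capabilities:
        return "sub-file"
    # File count is deliberately not an input: breadth is not complexity.
    return "inline"
```

For example, `placement(FeatureSignals(key_symbols=9))` yields `"sub-file"`, while a logging layer with the same symbol count stays inline.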
### Pass 3: Overview rewrite
NOW — after all features are documented — rewrite the overview. The Pass 1
draft was a hypothesis. Pass 3 replaces it with a proper progressive-disclosure
overview that:
1. States what the codebase is in one sentence
2. Lists the top-level capability areas (3-8 items)
3. For each area that has a sub-file: one sentence + link + "read when" guidance
4. For inline features: just the list entry (detail is below in the same file)
This is the most important part. The overview IS the entry point for every
agent session. It must be accurate, complete, and fast to scan.
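As a rough illustration of the four points above, the overview header can be assembled mechanically once the areas are known. This is a minimal sketch under assumptions: `Area` and `render_overview` are hypothetical helpers, not something the skill ships, and the "read when" guidance is left to the per-area section rather than the list entry.

```python
from dataclasses import dataclass
from typing import Optional

DASH = "\u2014"   # separator used in the root-file template
ARROW = "\u2192"  # arrow that precedes a sub-file link in the template


@dataclass
class Area:
    """Hypothetical record for one capability area documented in Pass 2."""
    name: str
    summary: str
    sub_file: Optional[str] = None  # path to the area's _FEATURES.md, if it has one


def render_overview(project: str, description: str, areas: list[Area]) -> str:
    """Render the root header, one-sentence description, and capability list."""
    if not 3 <= len(areas) <= 8:
        raise ValueError("expected 3-8 top-level capability areas")
    lines = [f"# Features: {project}", "", f"> {description}", "", "**Capability areas:**"]
    for area in areas:
        entry = f"- **{area.name}** {DASH} {area.summary}"
        if area.sub_file:
            # Areas with a sub-file get a link; inline areas are a bare list entry.
            entry += f" {ARROW} [details]({area.sub_file})"
        lines.append(entry)
    return "\n".join(lines)
```

The 3-8 bound is enforced as an error here only to make the guideline visible; a real workflow might prefer a warning.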
## _FEATURES.md Format
### Root file
```markdown
# Features: {project-name}
> One-sentence description of what this codebase is and does.
**Capability areas:**
- **[Area A]** — one-sentence summary
- **[Area B]** — one-sentence summary → [details](path/to/_FEATURES.md)
- **[Area C]** — one-sentence summary
## {Inline Feature Name}
{2-3 sentences: what this feature does from a user perspective.}
**Key symbols:**
- `file.py#function_name` — role in this feature
- `file.py#ClassName` — role in this feature
**Workflow:** {How a user exercises this feature or how symbols collaborate.}
**Constraints:** {Invariants, limits, rules.}
---
## {Complex Feature Area}
> One-sentence summary of what this area covers.
This area is documented in detail in [{area-name}/_FEATURES.md]({path}).
Read it when working on {specific trigger — e.g., "the memory retrieval pipeline",
"adding a new API endpoint", "modifying the build system"}.
At a glance, this area provides:
- {sub-capability 1} — one line
- {sub-capability 2} — one line
- {sub-capability 3} — one line
```
### Sub-feature files
Sub-feature files follow the SAME format as the root, recursively. They can
contain inline features and further sub-file references. Each sub-file:
- Has its own `# Features: {area-name}` header
- Has its own overview paragraph
- Is self-contained — an agent reading only this file understands the area
- Links back to the root: `← [Root features](../_FEATURES.md)`
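The back-link depends on how deep the sub-file sits. A small sketch of computing it, assuming both paths are given relative to the repo root with forward slashes (`root_backlink` is a hypothetical helper name):

```python
import posixpath


def root_backlink(sub_file: str, root_file: str = "_FEATURES.md") -> str:
    """Build the back-link line for a sub-file.

    Both paths are assumed to be relative to the repo root.
    """
    start = posixpath.dirname(sub_file) or "."
    rel = posixpath.relpath(root_file, start=start)
    return f"← [Root features]({rel})"
```

A sub-file at `memory/_FEATURES.md` links to `../_FEATURES.md`; one nested a level deeper links to `../../_FEATURES.md`.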
### Format rules
- **Organized by capability**, not by file/directory
- **Symbol references** use `file#symbol` notation (relative to repo root)
- **Leading paragraph** per feature: what a user gets, not implementation details
- **Key symbols**: the 2-6 most important symbols, with their role explained
- **Workflow**: how the feature works end-to-end (include when non-obvious)
- **Constraints**: rules/invariants (include when they exist)
- **No source code** in _FEATURES.md — it's a map, not a mirror
- **"Read when" guidance** on every sub-file link — tells agents WHEN to drill in
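Two of these rules are mechanical enough to check automatically. The sketch below is a hypothetical linter, not part of the skill: it flags fenced code blocks (the "no source code" rule) and `file#symbol` references whose path is not relative to the repo root. The regular expression is an assumption about what a symbol reference looks like.

```python
import re

# `file#symbol` inside backticks, e.g. `src/store.py#remember` (assumed shape)
SYMBOL_REF = re.compile(r"`([^`#\s]+)#([A-Za-z_][\w.]*)`")
FENCE = "`" * 3  # built indirectly so this example can sit inside a fence


def lint_features(text: str) -> list[str]:
    """Return a list of rule violations found in a _FEATURES.md body."""
    problems = []
    # Rule: no source code, so any fenced block is suspicious.
    if any(line.lstrip().startswith(FENCE) for line in text.splitlines()):
        problems.append("contains a fenced code block")
    # Rule: symbol references are relative to the repo root.
    for path, symbol in SYMBOL_REF.findall(text):
        if path.startswith(("/", "./", "../")):
            problems.append(f"{path}#{symbol}: path should be relative to the repo root")
    return problems
```

The remaining rules (capability-first organization, user-facing leading paragraphs) need judgment and are not covered by this sketch.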
### What makes a good feature entry
Good: "**Memory Storage** — Persist observations across sessions. Stores typed,
tagged memories to a Turso database with BM25 full-text search. Memories have
priority levels that affect retrieval ranking."
Bad: "**memory.py** — Contains `rem