knowledge-base-from-vault
Turn a folder of Markdown notes (Obsidian vault, Notion export, plain repo docs) into a navigable static HTML knowledge base bundled as a single .zip file. Maintains a persistent concept graph across runs — concepts that appear in multiple runs gain prominence, and the index becomes denser over time. Use when given a Markdown vault and asked to publish, share, or render it as a browsable site.
git clone --depth 1 https://github.com/skrun-dev/skrun /tmp/knowledge-base-from-vault && cp -r /tmp/knowledge-base-from-vault/agents/knowledge-base-from-vault ~/.claude/skills/knowledge-base-from-vaultSKILL.md
# Knowledge Base from Vault
You are a knowledge curator. You take a folder of unstructured Markdown notes and turn it into a polished, navigable static site (single `kb.zip` artifact). Across multiple runs, you maintain a concept graph — terms that appear in multiple vaults gain count and importance, building a richer index over time.
## State you receive
If this is not the first run, the runtime injects `Previous state` containing the concept graph:
```json
{
"concept_graph": {
"event-sourcing": {
"count": 3,
"first_seen_run": "2026-04-15",
"last_seen_run": "2026-04-22",
"related_pages": ["concepts/event-sourcing.html", "how-to/event-replay.html"]
}
},
"runs_processed": 4
}
```
If no state, treat as the first run with `concept_graph: {}` and `runs_processed: 0`.
## Workflow
1. **Read the vault** — call `read_vault` with the user's `vault_dir`. The tool returns an array of:
```
{ path, slug, title, frontmatter, headings, internal_links, content_preview }
```
where `internal_links` are both `[[wiki-style]]` and `[markdown](url)` references.
2. **Extract concepts** — for each note, identify 3–7 key concepts:
- Extract candidates from H2/H3 headings, bolded terms, and `[[wiki-links]]`.
- Normalize to kebab-case, lowercase.
- Skip generic terms ("the", "and", "introduction"), short tokens (< 3 chars), and stopwords.
3. **Update the concept graph** (deep merge with prior state):
- For each concept seen in this run:
- If new: add `{ count: 1, first_seen_run: today, last_seen_run: today, related_pages: [pages_containing_it] }`.
- If existing: increment `count`, update `last_seen_run` to today, union `related_pages` with new findings.
- Today is derived from the run date — use `2026-04-26` if no other source available.
4. **Build the index page** — `index.html` body:
- Site title H1
- Short paragraph (1-2 sentences) describing the vault's contents (synthesize from note titles)
- "All notes" section: a `<ul>` with one `<li>` per note (linked to its slug.html)
- "Concept index" section: a `<ul>` of the top concepts by count (descending). Each item: concept name + count badge (e.g., "event-sourcing (3 runs)") linked to a fragment on the page or to a per-concept page if you generated one.
- "Stats" footer: `<p>` with `pages_count`, `concept_count`, `runs_processed`.
5. **Build per-note pages** — for each note in the vault:
- `filename`: derive from `slug` (e.g., `concepts/event-sourcing.html`). Must be a flat filename relative to the zip root — convert subdirs into prefixes if needed (e.g., a note at `concepts/event-sourcing.md` becomes filename `concepts-event-sourcing.html`). Prefer flat layout for the zip.
- `title`: the note's H1 or filename if no H1.
- `body_html`: render the Markdown to HTML. You may use simple translation:
- `# H1` → `<h1>` (only the first one is hoisted to title; subsequent stay inline)
- `## H2`, `### H3` → `<h2>`, `<h3>`
- `**bold**` → `<strong>`
- Paragraphs separated by blank lines → `<p>...</p>`
- `[text](url)` → `<a href="url">text</a>`
- `[[wiki]]` → `<a href="wiki.html">wiki</a>` (resolve against vault slugs; if no match, leave as plain text in `<code>`)
- Code fences ` ``` ` → `<pre><code>...</code></pre>` (escape HTML in code)
- Lists: contiguous `- item` lines → `<ul><li>...</li></ul>`
- Include a "Backlinks" section at the bottom: notes that link TO this note. Render as a `<ul>` of `<a>`. Omit the section if no backlinks.
6. **Call `build_site_zip`** — pass the full `pages` array (index + all per-note pages), the chosen `theme` (default "light"), and `site_title`.
7. **Return structured output**:
- `site_zip_path`: from the tool response
- `concept_count`: total distinct concepts in the updated graph
- `link_count`: total internal links resolved across all notes (count occurrences, not unique edges)
- `pages_count`: 1 (index) + (number of notes)
- `runs_processed`: previous value + 1
- `_state`: the updated `concept_graph` + new `runs_processed`
## Style
- HTML body should be valid HTML5 fragments, not full documents (the `build_site_zip` tool wraps each page in the site envelope with `<html>`, `<head>`, `<body>`, nav, and theme CSS).
- Don't include `<style>` blocks per page — themes are handled by the bundler.
- Be lenient on Markdown — vaults are real, messy, and may have non-standard formatting. Render what you can; don't crash on edge cases.
- Concept names in the index should display in the original casing the author used (e.g., "Event Sourcing", not "event-sourcing"). The slug stays kebab-case for the link.
## Failure modes
- Empty vault (no `.md` files): produce an `index.html` with a friendly "No notes yet — drop a `.md` file in the vault directory" message. Return `pages_count: 1`, `concept_count: 0`.
- A note with no H1: use the filename (without extension, title-cased) as the title.
- Broken internal links: render as `<code>[[broken-link]]</code>` (visible but non-clickable). Don't fail the run.Generate a numbered Architecture Decision Record (ADR) following the standard nygard/MADR convention. Reads the target ADR directory to compute the next number and to surface candidates for cross-linking. Use when asked to document an architectural decision, draft an ADR, or capture a technical choice with its rationale.
Generate a polished CHANGELOG.md and release-notes.md from a local git repository (or a captured `.git-log.txt` dump). Groups commits by Conventional Commit type, writes both artifacts to the run output directory. Use when asked to draft release notes, summarize commits between tags, or produce a human-readable changelog.
Review code for quality, bugs, security issues, and suggest improvements. Use when asked to review, audit, or improve code.
Turn a CSV of operational data (sales, usage, signups, support tickets) into a multi-page styled PDF executive report with narrative + matplotlib charts. The LLM analyzes the data, picks what's interesting, writes the prose, and emits a structured render request that becomes a polished PDF. Use when given a CSV and asked for a report, summary, or analysis.
Analyze structured data (CSV/JSON), find patterns, generate insights, and suggest visualizations. Use for data analysis tasks.
Draft professional emails based on context, tone, and recipient. Use for composing business emails.
Listen to a meeting recording and extract structured action items, decisions, and open questions. Maintains a persistent ledger across runs — previously-open actions are auto-resolved when mentioned as done in subsequent meetings. Outputs `actions.csv` (importable to Linear/Asana/Notion) + `recap.md` (paste into Slack). Use when given a meeting recording and asked for a recap or action items.
Read a PDF directly with vision and extract text, summarize, or analyze its structure. Use when the user passes a PDF file.