wiki_capture
**wiki_capture** stores durable organizational knowledge such as metric definitions, filter conventions, domain rules, and system mappings that should be reused across multiple analytics conversations. Use this skill to capture user-expressed or ingested business rules and conventions, while avoiding one-off requests, temporary instructions, query snapshots, and information already defined in the semantic layer. The tool automatically routes captures to personal scope for user-driven chats or global scope for admin ingests, depending on configuration.
git clone --depth 1 https://github.com/Kaelio/ktx /tmp/wiki_capture && cp -r /tmp/wiki_capture/packages/cli/src/skills/wiki_capture ~/.claude/skills/wiki_capture
# Wiki Capture
## Role
The knowledge base stores durable, reusable business knowledge for an analytics assistant. Each page is a self-contained rule, definition, or convention that answers "how should this concept be handled in this organization?" - written once and reused across chats.
Scope selection is handled by the runtime:
- When user-scoped knowledge is enabled AND the caller is a chat turn, writes go to the user's **personal** scope.
- When the caller is an admin-driven ingest (`sourceType: 'external_ingest'`), writes go to the **global** scope.
- When user-scoped knowledge is disabled, all writes go to the global scope.
The `wiki_write` tool picks the right scope based on the session. Capture logic does not need to choose - focus on whether the content is worth capturing at all.
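The three routing rules above can be sketched as a small function. This is only an illustration of the decision, not the runtime's actual code: `resolve_scope` and the `"chat"` source type label are hypothetical names, and only `'external_ingest'` comes from the rules themselves.

```python
from typing import Literal

Scope = Literal["personal", "global"]


def resolve_scope(user_scoped_enabled: bool, source_type: str) -> Scope:
    """Mirror the three scope rules listed above.

    Assumption: "chat" is a hypothetical label for a user chat turn;
    only "external_ingest" is named in the rules.
    """
    if not user_scoped_enabled:
        # User-scoped knowledge disabled: every write goes to global.
        return "global"
    if source_type == "external_ingest":
        # Admin-driven ingest writes to the global scope.
        return "global"
    if source_type == "chat":
        # User-scoped knowledge enabled and the caller is a chat turn.
        return "personal"
    # The rules do not cover other callers, so refuse to guess.
    raise ValueError(f"no scope rule for source type {source_type!r}")
```

Raising on an unknown caller is a design choice for the sketch; the rules above do not say how the runtime treats other source types.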
## What to capture
Capture when the user or the ingested document expresses:
- A metric definition ("revenue means booked revenue after refunds").
- A filter or convention that should always apply ("exclude test accounts when reporting ARR").
- A mapping or alias ("mood_stress_sleep = Oxytocin protocol").
- A domain rule that is not visible from column names alone ("status = 'T' means terminated, not 'terminated'").
- A link or external system convention ("medplum_patient_id is the primary key in the EMR at https://emr.example/patients/{id}").
Do NOT capture:
- One-off requests ("answer under 100 words").
- Temporary instructions scoped to the current chat.
- Ad-hoc formatting preferences.
- Information already present in the semantic layer (column names, join paths, measure formulas - those belong in SL).
- **Query results, snapshots, or time-bounded benchmark tables.** Numbers go stale; pasting "Oct 2025: 25%, Nov 2025: 19.9%, …" creates misinformation as soon as new data lands. Reference the SL source by name (`sl_refs`) and let future query tools pull live data - the wiki captures the *rule* (definition, exclusion, segmentation), the SL source captures the *measure*, and query execution captures the *current values*.
- **Interpretive narrative tied to a specific snapshot** ("M1 retention degraded sharply from Dec 2025"). The observation is anchored to data that will move; the actionable convention (e.g., "always exclude in-progress cohorts") may be worth capturing on its own, but the snapshot-specific commentary is not.
If nothing is worth capturing, respond without calling any tool.
## Workflow
1. Read the wiki index (provided in the prompt) and decide whether the turn introduces durable knowledge.
2. **Before writing**, search for related content so cross-references are accurate:
   - `discover_data` first when a page relates to data or SL concepts - find existing wiki pages, SL sources, and raw warehouse schema together.
   - `wiki_search` with the topic - find related wiki pages to populate `refs`.
   - `sl_discover` with the concept - if the page defines a metric (revenue, churn, retention, LTV, ARR, MRR, CAC, attribution, etc.), find matching SL sources or measures to populate `sl_refs`. If no matches, pass `sl_refs: []` so future readers know you checked.
3. If updating an existing page, `wiki_read` it first. Use the returned `structured.content` or markdown body as the exact stored text for targeted replacements; current tags, refs, and sl_refs are returned in structured metadata.
4. `wiki_write` to create or update. Prefer merging into an existing page over creating a new one.
5. `wiki_remove` only when a page is truly obsolete - not to replace stale content (update it instead).
For bundle/external ingest, include `rawPaths` on every `wiki_write`/`wiki_remove` call with only the raw files that directly support that wiki action. This keeps ingest provenance tied to the actual source file, not every file in the WorkUnit.
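As a rough sketch of what the arguments to a `wiki_write` call might look like after the search steps, the helper below always sends `sl_refs` (empty when nothing matched) and adds `rawPaths` only for ingest. The field names `key`, `summary`, and `content` and the helper itself are assumptions; the actual tool schema may differ.

```python
from typing import Any, Optional


def build_wiki_write_args(
    key: str,
    summary: str,
    content: str,
    refs: list[str],
    sl_refs: list[str],
    raw_paths: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble wiki_write arguments (hypothetical helper, assumed field names)."""
    args: dict[str, Any] = {
        "key": key,
        "summary": summary,
        "content": content,
        "refs": list(refs),
        # Sent even when empty, so readers can tell the SL check was done.
        "sl_refs": list(sl_refs),
    }
    if raw_paths is not None:
        # Ingest only: the raw files that directly support this write,
        # deduplicated while keeping their original order.
        args["rawPaths"] = list(dict.fromkeys(raw_paths))
    return args


# Example ingest write; the raw path is a made-up illustration.
args = build_wiki_write_args(
    key="revenue-definition",
    summary="Revenue means booked revenue after refunds.",
    content="## Revenue\n- Revenue means booked revenue after refunds.",
    refs=[],
    sl_refs=[],
    raw_paths=["docs/metrics.md", "docs/metrics.md"],
)
```

A chat-turn write would simply omit `raw_paths`, leaving `rawPaths` out of the arguments.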
## Identifier Verification Protocol
Before writing a wiki page or SL source on any topic:
1. `discover_data({query: "<topic>"})` - see what wikis, SL sources, and raw tables already exist. Prefer updating existing pages over creating new ones.
Before emitting any `schema.table` or `schema.table.column` into a wiki body, SL source, `tables:` frontmatter, `sl_refs`, or `emit_unmapped_fallback`:
2. `entity_details({connectionId, targets: [{display: "<identifier>"}]})` - confirm the identifier resolves; inspect native types, FK/PK, and sampleValues.
3. For literal values from the source, such as status codes or plan tiers, check whether they appear in `entity_details` sampleValues for the relevant column. If sampleValues is short or the sample may have missed real values, run a `sql_execution` probe with the same warehouse connection id: `sql_execution({connectionId, sql: "SELECT DISTINCT <col> FROM <ref> LIMIT 50"})`.
4. If the candidate identifier still does not resolve, do one of:
   - Use `sql_execution({connectionId, sql: "SELECT 1 FROM <ref> LIMIT 0"})`. If it errors, the identifier is fictional.
   - Wrap the identifier in `[unverified - from <rawPath>]` in the wiki body, citing the exact raw path that mentioned it.
   - When recording `emit_unmapped_fallback` with `no_physical_table`, include the failing probe error in `clarification`.
5. Never copy `<schema>.<table>` placeholder strings from these instructions into output.
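The probe queries in steps 3 and 4 can be built with small helpers like the ones below. This is a sketch under assumptions: the helper names are hypothetical, the identifier pattern is an assumed shape for `schema.table` and `schema.table.column`, and appending the `[unverified - from …]` marker after the identifier is one reading of "wrap". The check also rejects placeholder strings such as `<schema>.<table>`, in the spirit of step 5.

```python
import re

# Plain dotted name with one to three parts (assumed identifier shape).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")


def check_identifier(ref: str) -> str:
    """Reject placeholders like <schema>.<table> and anything not a dotted name."""
    if not _IDENT.fullmatch(ref):
        raise ValueError(f"not a concrete identifier: {ref!r}")
    return ref


def existence_probe(ref: str) -> str:
    """SQL for step 4: returns no rows; an error suggests the table is fictional."""
    return f"SELECT 1 FROM {check_identifier(ref)} LIMIT 0"


def distinct_values_probe(column: str, ref: str) -> str:
    """SQL for step 3: list up to 50 distinct values of one column."""
    return (
        f"SELECT DISTINCT {check_identifier(column)} "
        f"FROM {check_identifier(ref)} LIMIT 50"
    )


def mark_unverified(identifier: str, raw_path: str) -> str:
    """Tag an unresolved identifier with the raw path that mentioned it."""
    return f"{identifier} [unverified - from {raw_path}]"
```

The returned strings would be passed as the `sql` argument of `sql_execution`; validating the identifier first keeps stray text from the source out of the query.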
## Keys, summaries, and content
- **Keys** are short kebab-case topic identifiers: `leads-source-filter`, `revenue-definition`, `churn-calculation`. No namespacing, no prefixes.
- **Summary** is a one-line hook (≤200 chars) shown in the index.
- **Content** is concise markdown - actionable rules, not prose.
```
## [Topic Title]
- Rule or preference statement
- Another rule if applicable
```
Prefer fewer, richer pages over many thin ones. Each page covers one coherent topic thoroughly. If the new information relates to an existing page, update that page instead of fragmenting the knowledge.
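The key and summary conventions above are easy to check before calling `wiki_write`. The following is a minimal sketch, assuming "kebab-case" means lowercase alphanumeric words joined by single hyphens; `page_problems` is a hypothetical helper, not part of the tool.

```python
import re

# Lowercase alphanumeric words joined by single hyphens (assumed kebab-case rule).
_KEBAB = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
MAX_SUMMARY_CHARS = 200


def page_problems(key: str, summary: str) -> list[str]:
    """Return a list of convention violations; an empty list means the page is fine."""
    problems: list[str] = []
    if not _KEBAB.fullmatch(key):
        # Also catches namespaced or prefixed keys such as "finance/revenue".
        problems.append("key must be kebab-case with no namespace or prefix")
    if "\n" in summary.strip():
        problems.append("summary must be a single line")
    if len(summary) > MAX_SUMMARY_CHARS:
        problems.append(f"summary exceeds {MAX_SUMMARY_CHARS} characters")
    return problems
```

Returning a list rather than raising lets a caller report every problem at once before deciding whether to write.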
## Tags, refs, sl_refs
The `wiki_write` tool accepts three array fields that go into the page frontmatter: `tags`, `refs`, and `sl_refs`.