ClaudeWave
Skill · 1.2k repo stars · updated today

ktx

The ktx skill installs and configures ktx, an open-source context layer for data agents, by running non-interactive setup with hidden CLI flags, establishing database and embedding connections, integrating agent rules, and verifying configuration readiness. Use this skill when users request ktx installation, data source connection, agent rule setup, schema ingestion, or troubleshooting of local ktx installations.

Install in Claude Code
git clone --depth 1 https://github.com/Kaelio/ktx /tmp/ktx && mkdir -p ~/.claude/skills && cp -r /tmp/ktx/skills/ktx ~/.claude/skills/ktx
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# ktx

Install and configure **ktx**, the open-source context layer for data agents.
Use this skill when a user wants an agent to add **ktx** to a project, connect
data sources, build initial context, install agent integration, or troubleshoot
a local **ktx** setup.

## Operating rules

- Act autonomously when the user asks you to install or configure **ktx**.
  The non-interactive scripted flow below is the canonical path — bare
  `ktx setup` is interactive (clack prompts) and an agent cannot drive it.
- Setup's non-interactive flags are intentionally hidden from `--help`. Use the
  flags listed below; verify uncommon flags against the docs at
  `https://docs.kaelio.com/ktx/` or this skill — not against `--help` output.
- Ask only for values you cannot infer: project directory, connection targets,
  credentials, account identifiers, and source selections.
- Prefer `file:/abs/path` secret refs over `env:VAR_NAME`. `env:` refs are
  re-resolved against the process environment on **every** `ktx` run, so a var
  exported only in the setup shell is gone when `ktx ingest` or `ktx mcp start`
  runs later — the secret silently resolves to empty and the connection fails.
  `file:` refs read from disk and survive across shells. The same caveat
  applies to `--*-api-key-env` flags: the named var must be present in every
  shell that runs `ktx`, including the `ktx mcp` daemon's environment.
- A literal database URL is safe to pass — `ktx setup` auto-externalizes it
  into `.ktx/secrets/<id>-url` and rewrites `ktx.yaml` to a `file:` ref (see
  workflow step 2). Source credential refs are **not** auto-externalized: write
  the secret to a file under `.ktx/secrets/` (`chmod 600`) and pass a `file:`
  ref. Never ask the user to paste a secret when a `file:` or `env:` ref works.
- Do not commit `.ktx/secrets/*`.
- Print each command you run and its result.
- Setup and ingest can run for many minutes (LLM-heavy source ingests take the
  longest), and from the outside a slow step looks identical to a stuck one.
  Don't go silent: say what's about to run and that it may take a while, then
  post brief progress/liveness updates while it runs (see step 4) so the user
  never has to wonder whether it stalled — otherwise they may kill it mid-run.
- If a command fails, identify the cause and change something before retrying.
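
The `file:` ref rule above can be sketched as follows. This is a minimal illustration, not part of the **ktx** CLI: the `PROJECT_DIR` and `SOURCE_SECRET` variables, the `source-api-key` file name, and the placeholder value are all hypothetical.

```bash
set -eu

# Hypothetical inputs: adjust the directory, file name, and source of the value.
project_dir="${PROJECT_DIR:-$PWD}"
secret_name="source-api-key"
secret_value="${SOURCE_SECRET:-example-not-a-real-secret}"

secrets_dir="$project_dir/.ktx/secrets"
mkdir -p "$secrets_dir"
secret_file="$secrets_dir/$secret_name"

# Write inside a subshell with a restrictive umask so a new file is never
# group- or world-readable, then pin the mode explicitly.
( umask 077; printf '%s' "$secret_value" > "$secret_file" )
chmod 600 "$secret_file"

# Build an absolute ref, since the skill asks for the file:/abs/path form.
secret_ref="file:$(cd "$secrets_dir" && pwd)/$secret_name"
echo "$secret_ref"
```

The printed `file:` ref is the value to pass wherever a source credential ref is expected, and it keeps resolving in later shells because it reads from disk.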

## Gather inputs once

Before invoking `ktx setup`, collect in one round:

1. Project directory (default: current working directory).
2. LLM backend and key strategy. In `--no-input` mode the CLI defaults to
   `anthropic` and **requires an API key**. When the user is inside Claude
   Code, pass `--llm-backend claude-code` explicitly; otherwise pass
   `--llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY`.
3. Embedding backend (`sentence-transformers` is the local default and needs
   no key; use `openai` only if the user already has a key, then pass
   `--embedding-api-key-env OPENAI_API_KEY`).
4. Database: driver, connection id, URL (or `env:` / `file:` ref), and one or
   more schemas.
5. Optional context sources (dbt, Metabase, Looker, LookML, MetricFlow,
   Notion). Add each one with a follow-up `ktx setup --source …` run (see
   [Add context sources](#add-context-sources)); use `--skip-sources` only
   when the user has none.

Do not discover these inputs across multiple setup runs.
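
The backend choices in items 2 and 3 can be expressed as small helpers that print the matching flags. This is a sketch only: the function names are hypothetical, and `--embedding-backend openai` is inferred from the `openai` option named above rather than confirmed by the docs.

```bash
# Print the LLM flags for a --no-input setup run.
# $1 is "yes" when the agent runs inside Claude Code, anything else otherwise.
llm_backend_flags() {
  if [ "${1:-no}" = "yes" ]; then
    echo "--llm-backend claude-code"
  else
    echo "--llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY"
  fi
}

# Print the embedding flags; $1 is the backend name.
# Anything other than "openai" falls back to the local, key-free default.
embedding_backend_flags() {
  case "${1:-sentence-transformers}" in
    openai) echo "--embedding-backend openai --embedding-api-key-env OPENAI_API_KEY" ;;
    *)      echo "--embedding-backend sentence-transformers" ;;
  esac
}

llm_backend_flags yes
embedding_backend_flags sentence-transformers
```

Resolving both choices before the first `ktx setup` call keeps the run to a single invocation instead of discovering missing inputs one failure at a time.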

## Install workflow

1. **Detect the install path.** If the working directory contains
   `packages/cli/dist/bin.js`, or a `pnpm-workspace.yaml` that references
   `@kaelio/ktx`, you are inside the **ktx** monorepo: build and link the
   local CLI with `pnpm` and do **not** run `npm install -g`. Otherwise:

   ```bash
   node --version    # require >= 22; stop and ask the user if older
   ktx --version || npm install -g @kaelio/ktx
   ```
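
   The snippet above states the `>= 22` requirement only in a comment. A
   minimal sketch of enforcing it follows; the `node_version_ok` helper is
   hypothetical and simply parses the major version out of a
   `node --version` string.

   ```bash
   # Succeeds when a version string such as "v22.3.0" has a major version >= 22.
   node_version_ok() {
     major="${1#v}"        # drop the leading "v"
     major="${major%%.*}"  # keep everything before the first dot
     case "$major" in
       ''|*[!0-9]*) return 1 ;;  # empty or non-numeric: treat as unsupported
     esac
     [ "$major" -ge 22 ]
   }

   if command -v node >/dev/null 2>&1 && node_version_ok "$(node --version)"; then
     echo "node is new enough"
   else
     echo "node is missing or older than 22: stop and ask the user"
   fi
   ```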

2. **Run scripted setup** (canonical path):

   ```bash
   ktx setup --no-input --yes \
     --project-dir <path> \
     --llm-backend claude-code \
     --embedding-backend sentence-transformers \
     --database <driver> --database-connection-id <id> \
     --database-url '<raw-url | file:/abs/path>' \
     --database-schema <schema> \
     --skip-sources \
     --skip-agents
   ```

   - `--database-schema` is required for scope-bearing drivers (Postgres,
     MySQL, ClickHouse, SQL Server, BigQuery, Snowflake) in `--no-input` mode:
     setup fails fast without it unless the connection already has scope in
     `ktx.yaml`. SQLite needs no scope.
   - Configure one new database connection per setup invocation. For multiple
     connections, rerun setup once per connection.
   - Pasting a literal `--database-url` is safe: the CLI relocates the URL
     into `.ktx/secrets/<connection-id>-url` and rewrites `ktx.yaml` to a
     `file:` ref automatically.
   - `ktx setup` runs agent integration as its **last** step. In `--no-input`
     mode with neither `--target` nor `--skip-agents`, that step has no input,
     prints `Run in a TTY, or pass --target <target>.`, and the command exits
     non-zero **even though every database/LLM/embedding step succeeded**. Pass
     `--skip-agents` to defer agents to step 5 (as above), or `--target <agent>`
     to install them inline and exit 0. Judge data-layer success from
     `ktx status`, not from this exit code.
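
   The advice to judge success from `ktx status` can be sketched as a small
   wrapper. The `setup_then_status` name is hypothetical, and the sketch
   assumes that `ktx status` is run from the project directory and exits
   non-zero when the project is not ready, which this skill does not confirm.

   ```bash
   # $1 is the CLI to run (normally "ktx"); remaining arguments go to `setup`.
   setup_then_status() {
     ktx_cmd="$1"
     shift
     setup_rc=0
     # Record the setup exit code instead of aborting on it.
     "$ktx_cmd" setup --no-input --yes "$@" || setup_rc=$?
     echo "ktx setup exited with $setup_rc"

     # Assumption: a non-zero exit from `ktx status` means "not ready".
     if "$ktx_cmd" status; then
       echo "data layer looks ready"
     else
       echo "ktx status reported a problem: investigate before retrying"
       return 1
     fi
   }

   # Example call (not executed here):
   #   setup_then_status ktx --project-dir <path> --skip-sources --skip-agents
   ```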

3. **Resumability and `--skip-*`.** Re-running `ktx setup` against an existing
   project resumes its config. Use `--skip-llm`, `--skip-databases`,
   `--skip-sources`, or `--skip-embeddings` to leave a slice unconfigured but
   let the rest complete instead of aborting on the first failure. **When
   resuming an existing project to change one slice (e.g. only LLM), still
   pass the database flags from the previous run** — setup validates current
   flags, not persisted `ktx.yaml` state.

4. **Build context** if setup did not already complete a context build:

   ```bash
   ktx ingest <connection-id> --no-input
   ```

   `ktx ingest` always builds enriched context and requires a configured model
   and embeddings (set during

ktx-analytics (Skill)

Use when answering a question that needs data from a ktx-connected database - investigating, analyzing, "how many", "show me", "what's the breakdown of", finding records by value, exploring tables, comparing periods, explaining metrics, or any data-analysis request. Triggers even when the user does not say "analytics"; if the answer requires querying a configured ktx connection, this skill applies.

dbt_ingest (Skill)

Map dbt `schema.yml` / `properties.yml` models and sources into ktx semantic-layer overlays and column notes. Covers `sources:` vs `models:`, column `data_tests` (not_null, unique, accepted_values, relationships), and how bundle-time writes complement manifest backfill from git sync. Load when the WorkUnit's `skillNames` includes `dbt_ingest` or when raw files are dbt YAML under `models/` / `sources/`.

historic_sql_patterns (Skill)

Identify recurring cross-table historic-SQL analytical intents from a bounded pattern shard and emit typed pattern evidence for deterministic wiki projection.

historic_sql_table_digest (Skill)

Convert one changed historic-SQL table usage bucket into typed table usage evidence for deterministic _schema projection.

ingest_triage (Skill)

Classify and resolve conflicts detected during bundle ingest (structural duplicates, definitional contradictions, near-duplicate clusters, re-ingest changes, evictions).

live_database_ingest (Skill)

Capture semantic-layer and knowledge updates from a live database schema snapshot.

looker_ingest (Skill)

Extract durable ktx knowledge and semantic-layer contribution proposals from staged Looker runtime dashboard, Look, and explore JSON. Load for WorkUnits whose raw files are under explores/, dashboards/, or looks/.

lookml_ingest (Skill)

Map a LookML view/model/explore into ktx semantic layer sources. Covers the LookML to ktx primitive table, provenance tagging, and three worked examples (overlay, standalone from derived_table, standalone with sql_always_where). Load when the turn contains `.lkml` content.