rag-dev
The rag-dev skill provides tools for building knowledge bases and running semantic search over private documents using Butterbase RAG. Use it when ingesting text or files into collections, polling ingestion status, performing similarity searches across embedded document chunks, and optionally synthesizing LLM-generated answers from retrieved content. The skill handles asynchronous document processing with pgvector embeddings and supports configurable access modes for controlling who can query each collection.
git clone --depth 1 https://github.com/butterbase-ai/butterbase-skills /tmp/rag-dev && cp -r /tmp/rag-dev/skills/rag-dev ~/.claude/skills/rag-dev
# Butterbase RAG (Retrieval-Augmented Generation)
Two tools cover the entire RAG surface:
- **`manage_rag_content`** — collections, document ingestion, status polling, deletion
- **`rag_query`** — semantic search, optional LLM synthesis
Documents are ingested asynchronously: text or files become embeddings stored in pgvector, and queries do a similarity search at runtime.
---
## 1. The mental model
```
Collection                    Documents                 Chunks
──────────                    ──────────                ──────
"product-faq" ──────────────► doc_1 (PDF) ────────────► chunk 1, 2, 3...
                              doc_2 (text) ───────────► chunk 4, 5...
                              doc_3 (markdown) ───────► chunk 6...
```
A **collection** holds documents; a **document** is split into **chunks** and embedded; **`rag_query`** searches by cosine similarity across chunks within a collection.
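To make the retrieval step concrete, here is a minimal sketch of cosine-similarity ranking over chunk embeddings. It is an illustration only: Butterbase runs this search server-side in pgvector, and the function names (`cosineSimilarity`, `rankChunks`) and the toy 3-dimensional vectors are hypothetical.

```js
// Illustrative only: the real search runs inside pgvector on the server.
// Cosine similarity = dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query embedding and keep the best topK.
function rankChunks(queryEmbedding, chunks, topK = 5) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy embeddings; real embeddings have many more dimensions.
const ranked = rankChunks([1, 0, 0], [
  { text: "chunk 1", embedding: [0.9, 0.1, 0] },
  { text: "chunk 4", embedding: [0, 1, 0] },
  { text: "chunk 6", embedding: [0.5, 0.5, 0] },
], 2);
console.log(ranked.map((c) => c.text)); // [ 'chunk 1', 'chunk 6' ]
```

A score of 1 means the vectors point the same way and 0 means they are orthogonal.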
`chunk_size` and `chunk_overlap` are set **once at collection creation** and are immutable afterwards. To change them, delete and recreate the collection.
---
## 2. End-to-end workflow
```
┌────────────────────────────────────────────────┐
│ 1. create_collection (once per knowledge base) │
├────────────────────────────────────────────────┤
│ 2. ingest_document (text OR storage_object)    │
├────────────────────────────────────────────────┤
│ 3. poll get_document_status until "ready"      │
├────────────────────────────────────────────────┤
│ 4. rag_query (with or without synthesis)       │
└────────────────────────────────────────────────┘
```
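The four steps in the diagram can be chained as in the sketch below. It assumes the two tools are exposed to your code as async functions (`tools.manageRagContent`, `tools.ragQuery`) supplied by the caller. Those wrappers, the `sleep` helper, and the fixed 2-second interval are assumptions for illustration, not part of the Butterbase API.

```js
// Sketch: chain create -> ingest -> poll -> query.
// `tools.manageRagContent` and `tools.ragQuery` are hypothetical async wrappers
// around the manage_rag_content and rag_query tools.
async function runRagWorkflow(tools, { appId, collection, text, query, sleep, maxPolls = 30 }) {
  // Step 1: create the collection (defaults apply for chunking and access mode).
  await tools.manageRagContent({ app_id: appId, action: "create_collection", name: collection });

  // Step 2: ingest raw text; the call returns before embedding has finished.
  const { document_id } = await tools.manageRagContent({
    app_id: appId, action: "ingest_document", collection, text,
  });

  // Step 3: poll until the document is ready, or fail loudly.
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const { status, errorMessage } = await tools.manageRagContent({
      app_id: appId, action: "get_document_status", collection, document_id,
    });
    if (status === "ready") {
      // Step 4: query the collection.
      return tools.ragQuery({ app_id: appId, collection, query });
    }
    if (status === "failed") {
      throw new Error(`ingestion failed: ${errorMessage ?? "unknown error"}`);
    }
    await sleep(2000);
  }
  throw new Error("document was not ready after the maximum number of polls");
}
```

Passing `sleep` in as a parameter keeps the function easy to exercise with stubbed tools.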
### Step 1 — create the collection
```js
manage_rag_content({
app_id: "app_abc123",
action: "create_collection",
name: "product-faq",
description: "Customer-facing product knowledge",
chunk_size: 512, // optional, default 512 tokens
chunk_overlap: 50, // optional, default 50 tokens
access_mode: "shared" // optional: "private" | "shared" | "custom"
})
```
| `access_mode` | Who can query |
|----------------|---------------|
| `private` (default) | Only the app owner / service key |
| `shared` | Any authenticated end-user with a valid JWT |
| `custom` | Respects RLS policies — for fine-grained control |
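If you assemble `create_collection` payloads in code, a small helper can apply the documented defaults and reject bad values before the call is made. This is a sketch: `buildCreateCollection` is a hypothetical helper, and both the required-field check and the `chunk_overlap < chunk_size` check are assumptions about a sensible configuration rather than documented server rules.

```js
const ACCESS_MODES = ["private", "shared", "custom"];

// Hypothetical helper: fills in the documented defaults (512 / 50 / "private")
// and validates the arguments client-side.
function buildCreateCollection({
  app_id,
  name,
  description,
  chunk_size = 512,
  chunk_overlap = 50,
  access_mode = "private",
}) {
  if (!app_id || !name) throw new Error("app_id and name are required");
  if (!ACCESS_MODES.includes(access_mode)) {
    throw new Error(`unknown access_mode: ${access_mode}`);
  }
  if (!Number.isInteger(chunk_size) || chunk_size <= 0) {
    throw new Error("chunk_size must be a positive integer");
  }
  // Assumption: an overlap as large as the chunk itself would never advance.
  if (!Number.isInteger(chunk_overlap) || chunk_overlap < 0 || chunk_overlap >= chunk_size) {
    throw new Error("chunk_overlap must be an integer in [0, chunk_size)");
  }
  return { app_id, action: "create_collection", name, description, chunk_size, chunk_overlap, access_mode };
}
```

Because chunk settings cannot be changed later, validating them before the first call avoids having to delete and recreate the collection.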
### Step 2a — ingest raw text
```js
manage_rag_content({
app_id: "app_abc123",
action: "ingest_document",
collection: "product-faq",
text: "Our return policy is 30 days from purchase...",
filename: "return-policy.txt", // optional, for display
metadata: { category: "returns", tier: "all" } // filter later in rag_query
})
// → { document_id: "doc_xyz", status: "pending" }
```
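When several text snippets belong in the same collection, ingestion can be looped. The sketch below assumes `manageRagContent` is an async wrapper around the `manage_rag_content` tool supplied by the caller (a hypothetical name), and that each call resolves to the `{ document_id, status }` shape shown above.

```js
// Sketch: ingest several raw-text documents and collect their ids.
// `manageRagContent` is a hypothetical async wrapper around the tool.
async function ingestTexts(manageRagContent, appId, collection, docs) {
  const documentIds = [];
  for (const { text, filename, metadata } of docs) {
    // Sequential on purpose: simple ordering and no burst of parallel requests.
    const { document_id } = await manageRagContent({
      app_id: appId,
      action: "ingest_document",
      collection,
      text,
      filename,
      metadata,
    });
    documentIds.push(document_id);
  }
  // Each id still has to reach "ready" before its chunks are searchable.
  return documentIds;
}
```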
### Step 2b — ingest an uploaded file
Files must first be uploaded through `manage_storage`, so file ingestion is a two-step process:
```js
// 1. Upload the file via the storage skill to get an object_id
//    (uploadPdfViaStorage is a placeholder for your own upload helper)
const { object_id } = await uploadPdfViaStorage(...);
// 2. Hand that object_id to RAG ingestion
manage_rag_content({
app_id: "app_abc123",
action: "ingest_document",
collection: "product-faq",
storage_object_id: object_id,
filename: "manual.pdf",
metadata: { product: "v3" }
})
```
Supported file types: **PDF, TXT, Markdown, CSV, HTML, DOCX, XLSX, PPTX**.
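A quick client-side check can catch unsupported files before they are uploaded. This sketch assumes the supported types map to the common extensions listed in `SUPPORTED_EXTENSIONS`; the exact extensions or MIME types Butterbase accepts are not stated here, so treat the set and the `isSupportedFile` helper as hypothetical.

```js
// Assumed mapping from the supported types to common file extensions.
const SUPPORTED_EXTENSIONS = new Set(["pdf", "txt", "md", "csv", "html", "docx", "xlsx", "pptx"]);

// Returns true when the filename ends in one of the assumed extensions.
function isSupportedFile(filename) {
  const dot = filename.lastIndexOf(".");
  // No extension, a trailing dot, or a dotfile such as ".pdf" all count as unsupported.
  if (dot <= 0 || dot === filename.length - 1) return false;
  return SUPPORTED_EXTENSIONS.has(filename.slice(dot + 1).toLowerCase());
}

console.log(isSupportedFile("manual.pdf"));  // true
console.log(isSupportedFile("archive.zip")); // false
```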
### Step 3 — poll until ready
Ingestion is asynchronous: the `ingest_document` call returns immediately, and the document then moves through `pending → processing → ready` (or `failed`). Poll for the current status:
```js
manage_rag_content({
app_id: "app_abc123",
action: "get_document_status",
collection: "product-faq",
document_id: "doc_xyz"
})
// → { id, filename, status: "processing", processedAt, errorMessage? }
```
Recommended cadence: poll every 2–5 seconds for the first minute, back off after that. Bigger files (large PDFs, XLSX) take longer.
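The cadence above can be sketched as a polling loop with a fixed interval for the first minute and exponential backoff afterwards. `getStatus` is a hypothetical async function wrapping the `get_document_status` call; the doubling strategy, the 30-second cap, and the 10-minute timeout are assumptions, not documented values.

```js
// Sketch of the recommended cadence. `getStatus` is a hypothetical async
// function that resolves to { status, errorMessage? }.
async function waitUntilReady(getStatus, {
  initialDelayMs = 2000,   // within the 2-5 s range suggested above
  fastWindowMs = 60_000,   // "the first minute"
  maxDelayMs = 30_000,     // assumption: cap for the backoff
  timeoutMs = 10 * 60_000, // assumption: give up after 10 minutes
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  let elapsed = 0; // counts time spent sleeping, not request latency
  let delay = initialDelayMs;
  while (elapsed <= timeoutMs) {
    const doc = await getStatus();
    if (doc.status === "ready") return doc;
    if (doc.status === "failed") throw new Error(doc.errorMessage ?? "ingestion failed");
    await sleep(delay);
    elapsed += delay;
    // After the fast window, double the delay up to the cap.
    if (elapsed >= fastWindowMs) delay = Math.min(delay * 2, maxDelayMs);
  }
  throw new Error(`document not ready after ${timeoutMs} ms`);
}
```

A call might look like `await waitUntilReady(() => manageRagContent({ app_id, action: "get_document_status", collection, document_id }))`, where `manageRagContent` is again a caller-supplied wrapper.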
### Step 4 — query
Two modes: raw retrieval (just chunks back) or synthesized (LLM answer + sources).
#### Raw retrieval
```js
rag_query({
app_id: "app_abc123",
collection: "product-faq",
query: "How long do I have to return an item?",
top_k: 5, // default 5, max 20
threshold: 0.7, // optional similarity floor (0..1)
filter: { category: "returns" } // optional metadata filter
})
// → { chunks: [{ text, score, document_id, metadata }, ...] }
```
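The documented limits (`top_k` default 5 and max 20, `threshold` between 0 and 1) can be enforced before a request is sent. The helper below is a hypothetical sketch; clamping `top_k` instead of rejecting it is a design choice made here, not documented Butterbase behavior.

```js
// Hypothetical request builder that applies the documented limits client-side.
function buildRagQuery({ app_id, collection, query, top_k = 5, threshold, filter }) {
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }
  // Clamp top_k into 1..20; fall back to the default for non-numeric input.
  const k = Number.isFinite(top_k) ? Math.trunc(top_k) : 5;
  const request = { app_id, collection, query, top_k: Math.min(Math.max(k, 1), 20) };

  if (threshold !== undefined) {
    // The negated range check also rejects NaN.
    if (typeof threshold !== "number" || !(threshold >= 0 && threshold <= 1)) {
      throw new Error("threshold must be a number between 0 and 1");
    }
    request.threshold = threshold;
  }
  if (filter !== undefined) request.filter = filter;
  return request;
}
```

Optional fields are left out of the request entirely when they are not supplied, so the server-side defaults still apply.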
#### Synthesized answer
```js
rag_query({
app_id: "app_abc123",
collection: "product-faq",
query: "How long do I have to return an item?",
synthesize: true,
model: "anthropic/claude-haiku-4.5" // default
})
// → { answer, chunks, model }
```
`synthesize: true` runs the retrieved chunks through an LLM and returns a grounded answer. `chunks` is still included so you can show citations.
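One way to turn the returned `chunks` into citations is sketched below. It assumes the response shapes shown in the examples above (`answer`, and `chunks` entries with `document_id` and `score`); the `formatAnswerWithSources` helper and its output layout are hypothetical.

```js
// Sketch: append a deduplicated source list to a synthesized answer.
function formatAnswerWithSources({ answer, chunks = [] }) {
  // Keep only the best-scoring chunk per document so each source is cited once.
  const bestByDocument = new Map();
  for (const chunk of chunks) {
    const current = bestByDocument.get(chunk.document_id);
    if (!current || chunk.score > current.score) {
      bestByDocument.set(chunk.document_id, chunk);
    }
  }
  const sources = [...bestByDocument.values()]
    .sort((a, b) => b.score - a.score)
    .map((chunk, i) => `[${i + 1}] ${chunk.document_id} (score ${chunk.score.toFixed(2)})`);

  return sources.length > 0 ? `${answer}\n\nSources:\n${sources.join("\n")}` : answer;
}
```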
---
## 3. Listing and cleanup
```js
manage_rag_content({ app_id, action: "list_collections" })
manage_rag_content({ app_id, action: "get_collection", name: "product-faq" })
manage_rag_content({ app_id, action: "list_documents", collection: "product-faq" })
manage_rag_content({ app_id, action: "delete_document", collection: "product-faq", document_id: "doc_xyz" })
manage_rag_content({ app_id, action: "delete_collection", name: "product-faq" })
```
`get_collection` returns `{ name, description, accessMode, chunkSize, chunkOverlap, createdAt, documentCount: { pending, processing, ready, failed } }` — handy for a dashboard view.
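For a dashboard, the `documentCount` breakdown can be reduced to a few headline numbers. This sketch assumes the response shape listed above; `summarizeCollection` and its output fields are hypothetical.

```js
// Sketch: derive dashboard figures from a get_collection response.
function summarizeCollection({ name, documentCount = {} }) {
  const { pending = 0, processing = 0, ready = 0, failed = 0 } = documentCount;
  const total = pending + processing + ready + failed;
  return {
    name,
    total,
    inFlight: pending + processing, // still being ingested
    failed,
    readyPercent: total === 0 ? 0 : Math.round((ready / total) * 100),
  };
}

console.log(summarizeCollection({
  name: "product-faq",
  documentCount: { pending: 1, processing: 1, ready: 7, failed: 1 },
}));
// { name: 'product-faq', total: 10, inFlight: 2, failed: 1, readyPercent: 70 }
```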
> **Both `delete_document` and `delete_collection` are irreversible** and remove embeddings. To replace a document, delete then re-ingest.
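
The delete-then-re-ingest pattern can be wrapped in one function. As before, `manageRagContent` is a hypothetical async wrapper around the tool, and the sketch assumes the replacement content is raw text.

```js
// Sketch: replace a document by deleting it and ingesting the new version.
// `manageRagContent` is a hypothetical async wrapper around manage_rag_content.
async function replaceDocument(manageRagContent, {
  appId, collection, documentId, text, filename, metadata,
}) {
  // Irreversible: the old document's embeddings are removed by this call.
  await manageRagContent({
    app_id: appId,
    action: "delete_document",
    collection,
    document_id: documentId,
  });

  const { document_id } = await manageRagContent({
    app_id: appId,
    action: "ingest_document",
    collection,
    text,
    filename,
    metadata,
  });

  // Return the id from the new ingestion; poll its status before querying.
  return document_id;
}
```

Between the delete and the new document reaching `ready`, queries will not see either version, so schedule replacements with that gap in mind.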
---
## 4. Choosing chunk size and overlap
| Use case | Suggested `chunk_size` | `chunk_overlap` |
|----------|------------------------|------------------|
| Q&A over short FAQs / docs | 256–512 | 50 |
| Long-form documentation, manuals | 512–1024 | 100 |
| Code or structured content | 1024–2048 | 0–50 |
| Conversational logs / transcripts | 256 | 50 |
Larger chunks preserve more context but reduce retrieval granularity (you may pull in irrelevant nearby content). Overlap repeats the last few tokens of one chunk at the start of the next, so a sentence split at a chunk boundary still appears whole in at least one chunk. **You can't change these without recreating the collection**, so pick them deliberately the first time.
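To see how the two settings interact, here is an illustrative sliding-window chunker. It uses whitespace-separated words as stand-ins for tokens; Butterbase's actual tokenizer and splitting rules are not described in this document, so `chunkTokens` is an assumption-laden sketch and not the service's implementation.

```js
// Illustrative sliding-window chunker: each window holds `chunkSize` tokens and
// starts `chunkSize - chunkOverlap` tokens after the previous one.
function chunkTokens(tokens, chunkSize, chunkOverlap) {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }
  const step = chunkSize - chunkOverlap;
  const chunks = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    if (start + chunkSize >= tokens.length) break; // this window reached the end
  }
  return chunks;
}

const words = "returns are accepted within 30 days of the original purchase date".split(" ");
console.log(chunkTokens(words, 4, 1).map((chunk) => chunk.join(" ")));
// [
//   'returns are accepted within',
//   'within 30 days of',
//   'of the original purchase',
//   'purchase date'
// ]
```

Each chunk repeats the last word of the previous one, which is the overlap at work. With `chunk_overlap` set to 0 the windows simply tile the text.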
---
## 5. Metadata-driven filtering
Anything you attach as `metadata` when ingesting a document can be used later as a `filter` in `rag_query`, as in the `category: "returns"` example from Steps 2a and 4.