Skip to main content
ClaudeWave
Install in Claude Code
Copy
git clone --depth 1 https://github.com/AltimateAI/data-engineering-skills /tmp/documenting-dbt-models && cp -r /tmp/documenting-dbt-models/skills/dbt/documenting-dbt-models ~/.claude/skills/documenting-dbt-models
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# dbt Documentation

**Document the WHY, not just the WHAT. Include grain, business rules, and caveats.**

## Workflow

### 1. Study Existing Documentation Patterns

**CRITICAL: Match the project's documentation style before adding new docs.**

```bash
# Find all schema.yml files with documentation
find . -name "schema.yml" | head -5

# Read well-documented models to learn patterns
cat models/marts/schema.yml | head -150
cat models/staging/schema.yml | head -150
```

**Extract from existing documentation:**
- Description length (brief vs detailed)
- Formatting style (plain text vs markdown with headers)
- Information included (grain? business rules? caveats?)
- Column description depth (all columns vs key columns)
- Use of meta tags or custom properties

### 2. Read Model SQL

```bash
cat models/<path>/<model_name>.sql
```

Understand: transformations, business logic, joins, filters.

### 3. Check Existing Documentation for This Model

```bash
# Find existing schema.yml
find . -name "schema.yml" -exec grep -l "<model_name>" {} \;

# Read existing docs
cat models/<path>/schema.yml | grep -A 100 "<model_name>"
```

### 4. Identify Documentation Needs

For each model, document:
- **Model description**: Purpose, grain, key business rules
- **Column descriptions**: Business meaning, not just data type

For each column, consider:
- What business concept does this represent?
- Are there any caveats or special values?
- What is the source of this data?

### 5. Write Documentation

**Match the style discovered in step 1. Example format (adapt to project):**

```yaml
version: 2

models:
  - name: orders
    description: |
      Order transactions at the order line item grain.
      Each row represents one product in one order.

      **Business Rules:**
      - Revenue recognized on ship_date, not order_date
      - Cancelled orders excluded (status != 'cancelled')
      - Returns processed as negative line items

      **Grain:** One row per order_id + product_id combination

    columns:
      - name: order_id
        description: |
          Unique identifier for the order.
          Source: orders.id from Stripe webhook

      - name: customer_id
        description: |
          Foreign key to customers table.
          NULL for guest checkouts (pre-2023 only)

      - name: revenue
        description: |
          Net revenue for this line item in USD.
          Calculation: unit_price * quantity - discount_amount
          Excludes tax and shipping

      - name: order_status
        description: |
          Current status of the order.
          Values: pending, processing, shipped, delivered, cancelled, returned
```

### 6. Generate Docs

```bash
dbt docs generate
dbt docs serve  # Optional: preview locally
```

## Documentation Patterns

**Note: These are default templates. Always adapt to match project's existing style.**

### Model Description Template

```yaml
description: |
  [One sentence: what this model contains]

  **Grain:** [What does one row represent?]

  **Business Rules:**
  - [Key rule 1]
  - [Key rule 2]

  **Caveats:**
  - [Important limitation or edge case]
```

### Column Description Patterns

| Column Type | Documentation Focus |
|-------------|---------------------|
| Primary key | Source system, uniqueness guarantee |
| Foreign key | What it joins to, NULL handling |
| Metric | Calculation formula, units, exclusions |
| Date | Timezone, what event it represents |
| Status/Category | All possible values, business meaning |
| Boolean/Flag | What true/false means in business terms |

### Documenting Calculated Fields

```yaml
- name: gross_margin
  description: |
    Gross margin percentage.
    Calculation: (revenue - cogs) / revenue * 100
    NULL when revenue = 0 to avoid division by zero
```

## Anti-Patterns

- Adding documentation without checking existing project patterns
- Using different formatting style than existing documentation
- Describing WHAT (e.g., "The order ID") instead of WHY/context
- Missing grain documentation
- Not documenting NULL handling
- Leaving columns undocumented
- Copy-pasting column names as descriptions