Skill102 repo starsupdated today
documenting-dbt-models
|
Install in Claude Code
Copygit clone --depth 1 https://github.com/AltimateAI/data-engineering-skills /tmp/documenting-dbt-models && cp -r /tmp/documenting-dbt-models/skills/dbt/documenting-dbt-models ~/.claude/skills/documenting-dbt-modelsThen start a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# dbt Documentation
**Document the WHY, not just the WHAT. Include grain, business rules, and caveats.**
## Workflow
### 1. Study Existing Documentation Patterns
**CRITICAL: Match the project's documentation style before adding new docs.**
```bash
# Find all schema.yml files with documentation
find . -name "schema.yml" | head -5
# Read well-documented models to learn patterns
cat models/marts/schema.yml | head -150
cat models/staging/schema.yml | head -150
```
**Extract from existing documentation:**
- Description length (brief vs detailed)
- Formatting style (plain text vs markdown with headers)
- Information included (grain? business rules? caveats?)
- Column description depth (all columns vs key columns)
- Use of meta tags or custom properties
### 2. Read Model SQL
```bash
cat models/<path>/<model_name>.sql
```
Understand: transformations, business logic, joins, filters.
### 3. Check Existing Documentation for This Model
```bash
# Find existing schema.yml
find . -name "schema.yml" -exec grep -l "<model_name>" {} \;
# Read existing docs
cat models/<path>/schema.yml | grep -A 100 "<model_name>"
```
### 4. Identify Documentation Needs
For each model, document:
- **Model description**: Purpose, grain, key business rules
- **Column descriptions**: Business meaning, not just data type
For each column, consider:
- What business concept does this represent?
- Are there any caveats or special values?
- What is the source of this data?
### 5. Write Documentation
**Match the style discovered in step 1. Example format (adapt to project):**
```yaml
version: 2
models:
- name: orders
description: |
Order transactions at the order line item grain.
Each row represents one product in one order.
**Business Rules:**
- Revenue recognized on ship_date, not order_date
- Cancelled orders excluded (status != 'cancelled')
- Returns processed as negative line items
**Grain:** One row per order_id + product_id combination
columns:
- name: order_id
description: |
Unique identifier for the order.
Source: orders.id from Stripe webhook
- name: customer_id
description: |
Foreign key to customers table.
NULL for guest checkouts (pre-2023 only)
- name: revenue
description: |
Net revenue for this line item in USD.
Calculation: unit_price * quantity - discount_amount
Excludes tax and shipping
- name: order_status
description: |
Current status of the order.
Values: pending, processing, shipped, delivered, cancelled, returned
```
### 6. Generate Docs
```bash
dbt docs generate
dbt docs serve # Optional: preview locally
```
## Documentation Patterns
**Note: These are default templates. Always adapt to match project's existing style.**
### Model Description Template
```yaml
description: |
[One sentence: what this model contains]
**Grain:** [What does one row represent?]
**Business Rules:**
- [Key rule 1]
- [Key rule 2]
**Caveats:**
- [Important limitation or edge case]
```
### Column Description Patterns
| Column Type | Documentation Focus |
|-------------|---------------------|
| Primary key | Source system, uniqueness guarantee |
| Foreign key | What it joins to, NULL handling |
| Metric | Calculation formula, units, exclusions |
| Date | Timezone, what event it represents |
| Status/Category | All possible values, business meaning |
| Boolean/Flag | What true/false means in business terms |
### Documenting Calculated Fields
```yaml
- name: gross_margin
description: |
Gross margin percentage.
Calculation: (revenue - cogs) / revenue * 100
NULL when revenue = 0 to avoid division by zero
```
## Anti-Patterns
- Adding documentation without checking existing project patterns
- Using different formatting style than existing documentation
- Describing WHAT (e.g., "The order ID") instead of WHY/context
- Missing grain documentation
- Not documenting NULL handling
- Leaving columns undocumented
- Copy-pasting column names as descriptionsMore from this repository
New Skill ProposalSkill
altimate-codeSkill
Delegates data engineering tasks to altimate-code, a specialized CLI agent with 100+ purpose-built data tools — SQL analysis, column-level lineage, dbt build/test/run, warehouse profiling, FinOps, and connectivity to Snowflake, BigQuery, Redshift, Databricks, Postgres, MySQL, DuckDB. Use this skill when the task needs live warehouse access, column lineage, multi-step data exploration, dbt builds against a real warehouse, or when the user explicitly invokes "altimate", "altimate-code", or "the data agent".
creating-dbt-modelsSkill
|
debugging-dbt-errorsSkill
|
developing-incremental-modelsSkill
|
migrating-sql-to-dbtSkill
|
refactoring-dbt-modelsSkill
|
testing-dbt-modelsSkill
|