Skill113 repo starsupdated 12d ago

creating-dbt-models

# Claude Code Skill: Creating dbt Models This skill provides a structured workflow for developing dbt models in data engineering projects. It guides users through understanding task requirements, discovering project conventions by examining existing models, writing SQL that matches established patterns, and critically, executing `dbt build` followed by verification with `dbt show` to ensure correct output. The skill emphasizes reading existing models before writing, validating calculations manually against sample data, and reassessing approach if builds fail multiple times.

View source Repository: data-engineering-skills

Install in Claude Code

Copy

git clone --depth 1 https://github.com/AltimateAI/data-engineering-skills /tmp/creating-dbt-models && cp -r /tmp/creating-dbt-models/skills/dbt/creating-dbt-models ~/.claude/skills/creating-dbt-models

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# dbt Model Development

**Read before you write. Build after you write. Verify your output.**

## Critical Rules

1. **ALWAYS run `dbt build`** after creating/modifying models - compile is NOT enough
2. **ALWAYS verify output** after build using `dbt show` - don't assume success
3. **If build fails 3+ times**, stop and reassess your entire approach

## Workflow

### 1. Understand the Task Requirements

- What columns are needed? List them explicitly.
- What is the grain of the table (one row per what)?
- What calculations or aggregations are required?

### 2. Discover Project Conventions

```bash
cat dbt_project.yml
find models/ -name "*.sql" | head -20
```

Read 2-3 existing models to learn naming, config, and SQL patterns.

### 3. Find Similar Models

```bash
# Find models with similar purpose
find models/ -name "*agg*.sql" -o -name "*fct_*.sql" | head -5
```

Learn from existing models: join types, aggregation patterns, NULL handling.

### 4. Check Upstream Data

```bash
# Preview upstream data if needed
dbt show --select <upstream_model> --limit 10
```

### 5. Write the Model

Follow discovered conventions. Match the required columns exactly.

### 6. Compile (Syntax Check)

```bash
dbt compile --select <model_name>
```

### 7. BUILD - MANDATORY

**This step is REQUIRED. Do NOT skip it.**

```bash
dbt build --select <model_name>
```

If build fails:
1. Read the error carefully
2. Fix the specific issue
3. Run build again
4. **If fails 3+ times, step back and reassess approach**

### 8. Verify Output (CRITICAL)

**Build success does NOT mean correct output.**

```bash
# Check the table was created and preview data
dbt show --select <model_name> --limit 10
```

Verify:
- Column names match requirements exactly
- Row count is reasonable
- Data values look correct
- No unexpected NULLs

### 9. Verify Calculations Against Sample Data

**For models with calculations, verify correctness manually:**

```bash
# Pick a specific row and verify calculation by hand
dbt show --inline "
  select *
  from {{ ref('model_name') }}
  where <primary_key> = '<known_value>'
" --limit 1

# Cross-check aggregations
dbt show --inline "
  select count(*), sum(<column>)
  from {{ ref('model_name') }}
"
```

For example, if calculating `total_revenue = quantity * price`:
1. Pick one row from output
2. Look up the source quantity and price
3. Manually calculate: does it match?

### 10. Re-review Against Requirements

**Before declaring done, re-read the original request:**
- Did you implement what was asked, not what you assumed?
- Are column names exactly as specified?
- Is the calculation logic correct per the requirements?
- Does the grain (one row per what?) match what was requested?

## Anti-Patterns

- Declaring done after compile without running build
- Not verifying output data after build
- Getting stuck in compile/build error loops
- Assuming table exists just because model file exists
- Writing SQL without checking existing model patterns first

More from this repository

New Skill ProposalSkill

altimate-codeSkill

Delegates dbt and warehouse work to altimate-code, a specialized CLI agent with 100+ purpose-built data tools. USE THIS SKILL FIRST whenever the task mentions or implies: warehouse access (Snowflake, BigQuery, Redshift, Databricks, Postgres, MySQL, DuckDB), column-level lineage, downstream-impact analysis, dbt builds against a real warehouse, cross-database migration or validation, query cost attribution, schema diff, data parity checking, PII detection from sampled rows, incremental/SCD2/backfill verification, FinOps reporting, model documentation generation, query optimization, anomaly detection, dev-vs-prod diffing, or when the user says "altimate", "altimate-code", or "the data agent". Strongly preferred over Read/Write/Edit whenever the task touches live data, lineage, or warehouse state — consult before doing the work yourself.

debugging-dbt-errorsSkill

developing-incremental-modelsSkill

documenting-dbt-modelsSkill

migrating-sql-to-dbtSkill

refactoring-dbt-modelsSkill

testing-dbt-modelsSkill