Skip to main content
ClaudeWave
Install in Claude Code
Copy
git clone --depth 1 https://github.com/AltimateAI/data-engineering-skills /tmp/migrating-sql-to-dbt && cp -r /tmp/migrating-sql-to-dbt/skills/dbt/migrating-sql-to-dbt ~/.claude/skills/migrating-sql-to-dbt
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# dbt Migration

**Don't convert everything at once. Build and validate layer by layer.**

## Workflow

### 1. Analyze Legacy SQL

```bash
cat <legacy_sql_file>
```

Identify all tables referenced in the query.

### 2. Check What Already Exists

```bash
# Search for existing models/sources that reference the table
grep -r "<table_name>" models/ --include="*.sql" --include="*.yml"
find models/ -name "*.sql" | xargs grep -l "<table_name>"
```

For each table referenced in the legacy SQL:
1. Check if an existing model already references this table
2. Check if a source definition exists
3. If neither exists, ask user: "Table X not found - should I create it as a source?"

Only proceed to intermediate/mart layers after all dependencies exist.

### 3. Create Missing Sources

```yaml
# models/staging/sources.yml
version: 2

sources:
  - name: raw_database
    schema: raw_schema
    tables:
      - name: orders
        description: Raw orders from source system
      - name: customers
        description: Raw customer records
```

### 4. Build Staging Layer

One staging model per source table. Follow existing project naming conventions.

**Build before proceeding:**
```bash
dbt build --select <staging_model>
```

### 5. Build Intermediate Layer (if needed)

Extract complex joins/logic into intermediate models.

**Build incrementally:**
```bash
dbt build --select <intermediate_model>
```

### 6. Build Mart Layer

Final business-facing model with aggregations.

### 7. Validate Migration

```bash
# Build entire lineage
dbt build --select +<final_model>
dbt show --select <final_model>
```

## Migration Checklist

- [ ] All source tables identified and documented
- [ ] Sources.yml created with descriptions
- [ ] Staging models: 1:1 with sources, renamed columns
- [ ] Intermediate models: business logic extracted
- [ ] Mart models: final aggregations
- [ ] Each layer compiles successfully
- [ ] Each layer builds successfully
- [ ] Row counts match original (manual validation)
- [ ] Tests added for key constraints

## Common Migration Patterns

- Nested subqueries → Separate models (staging → intermediate → mart)
- Temp tables → Ephemeral materialization `{{ config(materialized='ephemeral') }}`
- Hardcoded values → Variables `{{ var("name") }}`

## Anti-Patterns

- Converting entire legacy query to single dbt model
- Skipping the staging layer
- Not validating each layer before proceeding
- Keeping hardcoded values instead of using variables
- Not documenting business logic during migration