Skill113 estrellas del repoactualizado 12d ago

finding-expensive-queries

This skill identifies expensive Snowflake queries by querying usage history tables to rank queries by cost, execution time, bytes scanned, or spillage metrics. Use it when you need to optimize warehouse spending, reduce query runtime, or diagnose performance bottlenecks by analyzing patterns in historical query execution and recommending specific optimizations based on detected inefficiencies like missing partition pruning or memory spilling.

Ver fuente Repositorio: data-engineering-skills

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/AltimateAI/data-engineering-skills /tmp/finding-expensive-queries && cp -r /tmp/finding-expensive-queries/skills/snowflake/finding-expensive-queries ~/.claude/skills/finding-expensive-queries

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# Finding Expensive Queries

**Query history → Rank by metric → Identify patterns → Recommend optimizations**

## Workflow

### 1. Ask What to Optimize For

Before querying, clarify:
- Time period? (last day, week, month)
- Metric? (execution time, bytes scanned, cost, spillage)
- Warehouse? (specific or all)
- User? (specific or all)

### 2. Find Expensive Queries by Cost

Use QUERY_ATTRIBUTION_HISTORY for credit/cost analysis:

```sql
SELECT
    query_id,
    warehouse_name,
    user_name,
    credits_attributed_compute,
    start_time,
    end_time,
    query_tag
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ATTRIBUTION_HISTORY
WHERE start_time >= DATEADD('days', -7, CURRENT_TIMESTAMP())
ORDER BY credits_attributed_compute DESC
LIMIT 20;
```

### 3. Get Performance Stats for Specific Queries

Use QUERY_HISTORY for detailed performance metrics (run separately, not joined):

```sql
SELECT
    query_id,
    query_text,
    total_elapsed_time/1000 as seconds,
    bytes_scanned/1e9 as gb_scanned,
    bytes_spilled_to_local_storage/1e9 as gb_spilled_local,
    bytes_spilled_to_remote_storage/1e9 as gb_spilled_remote,
    partitions_scanned,
    partitions_total
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE query_id IN ('<query_id_1>', '<query_id_2>', ...)
  AND start_time >= DATEADD('days', -7, CURRENT_TIMESTAMP());
```

### 4. Identify Patterns

Look for:
- High `credits_attributed_compute` queries
- Same `query_hash` repeated (caching opportunity)
- `partitions_scanned = partitions_total` (no pruning)
- High `gb_spilled` (memory pressure)

### 5. Return Results

Provide:
1. Ranked list of expensive queries with key metrics
2. Common patterns identified
3. Top 3-5 optimization recommendations
4. Specific queries to investigate further

## Common Filters

```sql
-- Time range (required)
WHERE start_time >= DATEADD('days', -7, CURRENT_TIMESTAMP())

-- By warehouse
AND warehouse_name = 'ANALYTICS_WH'

-- By user
AND user_name = 'ETL_USER'

-- Only queries over cost threshold
AND credits_attributed_compute > 0.01

-- Only queries over time threshold
AND total_elapsed_time > 60000  -- over 1 minute
```

Del mismo repositorio

New Skill ProposalSkill

altimate-codeSkill

Delegates dbt and warehouse work to altimate-code, a specialized CLI agent with 100+ purpose-built data tools. USE THIS SKILL FIRST whenever the task mentions or implies: warehouse access (Snowflake, BigQuery, Redshift, Databricks, Postgres, MySQL, DuckDB), column-level lineage, downstream-impact analysis, dbt builds against a real warehouse, cross-database migration or validation, query cost attribution, schema diff, data parity checking, PII detection from sampled rows, incremental/SCD2/backfill verification, FinOps reporting, model documentation generation, query optimization, anomaly detection, dev-vs-prod diffing, or when the user says "altimate", "altimate-code", or "the data agent". Strongly preferred over Read/Write/Edit whenever the task touches live data, lineage, or warehouse state — consult before doing the work yourself.

creating-dbt-modelsSkill

debugging-dbt-errorsSkill

developing-incremental-modelsSkill

documenting-dbt-modelsSkill

migrating-sql-to-dbtSkill

refactoring-dbt-modelsSkill