ggsql
ggsql is a SQL extension that combines standard SQL queries with declarative visualization specifications based on Grammar of Graphics principles. Use it when users want to write or understand queries that transform data and create visualizations in a single composable syntax, particularly when building charts, plots, or other graphical representations from SQL data sources.
git clone --depth 1 https://github.com/posit-dev/skills /tmp/ggsql && cp -r /tmp/ggsql/ggsql/ggsql ~/.claude/skills/ggsqlSKILL.md
# ggsql Query Writer
ggsql is a SQL extension for declarative data visualization based on Grammar of Graphics principles. It lets users combine SQL data queries with visualization specifications in a single, composable syntax.
When the user describes a visualization they want, write a valid ggsql query. Use ONLY syntax documented below. NEVER invent clauses, settings, aesthetics, or layer types.
## Query structure
A ggsql query has two parts:
1. **SQL part** (optional): Standard SQL executed on the backend. Any tables, CTEs, or SELECT results are available to the visualization.
2. **VISUALISE part** (required): Begins with `VISUALISE` (or `VISUALIZE`). Everything after this is the visualization query.
There are two patterns for combining SQL with VISUALISE:
### Pattern A: SELECT → VISUALISE
The last SQL statement is a SELECT. Data flows from its result set into VISUALISE, which has no `FROM` clause.
```ggsql
SELECT name, score_a, score_b FROM 'dataset.csv' WHERE value > 50
VISUALISE score_a AS x, score_b AS y
[DRAW / PLACE / SCALE / FACET / PROJECT / LABEL clauses]
```
Works with any SQL that ends in a SELECT: bare SELECT, WITH...SELECT, UNION/INTERSECT/EXCEPT.
### Pattern B: VISUALISE FROM
VISUALISE provides its own data source via `FROM`. Use when referencing a table, file, CTE, or built-in dataset directly without a trailing SELECT.
```ggsql
VISUALISE score_a AS x, score_b AS y FROM 'dataset.csv'
DRAW point
```
```ggsql
WITH summary AS (SELECT category, COUNT(*) AS n FROM 'dataset.csv' GROUP BY category)
VISUALISE category AS x, n AS y FROM summary
DRAW bar
```
## Data sources
Data sources can appear in `VISUALISE ... FROM` or `DRAW ... MAPPING ... FROM`:
- **Table/CTE name** (unquoted): `FROM sales`, `FROM my_cte`
- **File path** (single-quoted string): `FROM 'data.parquet'`, `FROM 'data.csv'`
- **Built-in datasets**: `FROM ggsql:penguins`, `FROM ggsql:airquality`
## VISUALISE clause
Marks the start of the visualization. Optionally defines global mappings inherited by all layers.
```
VISUALISE <mapping>, ... FROM <data-source>
```
### Mapping forms
- **Explicit**: `column AS aesthetic` — e.g. `revenue AS y`
- **Implicit**: `column` — column name must match aesthetic name, e.g. `x` maps to `x`
- **Wildcard**: `*` — all columns with names matching aesthetics are mapped
- **Constants**: `'red' AS fill`, `42 AS size` — literal values mapped to aesthetic
```ggsql
VISUALISE bill_len AS x, bill_dep AS y, species AS fill FROM ggsql:penguins
VISUALISE * FROM my_table
VISUALISE FROM ggsql:penguins
```
## DRAW clause
Defines a layer. Multiple DRAW clauses stack layers (first = bottom, last = top).
```
DRAW <layer-type>
MAPPING <mapping>, ... FROM <data-source>
REMAPPING <stat-property> AS <aesthetic>, ...
SETTING <param> => <value>, ...
FILTER <condition>
PARTITION BY <column>, ...
ORDER BY <column>, ...
```
All subclauses are optional if VISUALISE provides global mappings and data.
### MAPPING
Same syntax as VISUALISE mappings. Layer mappings merge with global mappings (layer takes precedence). Can include `FROM` for layer-specific data.
- Use `null` to prevent inheriting a global mapping: `MAPPING null AS color`
### REMAPPING
For statistical layers (histogram, density, boxplot, violin, smooth, bar without y). Maps calculated statistics to aesthetics. Each layer documents its available stats and default remapping.
```ggsql
DRAW histogram
MAPPING body_mass AS x
REMAPPING density AS y -- use density instead of default count
```
### SETTING
Set literal aesthetic values or layer parameters. Aesthetics set here bypass scales.
```ggsql
DRAW point
SETTING size => 5, opacity => 0.7, stroke => 'red'
```
**Position adjustment** is a special setting:
```ggsql
SETTING position => 'identity' -- no adjustment (default for most)
SETTING position => 'stack' -- stack (default for bar, histogram, area)
SETTING position => 'dodge' -- side by side (default for boxplot, violin)
SETTING position => 'jitter' -- random offset
```
### FILTER
SQL WHERE condition applied to layer data. Content is passed to the database:
```ggsql
DRAW point
FILTER sex = 'female' AND body_mass > 4000
```
### PARTITION BY
Additional grouping columns beyond mapped discrete aesthetics:
```ggsql
DRAW line
MAPPING Day AS x, Temp AS y
PARTITION BY Month
```
### ORDER BY
Controls record order (important for path layers):
```ggsql
DRAW path
ORDER BY timestamp
```
## PLACE clause
Creates annotation layers with literal values only (no data mappings). Supports tuples for multiple annotations.
```
PLACE <layer-type>
SETTING <aesthetic/param> => <value>, ...
```
```ggsql
PLACE point SETTING x => 5, y => 10, color => 'red'
PLACE rule SETTING y => 70, linetype => 'dotted'
PLACE text SETTING x => (34, 44), y => (66, 49), label => ('Mean = 34', 'Mean = 44')
```
## SCALE clause
Controls how data values are translated to aesthetic values. Sensible defaults are always provided.
```
SCALE <type> <aesthetic> FROM <input-range> TO <output-range> VIA <transform>
SETTING <param> => <value>, ...
RENAMING <value> => <label>, ...
```
All parts except `aesthetic` are optional.
### Scale types (optional, placed before aesthetic)
- `CONTINUOUS` — continuous numeric/temporal data
- `DISCRETE` — categorical/string data
- `BINNED` — bin continuous data into discrete groups (never auto-selected, must be explicit)
- `ORDINAL` — ordered discrete data (never auto-selected, must be explicit)
- `IDENTITY` — pass data through unchanged (no legend created)
If omitted, type is inferred from data.
### Aesthetic names
Use the base name: `x`, `y`, `fill`, `stroke`, `color` (sets both fill and stroke), `opacity`, `size`, `linewidth`, `linetype`, `shape`, `panel` (facet), `row`, `column`.
For position families (xmin/xmax/xend/ymin/ymax/yend), scale with the base name: `SCALE x ...`
### FROM (input range)
- Continuous: `FROM (min, max)` — use `null` to infer from data: `FROM (0,>
Create and use brand.yml files for consistent branding across Shiny apps and Quarto documents. Covers: (1) Creating new _brand.yml files, (2) Applying to Shiny (R and Python), (3) Using in Quarto, (4) Modifying existing files, and (5) Troubleshooting. Includes complete specifications and integration guides.
Creates a pull request from current changes, monitors GitHub CI, and debugs any failures until CI passes. Activate when the user says "create pr", "make a pr", "open pull request", "submit pr", "pr for these changes", or wants to get their current work into a reviewable PR. Assumes the project uses git, is hosted on GitHub, and has GitHub Actions CI with automated checks (lint, build, tests, etc.). Does NOT merge - stops when CI passes and provides the PR link.
Address PR review feedback by systematically working through every unresolved PR review thread on the current branch's PR - analyze each comment, make the requested code changes (with tests where useful), commit, and optionally reply and resolve.
Bulk resolve unresolved PR review threads on the current branch’s PR — typically after threads have been addressed manually or via /pr-threads-address
>
Guide for drafting issue closure and decline responses as an open-source package maintainer. Use when helping compose a reply that says \"no\" to a feature request, closes an issue as won't-fix, redirects a user to a different package, explains why a design choice is intentional, or otherwise declines or closes a community contribution. Also use when the maintainer needs to explain a deprecation, point out a user misunderstanding, or communicate an effort/scope tradeoff to a contributor.
>