testing-r-packages
This Claude Code skill provides comprehensive guidance on modern R package testing practices using testthat version 3 and later. Use this skill when writing, organizing, or improving test suites for R packages, including establishing test file structure, implementing various expectation types, creating fixtures and helper functions, applying snapshot testing for complex outputs, using mocking to isolate functionality, and leveraging behavior-driven development patterns with describe and it blocks for organizing test hierarchies.
git clone --depth 1 https://github.com/posit-dev/skills /tmp/testing-r-packages && cp -r /tmp/testing-r-packages/r-lib/testing-r-packages ~/.claude/skills/testing-r-packagesSKILL.md
# Testing R Packages with testthat
Modern best practices for R package testing using testthat 3+.
## Initial Setup
Initialize testing with testthat 3rd edition:
```r
usethis::use_testthat(3)
```
This creates `tests/testthat/` directory, adds testthat to `DESCRIPTION` Suggests with `Config/testthat/edition: 3`, and creates `tests/testthat.R`.
## File Organization
**Mirror package structure:**
- Code in `R/foofy.R` → tests in `tests/testthat/test-foofy.R`
- Use `usethis::use_r("foofy")` and `usethis::use_test("foofy")` to create paired files
**Special files:**
- `helper-*.R` - Helper functions and custom expectations, sourced before tests
- `setup-*.R` - Run during `R CMD check` only, not during `load_all()`
- `fixtures/` - Static test data files accessed via `test_path()`
## Test Structure
Tests follow a three-level hierarchy: **File → Test → Expectation**
### Standard Syntax
```r
test_that("descriptive behavior", {
result <- my_function(input)
expect_equal(result, expected_value)
})
```
**Test descriptions** should read naturally and describe behavior, not implementation.
### BDD Syntax (describe/it)
For behavior-driven development, use `describe()` and `it()`:
```r
describe("matrix()", {
it("can be multiplied by a scalar", {
m1 <- matrix(1:4, 2, 2)
m2 <- m1 * 2
expect_equal(matrix(1:4 * 2, 2, 2), m2)
})
it("can be transposed", {
m <- matrix(1:4, 2, 2)
expect_equal(t(m), matrix(c(1, 3, 2, 4), 2, 2))
})
})
```
**Key features:**
- `describe()` groups related specifications for a component
- `it()` defines individual specifications (like `test_that()`)
- Supports nesting for hierarchical organization
- `it()` without code creates pending test placeholders
**Use `describe()` to verify you implement the right things, use `test_that()` to ensure you do things right.**
See [references/bdd.md](references/bdd.md) for comprehensive BDD patterns, nested specifications, and test-first workflows.
## Running Tests
Three scales of testing:
**Micro** (interactive development):
```r
devtools::load_all()
expect_equal(foofy(...), expected)
```
**Mezzo** (single file):
```r
testthat::test_file("tests/testthat/test-foofy.R")
# RStudio: Ctrl/Cmd + Shift + T
```
**Macro** (full suite):
```r
devtools::test() # Ctrl/Cmd + Shift + T
devtools::check() # Ctrl/Cmd + Shift + E
```
## Core Expectations
### Equality
```r
expect_equal(10, 10 + 1e-7) # Allows numeric tolerance
expect_identical(10L, 10L) # Exact match required
expect_all_equal(x, expected) # Every element matches (v3.3.0+)
```
### Errors, Warnings, Messages
```r
expect_error(1 / "a")
expect_error(bad_call(), class = "specific_error_class")
expect_no_error(valid_call())
expect_warning(deprecated_func())
expect_no_warning(safe_func())
expect_message(informative_func())
expect_no_message(quiet_func())
```
### Pattern Matching
```r
expect_match("Testing is fun!", "Testing")
expect_match(text, "pattern", ignore.case = TRUE)
```
### Structure and Type
```r
expect_length(vector, 10)
expect_type(obj, "list")
expect_s3_class(model, "lm")
expect_s4_class(obj, "MyS4Class")
expect_r6_class(obj, "MyR6Class") # v3.3.0+
expect_shape(matrix, c(10, 5)) # v3.3.0+
```
### Sets and Collections
```r
expect_setequal(x, y) # Same elements, any order
expect_contains(fruits, "apple") # Subset check (v3.2.0+)
expect_in("apple", fruits) # Element in set (v3.2.0+)
expect_disjoint(set1, set2) # No overlap (v3.3.0+)
```
### Logical
```r
expect_true(condition)
expect_false(condition)
expect_all_true(vector > 0) # All elements TRUE (v3.3.0+)
expect_all_false(vector < 0) # All elements FALSE (v3.3.0+)
```
## Design Principles
### 1. Self-Sufficient Tests
Each test should contain all setup, execution, and teardown code:
```r
# Good: self-contained
test_that("foofy() works", {
data <- data.frame(x = 1:3, y = letters[1:3])
result <- foofy(data)
expect_equal(result$x, 1:3)
})
# Bad: relies on ambient state
dat <- data.frame(x = 1:3, y = letters[1:3])
test_that("foofy() works", {
result <- foofy(dat) # Where did 'dat' come from?
expect_equal(result$x, 1:3)
})
```
### 2. Self-Contained Tests (Cleanup Side Effects)
Use `withr` to manage state changes:
```r
test_that("function respects options", {
withr::local_options(my_option = "test_value")
withr::local_envvar(MY_VAR = "test")
withr::local_package("jsonlite")
result <- my_function()
expect_equal(result$setting, "test_value")
# Automatic cleanup after test
})
```
**Common withr functions:**
- `local_options()` - Temporarily set options
- `local_envvar()` - Temporarily set environment variables
- `local_tempfile()` - Create temp file with automatic cleanup
- `local_tempdir()` - Create temp directory with automatic cleanup
- `local_package()` - Temporarily attach package
### 3. Plan for Test Failure
Write tests assuming they will fail and need debugging:
- Tests should run independently in fresh R sessions
- Avoid hidden dependencies on earlier tests
- Make test logic explicit and obvious
### 4. Repetition is Acceptable
Repeat setup code in tests rather than factoring it out. Test clarity is more important than avoiding duplication.
### 5. Use `devtools::load_all()` Workflow
During development:
- Use `devtools::load_all()` instead of `library()`
- Makes all functions available (including unexported)
- Automatically attaches testthat
- Eliminates need for `library()` calls in tests
## Snapshot Testing
For complex output that's difficult to verify programmatically, use snapshot tests. See [references/snapshots.md](references/snapshots.md) for complete guide.
**Basic pattern:**
```r
test_that("error message is helpful", {
expect_snapshot(
error = TRUE,
validate_input(NULL)
)
})
```
Snapshots stored in `tests/testthat/_snaps/`.
**Workflow:**
```r
devtools::test() # Creates new snapshots
testthat::snapshot_review('name') # Review changes
tes>
Create and use brand.yml files for consistent branding across Shiny apps and Quarto documents. Covers: (1) Creating new _brand.yml files, (2) Applying to Shiny (R and Python), (3) Using in Quarto, (4) Modifying existing files, and (5) Troubleshooting. Includes complete specifications and integration guides.
Write ggsql queries — a grammar of graphics for SQL. Use when the user wants to create, modify, or understand a ggsql visualization query.
Creates a pull request from current changes, monitors GitHub CI, and debugs any failures until CI passes. Activate when the user says "create pr", "make a pr", "open pull request", "submit pr", "pr for these changes", or wants to get their current work into a reviewable PR. Assumes the project uses git, is hosted on GitHub, and has GitHub Actions CI with automated checks (lint, build, tests, etc.). Does NOT merge - stops when CI passes and provides the PR link.
Address PR review feedback by systematically working through every unresolved PR review thread on the current branch's PR - analyze each comment, make the requested code changes (with tests where useful), commit, and optionally reply and resolve.
Bulk resolve unresolved PR review threads on the current branch’s PR — typically after threads have been addressed manually or via /pr-threads-address
>
Guide for drafting issue closure and decline responses as an open-source package maintainer. Use when helping compose a reply that says \"no\" to a feature request, closes an issue as won't-fix, redirects a user to a different package, explains why a design choice is intentional, or otherwise declines or closes a community contribution. Also use when the maintainer needs to explain a deprecation, point out a user misunderstanding, or communicate an effort/scope tradeoff to a contributor.