CI Pipeline Optimizer
Optimize CI test pipelines through intelligent test splitting, parallelization, caching strategies, and selective test execution based on code changes.
git clone --depth 1 https://github.com/PramodDutta/qaskills /tmp/ci-pipeline-optimizer && cp -r /tmp/ci-pipeline-optimizer/seed-skills/ci-pipeline-optimizer ~/.claude/skills/ci-pipeline-optimizer
# CI Pipeline Optimizer
Slow CI pipelines are one of the most common productivity killers in software teams. A test suite that takes 30 minutes to run means developers context-switch away from their work, batch multiple changes into fewer PRs to avoid waiting, and eventually start skipping CI altogether. This skill addresses CI pipeline performance through four complementary strategies: intelligent test splitting across parallel workers, selective test execution based on code changes, aggressive caching of dependencies and build artifacts, and pipeline architecture that minimizes total wall-clock time. The techniques apply to any CI system but use GitHub Actions as the primary example, with patterns that transfer to GitLab CI, CircleCI, Jenkins, and other platforms.
## Core Principles
### 1. Wall-Clock Time Is the Only Metric That Matters
CPU time, billable minutes, and total test count are secondary metrics. The developer waits for wall-clock time. A pipeline that uses 60 CPU-minutes across 10 parallel workers in 6 minutes is vastly preferable to a pipeline that uses 20 CPU-minutes sequentially in 20 minutes. Optimize for the duration the developer experiences.
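To make the trade-off concrete, here is a minimal sketch that computes both numbers for the two pipelines described above. The `WorkerRun` shape and the `summarize` helper are illustrative assumptions, not part of any CI system's API, and the sketch assumes all workers start at the same moment.

```typescript
// Hypothetical shape: how long each parallel worker ran, in minutes.
interface WorkerRun {
  worker: string;
  minutes: number;
}

// CPU time is the sum across workers; wall-clock time is the slowest worker
// (assuming every worker starts at the same time).
function summarize(runs: WorkerRun[]): { cpuMinutes: number; wallClockMinutes: number } {
  const cpuMinutes = runs.reduce((sum, run) => sum + run.minutes, 0);
  const wallClockMinutes = runs.reduce((max, run) => Math.max(max, run.minutes), 0);
  return { cpuMinutes, wallClockMinutes };
}

const parallel = summarize(
  Array.from({ length: 10 }, (_, i) => ({ worker: `shard-${i}`, minutes: 6 }))
);
const sequential = summarize([{ worker: 'single', minutes: 20 }]);

console.log(parallel);   // { cpuMinutes: 60, wallClockMinutes: 6 }
console.log(sequential); // { cpuMinutes: 20, wallClockMinutes: 20 }
```

The parallel pipeline costs three times the CPU minutes, but the developer waits 6 minutes instead of 20.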
### 2. Never Run Tests You Do Not Need
If a PR changes only documentation files, running the full test suite is waste. If a PR modifies only the frontend, running backend integration tests is waste. Selective test execution identifies the minimum set of tests needed to validate a specific change, with a safety net that defaults to running everything when the analysis is uncertain.
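One way to picture the selection step with its safety net is the sketch below. The path prefixes, group names, and the `selectTestGroups` helper are hypothetical and chosen only for illustration; a real change map would reflect the repository's actual layout.

```typescript
// Hypothetical change map: path prefix -> test groups that must run.
type ChangeMap = Record<string, string[]>;

const ALL_GROUPS = ['unit', 'frontend', 'backend-integration', 'e2e'];

const changeMap: ChangeMap = {
  'docs/': [], // documentation changes need no tests
  'frontend/': ['unit', 'frontend', 'e2e'],
  'backend/': ['unit', 'backend-integration', 'e2e'],
};

function selectTestGroups(changedFiles: string[], map: ChangeMap): string[] {
  const selected = new Set<string>();
  for (const file of changedFiles) {
    const prefix = Object.keys(map).find((p) => file.startsWith(p));
    // Safety net: an unmapped file means the analysis is uncertain, so run everything.
    if (prefix === undefined) return [...ALL_GROUPS];
    for (const group of map[prefix]) selected.add(group);
  }
  // Return groups in a stable order.
  return ALL_GROUPS.filter((group) => selected.has(group));
}

console.log(selectTestGroups(['docs/README.md'], changeMap));       // []
console.log(selectTestGroups(['frontend/src/App.tsx'], changeMap)); // ['unit', 'frontend', 'e2e']
console.log(selectTestGroups(['package.json'], changeMap));         // all four groups
```

Falling back to the full suite on any unmapped path trades some speed for safety: a gap in the map costs time but never skips a needed test.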
### 3. Cache Everything That Does Not Change
Dependencies, build artifacts, Docker layers, and browser binaries are the same across most CI runs. Downloading and building them from scratch on every run is unnecessary. Aggressive caching can eliminate minutes from every pipeline execution.
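Caching hinges on a key that changes only when the cached content should change. The sketch below derives such a key from a lockfile's contents. The `cacheKey` helper and its key layout are assumptions made for illustration; GitHub Actions offers the built-in `hashFiles` expression for the same purpose.

```typescript
import { createHash } from 'crypto';

// Build a cache key that changes only when the lockfile contents change.
// The prefix and OS are included so unrelated caches never collide.
function cacheKey(prefix: string, os: string, lockfileContents: string): string {
  const digest = createHash('sha256').update(lockfileContents).digest('hex');
  return `${prefix}-${os}-${digest.slice(0, 16)}`;
}

const before = cacheKey('deps', 'linux', '{"lodash":"4.17.21"}');
const same = cacheKey('deps', 'linux', '{"lodash":"4.17.21"}');
const after = cacheKey('deps', 'linux', '{"lodash":"4.17.22"}');

console.log(before === same);  // true: unchanged lockfile, cache hit
console.log(before === after); // false: dependency bump, cache miss
```

In a real pipeline the lockfile contents would be read from disk, for example with `readFileSync`.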
### 4. Split Tests by Duration, Not by File Count
Naive test splitting distributes files evenly across workers. But if one file contains a 5-minute test and another contains a 5-second test, the distribution is severely unbalanced. Intelligent splitting uses historical duration data to distribute tests so that all workers finish at approximately the same time.
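A small sketch of the imbalance, using made-up file names and durations. The `roundRobin` and `shardSeconds` helpers are illustrative only, and the balanced split is written by hand here rather than computed.

```typescript
// Durations in seconds; the file names and numbers are invented for illustration.
const durations: Record<string, number> = {
  'checkout.e2e.ts': 300,
  'cart.test.ts': 5,
  'search.e2e.ts': 240,
  'utils.test.ts': 5,
};
const files = ['checkout.e2e.ts', 'cart.test.ts', 'search.e2e.ts', 'utils.test.ts'];

// Naive split: deal files out to workers in listing order.
function roundRobin(list: string[], workers: number): string[][] {
  const shards: string[][] = Array.from({ length: workers }, () => []);
  list.forEach((file, i) => shards[i % workers].push(file));
  return shards;
}

function shardSeconds(shard: string[]): number {
  return shard.reduce((sum, file) => sum + durations[file], 0);
}

const naive = roundRobin(files, 2);
// A duration-aware split of the same files, chosen by hand for this example.
const balanced = [['checkout.e2e.ts'], ['search.e2e.ts', 'cart.test.ts', 'utils.test.ts']];

console.log(naive.map(shardSeconds));    // [540, 10]
console.log(balanced.map(shardSeconds)); // [300, 250]
```

Both naive shards hold two files, yet the pipeline waits 540 seconds for the slower one. The duration-aware split finishes in 300 seconds.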
### 5. Fail Fast, Report Completely
Run the fastest tests first. If unit tests catch a bug in 30 seconds, there is no reason to wait 10 minutes for e2e tests to also catch it. Structure the pipeline so the cheapest checks (lint, type-check) run first to provide the fastest possible feedback, but still collect complete results from all shards for thorough reporting.
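The ordering rule can be sketched as follows. The `Stage` and `StageReport` types and the `runPipeline` function are hypothetical, and shards are modelled as synchronous functions for brevity. A real pipeline would spawn test processes.

```typescript
interface Stage {
  name: string;
  estimatedSeconds: number;
  // Each shard returns true on success (stand-in for a real test process).
  shards: Array<() => boolean>;
}

interface StageReport {
  name: string;
  shardResults: boolean[];
}

function runPipeline(stages: Stage[]): StageReport[] {
  const reports: StageReport[] = [];
  // Cheapest stage first, so the fastest feedback arrives first.
  const ordered = [...stages].sort((a, b) => a.estimatedSeconds - b.estimatedSeconds);
  for (const stage of ordered) {
    // Run every shard in the stage, even after one fails, so the report is complete.
    const shardResults = stage.shards.map((run) => run());
    reports.push({ name: stage.name, shardResults });
    // Fail fast: skip the more expensive stages once a stage has failed.
    if (shardResults.some((ok) => !ok)) break;
  }
  return reports;
}

const reports = runPipeline([
  { name: 'e2e', estimatedSeconds: 600, shards: [() => true, () => true] },
  { name: 'lint', estimatedSeconds: 20, shards: [() => true] },
  { name: 'unit', estimatedSeconds: 30, shards: [() => true, () => false, () => true] },
]);

console.log(reports.map((r) => r.name)); // ['lint', 'unit'] (e2e is skipped)
console.log(reports[1].shardResults);    // [true, false, true]
```

In GitHub Actions, a comparable effect is usually achieved with `needs:` between jobs and `fail-fast: false` on the shard matrix, so that one failing shard does not cancel its siblings.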
## Project Structure
```
ci-config/
  scripts/
    split-tests.ts
    detect-changes.ts
    cache-manager.ts
    timing-collector.ts
    pipeline-analyzer.ts
  config/
    test-groups.json
    change-map.json
    cache-config.json
  .github/
    workflows/
      ci-optimized.yml
      ci-selective.yml
      cache-warmup.yml
  package.json
  tsconfig.json
```
The `scripts/` directory contains automation tools for test splitting, change detection, cache management, and pipeline analysis. The `config/` directory holds the mapping from file changes to test groups, cache key definitions, and test group specifications.
## Test Splitting Strategies
### Duration-Based Test Splitting
Split tests across parallel workers based on historical execution times so that each worker runs for approximately the same duration:
```typescript
// scripts/split-tests.ts
import { readFileSync, writeFileSync, existsSync } from 'fs';
import { execSync } from 'child_process';

interface TestTiming {
  file: string;
  duration: number;
  lastRun: string;
}

interface SplitResult {
  shardIndex: number;
  files: string[];
  estimatedDuration: number;
}

export function splitTestsByDuration(
  testFiles: string[],
  shardCount: number,
  timingsFile: string = 'test-timings.json'
): SplitResult[] {
  // Load historical timings
  const timings = loadTimings(timingsFile);

  // Create a list of files with their estimated duration
  const fileTimings: Array<{ file: string; duration: number }> = testFiles.map((file) => ({
    file,
    duration: timings.get(file) || estimateDefaultDuration(file),
  }));

  // Sort by duration descending (greedy algorithm: assign largest jobs first)
  fileTimings.sort((a, b) => b.duration - a.duration);

  // Initialize shards
  const shards: SplitResult[] = Array.from({ length: shardCount }, (_, i) => ({
    shardIndex: i,
    files: [],
    estimatedDuration: 0,
  }));

  // Greedy assignment: always assign to the shard with the least total duration
  for (const fileTiming of fileTimings) {
    const lightest = shards.reduce((min, shard) =>
      shard.estimatedDuration < min.estimatedDuration ? shard : min
    );
    lightest.files.push(fileTiming.file);
    lightest.estimatedDuration += fileTiming.duration;
  }

  return shards;
}

function loadTimings(timingsFile: string): Map<string, number> {
  const timings = new Map<string, number>();
  if (!existsSync(timingsFile)) {
    return timings;
  }
  try {
    const data: TestTiming[] = JSON.parse(readFileSync(timingsFile, 'utf-8'));
    for (const entry of data) {
      timings.set(entry.file, entry.duration);
    }
  } catch {
    console.warn(`Failed to parse timings file: ${timingsFile}`);
  }
  return timings;
}

function estimateDefaultDuration(file: string): number {
  // Heuristic estimates (in milliseconds) based on test type when no historical data exists
  if (file.includes('.e2e.') || file.includes('e2e/')) return 30000;
  if (file.includes('.integration.') || file.includes('integration/')) return 10000;
  if (file.includes('.spec.')) return 5000;
  return 3000; // Default estimate for unit tests
}

// CLI entry point for use in GitHub Actions
function main(): void {
  const shardIndex = parseInt(process.env.SHARD_INDEX || '0', 10);
  const shardCount = parseInt(process.env.SHARD_COUNT || '1', 10);

  // Discover all test files
  const testFilesOutput = execSync(
    'find . -name "*.te