Skip to main content
ClaudeWave
Skill1.1k estrellas del repoactualizado 24d ago

scoring-checks

The scoring-checks skill adds a new deterministic scoring check to evaluate a specific aspect of AI agent config quality. Use this when adding a scoring criterion that measures filesystem state (file existence, content structure, token counts), registering it in the constants and index files, and ensuring all point values reference predefined constants rather than hardcoded numbers. Do not use for UI changes or refactoring existing scoring logic.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/caliber-ai-org/ai-setup /tmp/scoring-checks && cp -r /tmp/scoring-checks/skills/scoring-checks ~/.claude/skills/scoring-checks
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

# Adding a Scoring Check

Add a new deterministic check that evaluates a single aspect of AI agent config quality. All checks must be filesystem-based with no network calls or LLM inference.

## Critical

- **Check must be deterministic**: Same filesystem state → same result every time. No randomness, no external APIs.
- **Point values come from constants.ts**: Every `earnedPoints` and `maxPoints` must reference `POINTS_*` from `src/scoring/constants.ts`. Do NOT hardcode numbers.
- **Always return `Check[]` array**: Export a function `check<Category>(dir: string): Check[]` where category is one of: `existence`, `quality`, `grounding`, `accuracy`, `freshness`, `bonus`.
- **Every check must have**: `id` (kebab-case, unique), `name`, `category`, `maxPoints`, `earnedPoints`, `passed`, `detail`, and optional `suggestion`/`fix`.
- **Fix object fields**: `action` (string describing what to do), `data` (context for the fix), `instruction` (user-facing guidance).
- **Register in src/scoring/index.ts**: Add the import and spread the result into the `allChecks` array in `computeLocalScore()`.
- **Target filtering**: If the check is platform-specific (Claude-only, Cursor-only, etc.), add its ID to the appropriate `*_ONLY_CHECKS` set in `constants.ts`.

## Instructions

### Step 1: Define point constants in src/scoring/constants.ts

Verify before proceeding: Is your check measurable with a numeric point value?

Add constants below the appropriate category section (existence, quality, grounding, accuracy, freshness, bonus):

```typescript
// In the appropriate CATEGORY section, e.g., Quality checks (25 pts):
export const POINTS_YOUR_CHECK_NAME = 4; // 1-12 pts typical

// If threshold-based, add a companion array:
export const YOUR_THRESHOLD_ARRAY = [
  { minValue: 10, points: 4 },
  { minValue: 5, points: 2 },
] as const;
```

Check existing patterns: Token budgets use `TOKEN_BUDGET_THRESHOLDS`, code blocks use `CODE_BLOCK_THRESHOLDS`, concreteness uses `CONCRETENESS_THRESHOLDS`.

Verify: Review `CATEGORY_MAX` object to ensure your check fits within its category's point budget.

### Step 2: Create or edit check function in src/scoring/checks/

Choose the file based on category. Each file exports a `check<Name>(dir: string): Check[]` function:

- `existence.ts` — files/directories exist (CLAUDE.md, .cursorrules, skills, MCP servers)
- `quality.ts` — config structure, size, clarity (code blocks, token budget, concreteness, duplicates)
- `grounding.ts` — references to actual project files/directory structure
- `accuracy.ts` — validity of references, git-based config drift
- `freshness.ts` — git commit-based staleness, secrets, permissions
- `bonus.ts` — hooks, learned content, OpenSkills format
- `sources.ts` — source configuration and usage

Create the function following this structure:

```typescript
import type { Check } from '../index.js';
import {
  POINTS_YOUR_CHECK,
  YOUR_THRESHOLD_ARRAY,
} from '../constants.js';
import { readFileOrNull } from '../utils.js'; // or other helpers

export function checkYourCategory(dir: string): Check[] {
  const checks: Check[] = [];

  // 1. Measure something concrete
  const yourMetric = /* e.g., countFiles(), validatePaths(), etc. */;
  const threshold = YOUR_THRESHOLD_ARRAY.find(t => yourMetric >= t.minValue);
  const earnedPts = threshold?.points ?? 0;

  checks.push({
    id: 'your_unique_check_id',
    name: 'Human-readable check name',
    category: 'quality', // matches function context
    maxPoints: POINTS_YOUR_CHECK,
    earnedPoints: earnedPts,
    passed: earnedPts >= Math.ceil(POINTS_YOUR_CHECK * 0.6), // or custom logic
    detail: `${earnedPts}/${POINTS_YOUR_CHECK} points — ${yourMetric} items found`,
    suggestion: earnedPts >= POINTS_YOUR_CHECK ? undefined : 'Action to improve',
    fix: earnedPts >= POINTS_YOUR_CHECK ? undefined : {
      action: 'verb_noun', // e.g., 'add_code_blocks', 'fix_references'
      data: { currentValue: yourMetric, targetValue: 10 },
      instruction: 'Specific, actionable guidance for the user.',
    },
  });

  return checks;
}
```

Verify ID uniqueness: Run `grep -r "'your_unique_check_id'" src/scoring/checks/` — should return only your new check.

### Step 3: Handle platform-specific filtering (if applicable)

If your check only applies to certain agents (Claude, Cursor, Codex, GitHub Copilot), register it in `src/scoring/constants.ts`:

```typescript
// Add to the appropriate set:
export const CLAUDE_ONLY_CHECKS = new Set([
  'claude_md_exists',
  'your_new_check_id', // ← add here
  'claude_rules_exist',
]);
```

Available sets (update exactly one if applicable):
- `CLAUDE_ONLY_CHECKS` — Claude Code targets
- `CURSOR_ONLY_CHECKS` — Cursor targets
- `CODEX_ONLY_CHECKS` — Codex/OpenCode targets
- `COPILOT_ONLY_CHECKS` — GitHub Copilot targets
- `BOTH_ONLY_CHECKS` — Both Claude AND Cursor (cross-platform parity)
- `NON_CODEX_CHECKS` — Everything except Codex/OpenCode
- `CLAUDE_OR_CODEX_CHECKS` — Claude OR Codex

Verify filtering: Examine `filterChecksForTarget()` in `src/scoring/index.ts` to ensure your category will work correctly for your target agents.

### Step 4: Register in src/scoring/index.ts

Import your function at the top:

```typescript
import { checkYourCategory } from './checks/your-file.js';
```

Add to `computeLocalScore()` inside the `allChecks` array initialization:

```typescript
export function computeLocalScore(dir: string, targetAgent?: TargetAgent): ScoreResult {
  const target = targetAgent ?? detectTargetAgent(dir);

  const allChecks: Check[] = [
    ...checkExistence(dir),
    ...checkQuality(dir),
    ...checkGrounding(dir),
    ...checkAccuracy(dir),
    ...checkYourCategory(dir), // ← ADD HERE IN ORDER
    ...checkFreshness(dir),
    ...checkBonus(dir),
    ...checkSources(dir),
  ];
  // ... rest of function
}
```

Verify registration: Run `npm test src/scoring/__tests__/accuracy.test.ts` (or similar) — all existing tests should still pass.

### Step 5: Write deterministic unit te
adding-a-commandSkill

Creates a new CLI command following the Commander.js pattern in src/commands/. Handles command registration in src/cli.ts, telemetry tracking via tracked() wrapper, and option parsing. Use when user says add command, new CLI command, create subcommand, or adds files to src/commands/. Do NOT use for modifying existing commands or fixing bugs in existing commands.

caliber-testingSkill

Writes Vitest tests following project patterns: __tests__/ directories, vi.mock() for module mocking with vi.hoisted() for test-time factories, global LLM mock from src/test/setup.ts, environment variable save/restore in beforeEach/afterEach, vi.clearAllMocks() lifecycle, and test file organization. Use when user says 'write tests', 'add test coverage', 'test this', creates *.test.ts files, or when test failures appear in CI. Do NOT use for non-test code or for debugging without writing tests.

find-skillsSkill

Discovers and installs community skills from the public registry. Use when the user mentions a technology, framework, or task that could benefit from specialized skills not yet installed, asks 'how do I do X', 'find a skill for X', or starts work in a new technology area. Proactively suggest when the user's task involves tools or frameworks without existing skills.

llm-providerSkill

Adds a new LLM provider implementing LLMProvider interface with call() and stream() methods. Integrates with provider factory in src/llm/index.ts, config detection in src/llm/config.ts, and error handling via tracking and recovery. Use when adding a new model backend, integrating a third-party LLM API, or extending LLM platform support. Do NOT use for fixing bugs in existing providers, modifying existing provider behavior, or changing the LLMProvider interface.

save-learningSkill

Saves user instructions as persistent learnings for future sessions. Use when the user says 'remember this', 'always do X', 'from now on', 'never do Y', or gives any instruction they want persisted across sessions. Proactively suggest when the user states a preference, convention, or rule they clearly want followed in the future.

setup-caliberSkill

Sets up Caliber for automatic AI agent context sync. Installs pre-commit hooks so CLAUDE.md, Cursor rules, and Copilot instructions update automatically on every commit. Use when Caliber hooks are not yet installed or when the user asks about keeping agent configs in sync.

writers-patternSkill

Add a new platform writer module in src/writers/ that generates and writes agent config files for a supported platform. Each writer exports a function that accepts a config interface, creates directories (rules/, skills/, mcp configs), writes files with proper formatting and frontmatter, and returns string[] of written file paths. Use when adding platform support for a new agent, integrating a new code AI tool, or extending caliber to support new targets. Do NOT use for modifying existing writers, refactoring scoring logic, or changing how writers are invoked.