skill-creator

The skill-creator autonomously detects reusable patterns in work history (git diffs, attempt logs, tool sequences, notes) and packages them as skills without human oversight. Use it when you notice repeated tool combinations, code patterns that recur across attempts, or documented insights that should become packaged, testable skills.

Repository: CORAL

Install in Claude Code

git clone --depth 1 https://github.com/Human-Agent-Society/CORAL /tmp/skill-creator && cp -r /tmp/skill-creator/coral/template/skills/skill-creator ~/.claude/skills/skill-creator

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Skill Creator (Autonomous)

Create skills by analyzing your own work patterns — you are both creator and evaluator. No human input required at any step.

**Core loop:** analyze context → draft SKILL.md → generate test cases → run + grade → iterate → optimize description → package

---

## 1. Context Analysis

Before drafting, identify what skill to build and confirm it doesn't already exist.

### Pattern Detection

Scan these sources for repeated, reusable patterns:

- **Git diffs**: `git log --stat -10` and `git diff HEAD~5` — look for repeated file types, similar transformations, recurring helper scripts written independently across commits
- **Attempt history**: Read `.coral/attempts/` JSON files — which approaches recur? What tool sequences appear in multiple successful attempts?
- **Tool usage**: Review your own transcript — sequences of 3+ tool calls that repeat across tasks are skill candidates
- **Cross-episode notes**: Run `coral notes --read all` — patterns under "Patterns That Work" not yet captured as skills are prime candidates
- **Sibling techniques**: Check `.coral/graph_state/state.yaml` `siblings:` — if multiple agents converged on the same technique independently, it deserves a skill
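
The tool-usage heuristic above can be sketched in a few lines of Python. This is a minimal illustration, not part of CORAL: the `transcripts` shape (task id mapped to an ordered list of tool names), the function name, and the sample tool names are all assumptions made for the example.

```python
from collections import defaultdict


def repeated_sequences(transcripts, min_len=3, min_tasks=2):
    """Return tool-call windows of length min_len seen in at least min_tasks tasks.

    transcripts maps a task id to its ordered list of tool names
    (a hypothetical shape, not CORAL's actual transcript format).
    """
    seen_in = defaultdict(set)
    for task_id, calls in transcripts.items():
        # Slide a fixed-size window over each task's tool calls.
        for i in range(len(calls) - min_len + 1):
            seen_in[tuple(calls[i:i + min_len])].add(task_id)
    return {
        seq: sorted(tasks)
        for seq, tasks in seen_in.items()
        if len(tasks) >= min_tasks
    }


# Hypothetical sample data for illustration only.
transcripts = {
    "task-a": ["grep", "read", "edit", "test"],
    "task-b": ["read", "edit", "test", "commit"],
    "task-c": ["grep", "write"],
}
candidates = repeated_sequences(transcripts)
```

Here only the `read`, `edit`, `test` window recurs across two tasks, so it is the one skill candidate. Counting distinct tasks rather than raw occurrences keeps a loop inside a single task from looking like a reusable pattern.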

### Deduplication Check

Before creating a new skill, check existing skills:

```
coral skills
```

Read each relevant `SKILL.md` frontmatter. If an existing skill has 70%+ overlap with your candidate, **update that skill** instead of creating a new one.
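
The text does not say how the 70% overlap is measured. One possible reading, assumed here purely for illustration, is the share of the candidate description's word tokens that also appear in an existing skill's description. The function names and sample skills below are hypothetical.

```python
import re


def token_set(text):
    """Lowercase alphanumeric word tokens of a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(candidate, existing):
    """Fraction of the candidate's tokens that also appear in the existing description."""
    cand = token_set(candidate)
    if not cand:
        return 0.0
    return len(cand & token_set(existing)) / len(cand)


def best_match(candidate, existing_skills, threshold=0.7):
    """Return (name, score) of the closest existing skill at or above threshold, else None."""
    scored = [(name, overlap(candidate, desc)) for name, desc in existing_skills.items()]
    scored = [item for item in scored if item[1] >= threshold]
    return max(scored, key=lambda item: item[1]) if scored else None


# Hypothetical existing skills, keyed by name with their frontmatter descriptions.
existing = {
    "csv-report": "Generate summary reports from CSV files",
    "git-helper": "Automate git branch cleanup",
}
match = best_match("Generate reports from CSV files", existing)
```

A match means the candidate should be folded into that skill rather than created separately. Token overlap is crude (it ignores synonyms and word order), so treat a near-threshold score as a prompt to read both descriptions rather than as a verdict.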

### Output

Produce a structured spec before writing:

```
Skill name: <kebab-case>
Purpose: <what it enables, one sentence>
Triggers: <when should this skill activate>
Output format: <what the skill produces>
Source evidence: <which patterns/diffs/insights led to this>
```
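
If the spec is produced by code rather than by hand, a small data class can enforce the kebab-case name early. The sketch below is an assumption about how one might represent the spec; the class name, attribute names, and regular expression are illustrative, while the rendered labels follow the template above.

```python
import re
from dataclasses import dataclass

# Lowercase letters and digits, separated by single hyphens.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


@dataclass
class SkillSpec:
    name: str
    purpose: str
    triggers: str
    output_format: str
    source_evidence: str

    def __post_init__(self):
        if not KEBAB_CASE.match(self.name):
            raise ValueError(f"skill name must be kebab-case: {self.name!r}")

    def render(self):
        """Format the spec using the field labels from the template."""
        return "\n".join([
            f"Skill name: {self.name}",
            f"Purpose: {self.purpose}",
            f"Triggers: {self.triggers}",
            f"Output format: {self.output_format}",
            f"Source evidence: {self.source_evidence}",
        ])
```

Constructing `SkillSpec(name="CSV Report", ...)` raises `ValueError`, which catches a bad name before any files are written.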

---

## 2. Write the SKILL.md

Based on your context analysis, draft the skill.

### Skill Writing Guide

#### Anatomy of a Skill

```
skill-name/
├── SKILL.md (required)
│   ├── YAML frontmatter (name, description required)
│   └── Markdown instructions
└── Bundled Resources (optional)
    ├── scripts/    - Executable code for deterministic/repetitive tasks
    ├── references/ - Docs loaded into context as needed
    └── assets/     - Files used in output (templates, icons, fonts)
```

#### Progressive Disclosure

Skills use a three-level loading system:
1. **Metadata** (name + description) - Always in context (~100 words)
2. **SKILL.md body** - In context whenever skill triggers (<500 lines ideal)
3. **Bundled resources** - As needed (unlimited, scripts can execute without loading)

These sizes are approximate; feel free to go longer if needed.

**Key patterns:**
- Keep SKILL.md under 500 lines; if you are approaching this limit, add another layer of hierarchy with clear pointers to where the model using the skill should look next.
- Reference files clearly from SKILL.md with guidance on when to read them
- For large reference files (>300 lines), include a table of contents
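
The required frontmatter fields and the 500-line guideline lend themselves to an automated check. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the frontmatter parser is deliberately simplistic (flat `key: value` pairs only, no nested YAML, lists, or multi-line values).

```python
def parse_frontmatter(text):
    """Parse flat `key: value` pairs between the leading `---` markers.

    Deliberately simplistic: no nested YAML, lists, or multi-line values.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing marker never found


def lint_skill(text, max_lines=500):
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems = []
    fields = parse_frontmatter(text)
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"frontmatter is missing '{required}'")
    line_count = len(text.splitlines())
    if line_count >= max_lines:
        problems.append(f"SKILL.md has {line_count} lines; keep it under {max_lines}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a single pass report everything that needs fixing.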

**Domain organization**: When a skill supports multiple domains/frameworks, organize by variant:
```
cloud-deploy/
├── SKILL.md (workflow + selection)
└── references/
    ├── aws.md
    ├── gcp.md
    └── azure.md
```
Claude reads only the relevant reference file.

#### Principle of Lack of Surprise

Skills must not contain malware, exploit code, or any content that could compromise system security. If a skill's contents were described to the user, nothing about their intent should come as a surprise.

#### Naming and Description

- Use kebab-case for skill name and directory
- The `description` field is the primary triggering mechanism. Include both what the skill does AND specific contexts for when to use it
- Make descriptions slightly "pushy" to combat under-triggering. Instead of "How to build a dashboard", write "How to build a dashboard. Use this skill whenever the user mentions dashboards, data visualization, internal metrics, or wants to display any kind of data, even if they don't explicitly ask for a 'dashboard.'"
- The description will be programmatically optimized in step 7 — write a good first draft but don't agonize over it

#### Writing Patterns

Prefer using the imperative form in instructions.

**Defining output formats:**
```markdown
## Report structure
ALWAYS use this exact template:
# [Title]
## Executive summary
## Key findings
## Recommendations
```

**Examples pattern:**
```markdown
## Commit message format
**Example 1:**
Input: Added user authentication with JWT tokens
Output: feat(auth): implement JWT-based authentication
```

#### Writing Style

Explain to the model **why** things are important rather than relying on heavy-handed MUSTs. Use theory of mind and make the skill general rather than narrow to specific examples. Write a draft, then review it with fresh eyes and improve it.

---

## 3. Generate Test Cases

Create 3-5 test cases derived from the real contexts that triggered your pattern detection.

### Test Case Design

- **Simple case**: The canonical, straightforward application of the skill
- **Complex case**: Multiple interacting aspects, larger input, more steps
- **Edge case**: Unusual input, boundary conditions, minimal context
- **Counter-examples** (1-2): Near-miss scenarios where the skill should NOT apply — these prevent overfitting
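
A quick sanity check on the mix of test cases might look like the sketch below. It assumes each case is tagged with one of four kind labels, which is a convention invented for this example, and it reads the "1-2" counter-example guidance as a requirement.

```python
from collections import Counter

# Hypothetical kind labels, one per category in the list above.
KINDS = {"simple", "complex", "edge", "counter-example"}


def check_test_mix(kinds):
    """Check a list of test-case kinds against the design guidance."""
    problems = []
    unknown = sorted(set(kinds) - KINDS)
    if unknown:
        problems.append(f"unknown kinds: {unknown}")
    if not 3 <= len(kinds) <= 5:
        problems.append(f"expected 3-5 test cases, got {len(kinds)}")
    counter_examples = Counter(kinds)["counter-example"]
    if not 1 <= counter_examples <= 2:
        problems.append(f"expected 1-2 counter-examples, got {counter_examples}")
    return problems
```

For example, `check_test_mix(["simple", "complex", "edge", "counter-example"])` returns an empty list, while a suite made only of simple cases is flagged for lacking a counter-example.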

### Assertions

Write 2-4 assertions per test case upfront. Good assertions are:
- **Objectively verifiable** — a script or grader can check them unambiguously
- **Discriminating** — they should fail without the skill (or with a bad skill) and pass with a good one
- **Descriptive** — assertion text should read clearly in benchmark output
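
To make "objectively verifiable" concrete, here is a sketch of two assertions expressed as script checks. The assertion texts match ones used in this skill's `evals.json` example; the function names and the idea of keying results by assertion text are assumptions for illustration, not CORAL's grader API.

```python
import json


def file_contains_valid_json(path):
    """True if path exists and parses as JSON."""
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
        return True
    except (OSError, ValueError):  # missing file, bad encoding, or invalid JSON
        return False


def has_required_fields(path, required):
    """True if the JSON object at path contains every key in required."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        return False
    return isinstance(data, dict) and all(key in data for key in required)


def grade(path, required):
    """Map each assertion's text to a pass/fail result so benchmark output stays readable."""
    return {
        "Output file exists and contains valid JSON": file_contains_valid_json(path),
        "All required fields are present": has_required_fields(path, required),
    }
```

Each check returns a plain boolean with no judgment involved, and both fail when the output file is missing, which is what makes them discriminating against a run that never produced output.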

### Save to evals/evals.json

```json
{
  "skill_name": "my-skill",
  "evals": [
    {
      "id": 1,
      "prompt": "Task prompt derived from real pattern",
      "expected_output": "Description of expected result",
      "files": [],
      "expectations": [
        "Output file exists and contains valid JSON",
        "All required fields are present",
        "Pro

```

More from this repository

coral-debug (Skill)

Verify and debug changes to CORAL itself — smallest reproduce loop per area (grader / daemon / CLI / hooks / manager / workspace / hub / template / config / web), where to look when something breaks (hung graders, agent restart loops, stalled agents, missing heartbeat actions, corrupted shared state, broken worktree symlinks, grader import errors, wrong-task resume), how to inspect a live or finished run under `.coral/public/`, and the canonical lint/test commands. Use when editing code under `coral/` or chasing a CORAL bug, NOT when adding a new task or extending the framework.

coral-extend (Skill)

Add a new component to the CORAL framework itself — a new agent runtime under `coral/agent/builtin/` (claude_code/codex/cursor_agent style), a new CLI command in `coral/cli/`, a new bundled skill or subagent template under `coral/template/skills/` or `coral/template/agents/`, a new hook in `coral/hooks/`, a new field in `coral/config.py`, or a framework-level extension to the grader stack under `coral/grader/`. NOT for writing a per-task grader or adding an example task — use `coral-new-task` for that. NOT for debugging existing code — use `coral-debug`.

coral-new-task (Skill)

End-to-end recipe for adding a new task under `examples/` — the three pieces that have to line up (`task.yaml`, `seed/`, and `grader/`), what to put in each, the `TaskGrader` API surface, the `coral validate` → smoke-test loop, and the common mistakes (repo_path pointing at the wrong dir, score direction backwards, hidden answer keys leaking into seed/, grader writing to codebase_path which the daemon force-removes, private-vs-public confusion, missing `run()` signature). Use whenever the user wants to add a new CORAL task or port an existing benchmark into CORAL.

deep-research (Skill)

Research the problem domain before coding. Web search for techniques, save raw sources, write structured findings, update the index.

organize-files (Skill)

Organize the shared notes directory when it becomes hard to navigate. Restructure within research/ and experiments/, deduplicate, update index.md.

coral-quickstart (Skill)

The fast path from zero to a running CORAL experiment — what CORAL is and when to reach for it, installing the `coral` CLI, registering a runtime with `coral setup`, and the `.coral_workspace/` convention for pointing CORAL at code you already have and want optimized. Use this whenever the user asks "what is coral", "should I use coral for this", wants to install or get coral set up, hits a "command not found" for coral or doesn't have it installed yet, or says "use coral to optimize / speed up / improve this code" and you need the end-to-end onboarding from install to a launched run. Hands off to `setting-up-coral` (runtime bindings), `creating-a-coral-task` (grader authoring), and `running-coral-experiments` (operating a run) for depth.

creating-a-coral-task (Skill)

Author a new CORAL task — the three pieces that must line up (`task.yaml`, `seed/`, a packaged `grader/`), the `coral init` → `coral validate` → smoke-test loop, and how to pick a grader pattern (stdout float, test pass-rate, ratio-vs-baseline, multi-metric, or an LLM rubric judge). Use whenever the user wants to create a CORAL task, write or wire a grader, port a benchmark into CORAL, score open-ended outputs (reports/memos) with a judge, or debug a grader that crashes on the seed / ranks the leaderboard backwards / leaks the answer key. Deep references for the TaskGrader API, grader patterns, rubric judges, and the full task.yaml schema live alongside this skill.

running-coral-experiments (Skill)

Run and manage CORAL experiments from the operator side — launch agents with `coral start` (dotlist overrides, model/count, tmux vs local), monitor with `coral status` / `coral log` / `coral show` / the web dashboard, and drive the loop with `coral resume` (inject instructions, fork from an attempt), `coral heartbeat` (tune reflection cadence), and `coral stop`. Use whenever the user wants to start a CORAL run, check on agents, read scores/leaderboard, steer or resume a run, diagnose agents that keep restarting or fail every eval, scale to more agents or islands, or stop a run. Deep references for steering/heartbeat tuning and scaling/troubleshooting live alongside this skill.