Skill · 284 repo stars · updated 4d ago

sciagent-skill-creator

sciagent-skill-creator scaffolds the boilerplate for new SciAgent skill entries (metadata fields and file structure), so authoring effort goes into substantive content such as workflows and recipes. Use it when adding a skill for a specific tool, library, database, or guide topic to a SciAgent-Skills repository, instead of hand-editing scaffold files.

Repository: SciAgent-Skills

Install in Claude Code

git clone --depth 1 https://github.com/jaechang-hits/SciAgent-Skills /tmp/sciagent-skill-creator && cp -r /tmp/sciagent-skill-creator/.claude/skills/sciagent-skill-creator ~/.claude/skills/sciagent-skill-creator

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# SciAgent Skill Creator

Repo-local scaffolder for `skills/` entries. Mechanizes the boilerplate from `CLAUDE.md` Steps 1, 2, 4, 5, 6 so authoring effort stays on *content* (When to Use, Workflow, Recipes, References) and not on field plumbing.

## When to invoke this skill

- User asks for a new SciAgent skill entry on a specific tool, library, database, or guide topic
- User invokes `/sciagent-skill-creator` directly
- The agent is about to hand-edit `registry.yaml` and create a `skills/<cat>/<name>/SKILL.md` from scratch — use this instead

Do **not** invoke for:
- Editing an existing entry's content (just edit the file)
- Migrating an existing entry (read `CLAUDE.md` "Migrating from Existing Entries" first — the scaffolder generates a skeleton, but migration requires content judgment)
- Updating `registry.yaml` only (use a normal edit)

## What you need to collect from the user

Before calling the scaffold script, gather these — in conversation, not via flags hidden from the user:

1. **Topic** — concrete tool/library/concept name. Reject vague topics ("ML stuff") with a clarifying question.
2. **Sub-type** — `pipeline` | `toolkit` | `database` | `guide`. Use the decision rule from CLAUDE.md Step 1b. If unsure, ask the user.
3. **Category** — primary category directory. List the table from `CLAUDE.md` Step 2 if the user is unsure.
4. **Entry name** — kebab-case slug. Convention: `{tool-name}-{purpose}` (e.g., `pydeseq2-differential-expression`). Confirm with the user.
5. **License** — underlying tool's license. Default to `CC-BY-4.0` for original prose-only content.
6. **Description** — 1-2 sentences, max 1024 chars. Lead with tool/domain keyword in the first 120 chars. Anti-patterns are in CLAUDE.md Step 5 "Description writing rules".
7. **Tags** (optional) — only if the entry meaningfully spans multiple categories (e.g., literature DB stored under `scientific-writing`, tag with `["databases", "literature"]`).
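
As a rough illustration, the collected values can be held in one structure and checked before anything is passed to the scaffold script. This is a sketch only: `SkillRequest`, `problems`, and the kebab-case regex are hypothetical names, not part of the repository, and the "keyword in the first 120 chars" test is one assumed reading of the description rule.

```python
import re
from dataclasses import dataclass, field

SUB_TYPES = {"pipeline", "toolkit", "database", "guide"}
# Assumed definition of kebab-case: lowercase alphanumeric runs joined by single hyphens.
KEBAB = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


@dataclass
class SkillRequest:
    """Hypothetical container for the values gathered from the user."""

    topic: str
    sub_type: str
    category: str
    name: str
    description: str
    license: str = "CC-BY-4.0"  # default for original prose-only content
    tags: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means ready to scaffold."""
        issues = []
        if self.sub_type not in SUB_TYPES:
            issues.append(f"unknown sub-type: {self.sub_type!r}")
        if not KEBAB.match(self.name):
            issues.append(f"name is not kebab-case: {self.name!r}")
        if not 0 < len(self.description) <= 1024:
            issues.append("description must be 1-1024 characters")
        # Assumption: "lead with the keyword" means the topic appears
        # somewhere in the first 120 characters of the description.
        if self.topic.lower() not in self.description[:120].lower():
            issues.append("topic keyword missing from first 120 chars")
        issues += [f"tag is not kebab-case: {t!r}" for t in self.tags if not KEBAB.match(t)]
        return issues
```

Any non-empty result from `problems()` is a cue to go back to the user with a clarifying question rather than to guess a value.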

## Duplicate check before scaffolding

Before calling the scaffold script, search the registry and `legacy/` for similar names:

```bash
grep -i "<topic-keyword>" registry.yaml
ls legacy/ | grep -i "<topic-keyword>"
```

If a near-duplicate exists, surface it to the user before continuing. Authoring a parallel entry usually means the existing one needs updating, not duplication.
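
The same check can be sketched in Python when the agent wants the hits as a list it can show to the user. `find_near_duplicates` is a hypothetical helper, not something the repository ships, and it does a plain case-insensitive substring match, which only approximates `grep -i` (no regex support).

```python
from pathlib import Path


def find_near_duplicates(keyword: str, registry_path: Path, legacy_dir: Path) -> list[str]:
    """Collect registry lines and legacy entry names that contain the keyword."""
    needle = keyword.lower()
    hits = []
    if registry_path.is_file():
        lines = registry_path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, 1):
            if needle in line.lower():
                hits.append(f"{registry_path.name}:{lineno}: {line.strip()}")
    if legacy_dir.is_dir():
        # Sorted so the output is stable across filesystems.
        for entry in sorted(legacy_dir.iterdir()):
            if needle in entry.name.lower():
                hits.append(f"{legacy_dir.name}/{entry.name}")
    return hits
```

An empty list means no obvious collision; anything else should be surfaced to the user before scaffolding.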

## How to run the scaffolder

Call `scripts/scaffold.py` with explicit arguments. The script is **non-interactive** — the agent provides all values:

```bash
python .claude/skills/sciagent-skill-creator/scripts/scaffold.py \
  --sub-type pipeline \
  --category genomics-bioinformatics \
  --name my-tool-purpose \
  --description "MyTool short-form description starting with the tool name. Brief on inputs, outputs, when to pick this over alternatives." \
  --license MIT \
  --tags databases,literature   # optional, comma-separated
```

Behavior:

1. Validates name (kebab-case, not already in `registry.yaml`, not in `legacy/`)
2. Validates category exists as a directory under `skills/`
3. Validates description with `validate_description.py` (length + first-120-char keyword lead)
4. Validates tags (kebab-case if provided)
5. Creates `skills/{category}/{name}/SKILL.md` from the matching template, substituting frontmatter fields
6. Appends a new entry to `registry.yaml` with `date_added` = today (UTC)
7. Runs `pixi run validate` to confirm the registry is still well-formed
8. Prints next steps (fill in Overview, Workflow, Recipes, References)

On any validation failure, the script aborts without writing anything. Fix the offending value and re-run.
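
The abort-without-writing behavior amounts to running every check before touching the filesystem. The sketch below shows that ordering only; it is not the actual `scaffold.py`. The function signature, the shape of the `checks` callables, and the registry entry layout are all assumptions, and the `pixi run validate` step is left out.

```python
from datetime import datetime, timezone
from pathlib import Path


def scaffold(skill_md: Path, registry: Path, name: str, body: str, checks) -> list[str]:
    """Run every check first; write files only if all of them pass.

    Each check is a callable taking the entry name and returning an error
    message, or None when the value is acceptable (hypothetical contract).
    """
    errors = []
    for check in checks:
        message = check(name)
        if message is not None:
            errors.append(message)
    if errors:
        return errors  # abort here: nothing has been written yet

    today = datetime.now(timezone.utc).date().isoformat()  # date_added in UTC
    skill_md.parent.mkdir(parents=True, exist_ok=True)
    skill_md.write_text(body, encoding="utf-8")
    with registry.open("a", encoding="utf-8") as fh:
        # Hypothetical entry layout; the real registry.yaml schema may differ.
        fh.write(f"- name: {name}\n  date_added: {today}\n")
    return []
```

Collecting all errors before returning, rather than stopping at the first, lets the agent fix every offending value in one pass before re-running.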

## After scaffolding

The generated SKILL.md is a **skeleton with placeholders**. The agent's remaining job:

1. Fill `Overview`, `When to Use`, `Prerequisites`, `Workflow` / `Core API` / `Key Concepts`, `Common Recipes`, `Troubleshooting`, `References`
2. Match the section structure required by the sub-type (see `CLAUDE.md` Step 4 format rules)
3. Run `pixi run test` — full suite, not just `validate` — to catch sub-type-specific structural failures (code block counts, table row counts, section presence)

The scaffold script does not pretend to write content. Content stays with the agent and the source material.

## Content authoring rules (what NOT to bake into a SKILL.md)

Skills document a tool's *analysis surface*, not the consumer's *house style*. A SKILL.md is read by many agents for many downstream tasks — visual choices that fit one analysis brief leak into every future invocation. Strip the following before committing:

- **Color palettes, cmaps, themes** — no hex codes (`#08306b`), no `LinearSegmentedColormap.from_list(...)`, no `ListedColormap([...])`, no prescribed `cmap=` arguments unless the cmap *is* the tool's API (e.g., a tool that ships its own palette). Let matplotlib pick defaults; the consumer overrides downstream.
- **Per-replicate / per-condition color dicts** — e.g., `colors = {"rep1": "#1f77b4", ...}`. Matplotlib auto-cycles colors.
- **Font choices, dpi presets, figure sizes tuned for one report** — `figsize=(8, 4)` for a routine line plot is fine; `figsize=(12, 4)` chosen to fit a slide deck is not.
- **One-shot user-brief specifics** — if the user asked for "blue for low, red for high" in *their* analysis, that belongs in their code, not the skill. The skill teaches *how to compute* phi/psi density; *how to color it* is consumer choice.
- **Hardcoded paths beyond the tool's defaults** — `"figures/"`, `"results/"`, `f"{pdb_id}_protein.pdb"` are fine as illustrative outputs; `"/Users/me/proj42/output"` is not.

What to keep: the analysis logic, the data shape, the units, the parameter semantics, the expected output *structure* (columns, axes, units), and any visual choice the tool itself enforces.

Rule of thumb: if a downstream consumer would *override* the choice, don't ship the choice in the skill.
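
A pre-commit pass over a draft can flag the most mechanical of these leaks. The sketch below is heuristic and not part of the repository: `find_style_leaks` and its patterns are assumptions about what a leak looks like, and a hit is only a prompt for review, since a `cmap=` argument is legitimate when the colormap is the tool's own API.

```python
import re

# Heuristic patterns; each one is an assumption, not a rule from CLAUDE.md.
LEAK_PATTERNS = {
    "hex color": re.compile(r"#[0-9a-fA-F]{6}\b"),
    "custom colormap": re.compile(
        r"\b(?:LinearSegmentedColormap\.from_list|ListedColormap)\s*\("
    ),
    "prescribed cmap": re.compile(r"\bcmap\s*="),
    "user-specific path": re.compile(r"""["'](?:/Users/|/home/)"""),
}


def find_style_leaks(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, label, stripped line) for each suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for label, pattern in LEAK_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label, line.strip()))
    return findings
```

Relative illustrative paths such as `"figures/"` pass untouched, while font, dpi, and figure-size choices are not covered and still need a human or agent judgment call.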

## Writing style: be succinct

A SKILL.md is reference material for agents, not a tutorial. Token cost matters — every line is paid for