ClaudeWave
Skill · 894 repo stars · updated 2d ago

project-development

The project-development skill guides decisions about LLM system architecture at the pipeline level, including whether an LLM suits the task, pipeline shape design, token and cost estimation, single versus multi-agent choices, and structured output contracts for downstream stages. Apply this skill when architecting entire projects or multi-stage pipelines rather than optimizing individual tools, agents, or context windows.

Install in Claude Code
git clone --depth 1 https://github.com/guanyang/open-agent-hub /tmp/project-development && cp -r /tmp/project-development/skills/project-development ~/.claude/skills/project-development
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Project Development Methodology

This skill covers the principles for identifying tasks suited to LLM processing, designing effective project architectures, and iterating rapidly using agent-assisted development. The methodology applies whether building a batch processing pipeline, a multi-agent research system, or an interactive agent application.

The unit of work for this skill is the whole project or a multi-stage pipeline. Individual tool design (descriptions, schemas, error messages) belongs to `tool-design`. Per-skill activation routing belongs to the corresponding skill plus the corpus index. This skill owns the project-level questions: whether to build this with an LLM at all, what shape the pipeline should take, what it costs, and how to iterate on it.

## When to Activate

Activate this skill when the unit of work is a whole project or pipeline:

- Deciding whether an LLM is the right primitive for a task at all (task-model fit before any code).
- Shaping a multi-stage batch or agent pipeline (acquire / prepare / process / parse / render).
- Estimating tokens, dollar cost, and timelines for an LLM-heavy project.
- Choosing between single-agent and multi-agent at the project level.
- Structuring agent-assisted iteration (where the agent helps build the project itself).
- Designing structured output at the pipeline contract level (cross-stage handoff format).

Do not activate this skill for adjacent work owned by other skills:

- Per-tool description, schema, naming, response format, error message: `tool-design`.
- Per-trajectory token-efficiency tactics (masking, partitioning, caching): `context-optimization`.
- Deciding to split work across sub-agents at the agent topology level: `multi-agent-patterns`.
- Designing the autonomous control loop (locked metrics, novelty gates, human approval boundaries): `harness-engineering`.

## Core Concepts

### Task-Model Fit Recognition

Evaluate task-model fit before writing any code, because building automation on a fundamentally mismatched task wastes days of effort. Run every proposed task through these two tables to decide proceed-or-stop.

**Proceed when the task has these characteristics:**

| Characteristic | Rationale |
|----------------|-----------|
| Synthesis across sources | LLMs combine information from multiple inputs better than rule-based alternatives |
| Subjective judgment with rubrics | Grading, evaluation, and classification with criteria map naturally to language reasoning |
| Natural language output | When the goal is human-readable text, LLMs deliver it natively |
| Error tolerance | Individual failures do not break the overall system, so LLM non-determinism is acceptable |
| Batch processing | No conversational state required between items, which keeps context clean |
| Domain knowledge in training | The model already has relevant context, reducing prompt engineering overhead |

**Stop when the task has these characteristics:**

| Characteristic | Rationale |
|----------------|-----------|
| Precise computation | Math, counting, and exact algorithms are unreliable in language models |
| Real-time requirements | LLM latency is typically too high for sub-second responses |
| Perfect accuracy requirements | Hallucination risk means 100% accuracy cannot be guaranteed |
| Proprietary data dependence | The model lacks necessary context and cannot acquire it from prompts alone |
| Sequential dependencies | Each step depends heavily on the previous result, compounding errors |
| Deterministic output requirements | Same input must produce identical output, which LLMs cannot guarantee |
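
The two tables above can be turned into a quick screening helper. The following is a minimal sketch, not part of the skill itself: the trait identifiers, the `screen_task` function, and the decision rule (any stop characteristic blocks the task) are assumptions made for illustration.

```python
# Trait names mirror the rows of the two tables; the identifiers are hypothetical.
PROCEED_TRAITS = {
    "synthesis_across_sources",
    "subjective_judgment_with_rubrics",
    "natural_language_output",
    "error_tolerance",
    "batch_processing",
    "domain_knowledge_in_training",
}
STOP_TRAITS = {
    "precise_computation",
    "real_time_requirements",
    "perfect_accuracy_requirements",
    "proprietary_data_dependence",
    "sequential_dependencies",
    "deterministic_output_requirements",
}


def screen_task(traits: set[str]) -> dict:
    """Classify a proposed task as 'proceed', 'stop', or 'unclear'."""
    unknown = traits - PROCEED_TRAITS - STOP_TRAITS
    if unknown:
        raise ValueError(f"Unknown traits: {sorted(unknown)}")

    supports = sorted(traits & PROCEED_TRAITS)
    blockers = sorted(traits & STOP_TRAITS)

    # Assumed rule: a single stop characteristic is enough to halt,
    # and a task with no listed characteristics needs more analysis.
    if blockers:
        decision = "stop"
    elif supports:
        decision = "proceed"
    else:
        decision = "unclear"
    return {"decision": decision, "supports": supports, "blockers": blockers}
```

For example, `screen_task({"batch_processing", "precise_computation"})` reports `stop` with `precise_computation` as the blocker. A real project may want a softer rule, such as treating some stop characteristics as risks to mitigate rather than hard blockers.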

### The Manual Prototype Step

Always validate task-model fit with a manual test before investing in automation. Copy one representative input into the model interface, evaluate the output quality, and use the result to answer these questions:

- Does the model have the knowledge required for this task?
- Can the model produce output in the format needed?
- What level of quality should be expected at scale?
- Are there obvious failure modes to address?

Do this because a failed manual prototype predicts a failed automated system, while a successful one provides both a quality baseline and a prompt-design template. The test takes minutes and prevents hours of wasted development.

### Pipeline Architecture

Structure LLM projects as staged pipelines because separation of deterministic and non-deterministic stages enables fast iteration and cost control. Design each stage to be:

- **Discrete**: Clear boundaries between stages so each can be debugged independently
- **Idempotent**: Re-running a completed stage leaves its output unchanged, preventing duplicate work
- **Cacheable**: Intermediate results persist to disk, avoiding expensive re-computation
- **Independent**: Each stage can run separately, enabling selective re-execution

**Use this canonical pipeline structure:**

```
acquire -> prepare -> process -> parse -> render
```

1. **Acquire**: Fetch raw data from sources (APIs, files, databases)
2. **Prepare**: Transform data into prompt format
3. **Process**: Execute LLM calls (the expensive, non-deterministic step)
4. **Parse**: Extract structured data from LLM outputs
5. **Render**: Generate final outputs (reports, files, visualizations)

Stages 1, 2, 4, and 5 are deterministic. Stage 3 is non-deterministic and expensive. Maintain this separation because it allows re-running the expensive LLM stage only when necessary, while iterating quickly on parsing and rendering.
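
The five stages can be sketched as plain functions, with the LLM call injected so the deterministic stages stay testable without a model. This is an illustrative sketch only: the function bodies, the summarization prompt, and the `fake_llm` stand-in are assumptions, not part of the skill.

```python
import json
from typing import Callable


def acquire(source: dict) -> dict:
    """Stage 1: fetch raw data (here the source is already an in-memory record)."""
    return {"id": source["id"], "text": source["text"]}


def prepare(raw: dict) -> str:
    """Stage 2: turn raw data into a prompt (deterministic)."""
    return f"Summarize the following text in one sentence.\n\n{raw['text']}"


def process(prompt: str, call_llm: Callable[[str], str]) -> str:
    """Stage 3: the expensive, non-deterministic LLM call."""
    return call_llm(prompt)


def parse(response: str) -> dict:
    """Stage 4: extract structured data from the model output (deterministic)."""
    return {"summary": response.strip()}


def render(parsed: dict) -> str:
    """Stage 5: produce the final artifact, here a JSON report."""
    return json.dumps(parsed, indent=2)


def run_pipeline(source: dict, call_llm: Callable[[str], str]) -> str:
    """Chain the stages: acquire -> prepare -> process -> parse -> render."""
    raw = acquire(source)
    prompt = prepare(raw)
    response = process(prompt, call_llm)
    parsed = parse(response)
    return render(parsed)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model client, used only for local testing."""
    return "  A stub summary.  "
```

Because `process` is the only stage that touches the model, changes to `parse` or `render` can be tested against a saved response without paying for another LLM call.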

### File System as State Machine

Use the file system to track pipeline state rather than databases or in-memory structures, because file existence provides natural idempotency and human-readable debugging.

```
data/{id}/
  raw.json         # acquire stage complete
  prompt.md        # prepare stage complete
  response.md      # process stage complete
  parsed.json      # parse stage complete
```
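
The directory layout above can drive stage selection directly. The sketch below assumes that each stage writes exactly one output file per item; the helper names `next_stage` and `run_stage_once` are hypothetical, and the temp-file-then-rename step is an added precaution so that a crashed run does not leave a partial file that looks complete.

```python
from pathlib import Path
from typing import Callable, Optional

# Output file that marks each stage as complete, in pipeline order.
STAGE_OUTPUTS = [
    ("acquire", "raw.json"),
    ("prepare", "prompt.md"),
    ("process", "response.md"),
    ("parse", "parsed.json"),
]


def next_stage(item_dir: Path) -> Optional[str]:
    """Return the first stage whose output file is missing, or None if all exist."""
    for stage, filename in STAGE_OUTPUTS:
        if not (item_dir / filename).exists():
            return stage
    return None


def run_stage_once(item_dir: Path, filename: str, produce: Callable[[], str]) -> bool:
    """Run `produce` only if the output file does not exist yet.

    Returns True if the stage ran, False if it was skipped.
    """
    out = item_dir / filename
    if out.exists():
        return False  # already done; skip the (possibly expensive) work

    item_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first, then rename into place.
    tmp = out.with_name(out.name + ".tmp")
    tmp.write_text(produce(), encoding="utf-8")
    tmp.replace(out)
    return True
```

Calling `run_stage_once` twice for the same item and file runs `produce` only the first time, which is what makes repeated pipeline runs cheap.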

Check if an item needs processing by checking whether the output file exists. Re-run a stage by deleting its