ClaudeWave
Skill · 116 repo stars · updated 5d ago

agentic-engineering

Design or refactor agent skills, workflows, and operating loops for model-native Agentic Engineering. Use when making skills more autonomous, concise, verifiable, long-horizon capable, token-efficient, and lower-friction for human-LLM collaboration.

Install in Claude Code
git clone --depth 1 https://github.com/Mark393295827/third-brain-v5-skills /tmp/agentic-engineering && cp -r /tmp/agentic-engineering/skills/agentic-engineering ~/.claude/skills/agentic-engineering
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Agentic Engineering

Refactor a skill or workflow so the model can execute more work with less human steering, while preserving verification, provenance, security, and the professional quality ceiling.

Agentic Engineering is the step after vibe coding: speed is useful only if quality does not degrade. Humans keep ownership of taste, judgment, architecture, and risk boundaries; agents execute bounded macro actions with evidence.

## Agent Understanding Model

Use the Karpathy-style LLM OS mapping as the design baseline:

| OS concept | Agent workflow meaning |
|---|---|
| LLM = CPU | The model runs the next reasoning/action loop. |
| Context = RAM | Load only the state needed for the next step. |
| Wiki/logs = Disk | Durable knowledge lives outside chat memory. |
| Tools = System calls | Actions must have contracts, permissions, and evidence. |
| Loop = Scheduler | Plan -> Act -> Observe -> Iterate controls work. |

The agent is a process with state, tools, permissions, and write-back duties. Do not design skills as advice pages; design them as executable control loops.
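
As a rough illustration of the mapping above, the control loop can be sketched in Python. This is a minimal sketch under assumptions, not a reference implementation: `AgentProcess`, `run_loop`, and the `plan` / `is_done` callbacks are hypothetical names invented for this example, and plain functions stand in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentProcess:
    """Hypothetical agent process: allowlisted tools, working context, durable log."""
    tools: dict[str, Callable[[str], str]]            # tools = "system calls"
    context: list[str] = field(default_factory=list)  # context = "RAM"
    log: list[str] = field(default_factory=list)      # stand-in for wiki/logs = "disk"

    def call_tool(self, name: str, arg: str) -> str:
        # Permission check: only tools named up front may run.
        if name not in self.tools:
            raise PermissionError(f"tool not allowed: {name}")
        result = self.tools[name](arg)
        # Write-back duty: record evidence of every call.
        self.log.append(f"{name}({arg!r}) -> {result!r}")
        return result


def run_loop(agent, plan, is_done, max_steps=5):
    """Plan -> Act -> Observe -> Iterate, bounded by max_steps. Returns True if done."""
    for _ in range(max_steps):
        tool, arg = plan(agent.context)            # Plan
        observation = agent.call_tool(tool, arg)   # Act
        agent.context.append(observation)          # Observe
        if is_done(agent.context):                 # Iterate or stop
            return True
    return False


# Usage with a toy tool standing in for a real action.
agent = AgentProcess(tools={"echo": lambda s: s.upper()})
finished = run_loop(
    agent,
    plan=lambda ctx: ("echo", f"step{len(ctx)}"),
    is_done=lambda ctx: len(ctx) >= 2,
)
print(finished, agent.context)  # True ['STEP0', 'STEP1']
```

The step bound and the permission check are the parts that make this a control loop rather than an advice page: the loop cannot run forever, and it cannot call a tool that was not declared.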

## Full-Stack Agent Pattern

Use the Google I/O '26 wiki update as the new maturity signal: useful agents are moving from chat boxes into full-stack product surfaces.

| Surface | Agentic engineering requirement |
|---|---|
| Developer IDE or CLI | subagents, hooks, async queues, tests, diffs, task state |
| Personal agent | user intent, memory boundary, tool allowlist, resumable tasks |
| Agentic search | source grounding, comparison criteria, reversible action preview |
| Agentic commerce | explicit mandate, budget, payment boundary, audit trail |
| Generative media | provenance, edit history, watermark or disclosure path |
| Ambient device | sensor boundary, privacy mode, interrupt and fallback controls |

Do not treat these as separate prompt styles. They are one full-stack agent design problem: model -> context -> tool calls -> product surface -> verification -> governance.

## Workflow Complexity Gate

Before adding orchestration, classify the work at the lowest sufficient level:

| Level | Use when | Gate before upgrading |
|---|---|---|
| Prompt | One-off answer, small edit, short analysis | Is the work repeating? |
| Skill | Reusable workflow or domain method | Does it need isolated execution? |
| Subagent | Independent side task or context isolation | Does it need communication or shared state? |
| Agent team | Multiple roles must coordinate or review each other | Can file ownership, IPC, and join gates be defined? |
| Long-running goal | Depth problem: iterate until objective criteria pass | Is `done` externally verifiable and budgeted? |
| Dynamic workflow | Width problem: many independent shards can run in parallel | Are script review, cost envelope, and runtime observability in place? |

Default to the lower level when value, independence, or verification is unclear. Higher orchestration increases token cost, permission surface, and audit burden.
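
One way to read the first four rows of the table is as a ladder: answer each gate in order and stop at the first "no" or "unclear". The sketch below assumes that reading. The gate keys (`repeating`, `needs_isolation`, `needs_shared_state`) and the function name are hypothetical, and the last two rows are left out because depth and width problems are not strictly ordered.

```python
# Each entry: (level to stay at, gate that must be answered "yes" to move up).
# Gate names are illustrative shorthand for the questions in the table above.
GATES = [
    ("Prompt", "repeating"),             # Is the work repeating?
    ("Skill", "needs_isolation"),        # Does it need isolated execution?
    ("Subagent", "needs_shared_state"),  # Does it need communication or shared state?
]


def lowest_sufficient_level(answers: dict[str, bool]) -> str:
    """Return the lowest level whose upgrade gate is not clearly passed."""
    for level, gate in GATES:
        # A missing answer counts as "unclear", so the lower level wins.
        if not answers.get(gate, False):
            return level
    return "Agent team"


print(lowest_sufficient_level({"repeating": True}))  # Skill
```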

## Usage Template

**Prompt**
```text
Use agentic-engineering to revise this skill/workflow. Make it model-native, concise, autonomous, verifiable, and long-horizon capable.
```

**Use Case**
- Improving an existing skill, SOP, command, agent workflow, or multi-agent plan.

**Expected Result**
- A compact agentic workflow with clear defaults, state checkpoints, verification gates, and escalation rules.

**Output Example**
- A revised `SKILL.md` with model assumptions, autonomous execution loop, quality gates, and anti-patterns.

**Verification Case**
- A fresh agent can run the workflow without repeated clarification and can prove completion with evidence.

**Verified Effect**
- Human coordination load drops; the LLM spends fewer tokens asking for obvious decisions and more tokens executing, checking, and learning.

## Success Metrics

- Revised workflow has one trigger, one bounded macro action, one state checkpoint, one verification gate, and one write-back target.
- The skill or workflow can be executed without extra clarification for its primary use case.
- Residual human judgment points are explicit rather than hidden in vague prose.

## Model Meta-Properties

Design around the model as it actually behaves, not around a human-assistant metaphor.

| Meta-property | Skill design response |
|---|---|
| Context is scarce working memory | Keep `SKILL.md` short; move examples/details to references; load only what is needed. |
| Output is probabilistic | Require tests, citations, diffs, screenshots, or link checks before claims. |
| Tool use is the action layer | Name allowed tools, denied actions, and idempotent retries. |
| Long-horizon drift is normal | Add checkpoints, state files, stop criteria, and recovery paths. |
| Durable knowledge is external | Persist reusable results into wiki, docs, logs, or state files. |
| The model is strong at synthesis | Do not over-explain generic concepts; specify local constraints and unusual rules. |
| The model can over-ask | Provide safe defaults and ask only for irreversible, high-risk, or genuinely ambiguous decisions. |
| Cost and latency matter | Route simple work to thin loops; reserve deep context for high-uncertainty decisions. |
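
The "checkpoints, state files, stop criteria, and recovery paths" row can be illustrated with a small state file. This is a sketch under assumptions: the JSON layout, the `step_budget` field, and the function names are invented for the example and are not part of the skill.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Read the checkpoint if it exists, otherwise start fresh."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"done": [], "step_budget": 10}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file first, then rename, so a crash cannot leave a half-written file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state), encoding="utf-8")
    tmp.replace(path)


def run_with_checkpoints(tasks, path: Path, work) -> dict:
    """Run tasks in order, checkpointing after each one."""
    state = load_state(path)
    for task in tasks:
        if task in state["done"]:
            continue                     # recovery path: skip finished work
        if state["step_budget"] <= 0:
            break                        # stop criterion: budget exhausted
        work(task)                       # may raise; earlier progress is already saved
        state["done"].append(task)
        state["step_budget"] -= 1
        save_state(path, state)          # checkpoint
    return state
```

If `work` fails partway through, calling `run_with_checkpoints` again with the same path resumes from the first unfinished task instead of starting over.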

## Workflow

### 1. Define the Quality Ceiling

Name the standard that must not drop:

```text
Quality ceiling:
User-visible risk:
Security risk:
Verification evidence:
Human judgment required:
```

If quality cannot be measured or inspected, do not delegate broad autonomous work yet.

### 2. Compress the Contract

State the workflow in one sentence:

```text
Input -> transformation -> durable output -> verification evidence
```

If this sentence is vague, fix it before adding steps.
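
A concrete instance of the one-sentence contract might look like the following. It is only a sketch: the uppercase transformation is a placeholder, and `run_contract` and its return keys are hypothetical names chosen for the example.

```python
import hashlib
from pathlib import Path


def run_contract(text: str, out_path: Path) -> dict:
    """Input -> transformation -> durable output -> verification evidence."""
    transformed = text.strip().upper()                   # transformation (placeholder)
    out_path.write_text(transformed, encoding="utf-8")   # durable output
    written = out_path.read_text(encoding="utf-8")       # read back instead of assuming
    return {                                             # verification evidence
        "path": str(out_path),
        "sha256": hashlib.sha256(written.encode("utf-8")).hexdigest(),
        "matches": written == transformed,
    }
```

The returned dictionary is the evidence: a path someone can open, a hash they can recompute, and a read-back check that the file holds what the step intended to write.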

### 3. Write the Macro Action Spec

Before assigning a large unit of work, define:

```text
Objective:
Scope:
Non-goals:
Inputs:
Owned files or territory:
Expected output:
Verification evidence:
Security/risk review:
Write-back destination:
Stop condition:
```
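
A filled-in copy of the spec above can be checked mechanically before work is assigned. The sketch below assumes the spec is kept as plain `Field: value` lines exactly as in the template; `unfilled_fields` is a hypothetical helper, not something the skill ships.

```python
# Field names copied from the macro action spec template.
SPEC_FIELDS = [
    "Objective", "Scope", "Non-goals", "Inputs", "Owned files or territory",
    "Expected output", "Verification evidence", "Security/risk review",
    "Write-back destination", "Stop condition",
]


def unfilled_fields(spec_text: str) -> list[str]:
    """Return template fields that are missing or left blank, in template order."""
    filled = {}
    for line in spec_text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            filled[key.strip()] = value.strip()
    return [name for name in SPEC_FIELDS if not filled.get(name)]


draft = """Objective: Rewrite the skill as a control loop
Scope: One SKILL.md file
Non-goals:
Stop condition: Fresh-agent dry run passes
"""
print(unfilled_fields(draft))
```

An empty list means every field has at least some content; it does not say the content is good, which remains a human judgment point.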

Treat features, research, plans, and verification as macro actions

daily-okr (Skill)

Execute a daily knowledge compound closed loop — 7 Key Results from input to feedback with scoring. Use when the user wants to do a daily review, plan their day, or run a knowledge workflow.

session-learn (Skill)

Extract reusable knowledge from a work session and save concepts, entities, corrections, patterns, ideas, decisions, and gaps to the wiki. Use when ending a session or when the user says to extract knowledge.

token-cost-tracker (Slash Command)

Estimate and track token usage and cost across the knowledge pipeline. Run before expensive tasks to budget, after tasks to log actuals.

wiki-lint (Skill)

Health-check the knowledge wiki — find orphans, broken links, missing frontmatter, contradictions, stale content, and statistical drift. Use when the user says "lint the wiki", "health check", or periodically for maintenance.

agent-teams-command (Skill)

Command multi-agent work with bounded roles, ownership, integration gates, and verification loops. Use when the user needs Claude Code Agent Teams, parallel agents, delegation strategy, or multi-agent orchestration.

ai-six-sigma-property-os (Skill)

Design an AI Six Sigma Black Belt operating model for property service, maintenance dispatch, environmental testing, quote generation, CRM follow-up, and workflow quality dashboards. Use when the user needs a Property Agent OS, AI + Ontology + DMAIC management system, CTQ metrics, agent-team roles, work-order states, or MVP roadmap for operations quality.

anthropic-os (Skill)

Improve a personal or team operating system with self-evolving loops, CASH allocation, 3B creativity, predictive coding, and diagnostics. Use when the user wants to redesign a work method, learning loop, or cognitive operating system.

behavior-design (Skill)

Design a behavior change system — decompose a goal into minimum habits, define triggers, build SOPs, and set up review cycles. Use when the user wants to build a habit, change behavior, or achieve a personal goal.