Skill · 116 repo stars · updated 5d ago

agentic-engineering

Design or refactor agent skills, workflows, and operating loops for model-native Agentic Engineering. Use when making skills more autonomous, concise, verifiable, long-horizon capable, token-efficient, and lower-friction for human-LLM collaboration.

Repository: third-brain-v5-skills

Install in Claude Code

git clone --depth 1 https://github.com/Mark393295827/third-brain-v5-skills /tmp/agentic-engineering && mkdir -p ~/.claude/skills && cp -r /tmp/agentic-engineering/skills/agentic-engineering ~/.claude/skills/agentic-engineering

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Agentic Engineering

Refactor a skill or workflow so the model can execute more work with less human steering, while preserving verification, provenance, security, and the professional quality ceiling.

Agentic Engineering is the step after vibe coding: speed is useful only if quality does not degrade. Humans keep ownership of taste, judgment, architecture, and risk boundaries; agents execute bounded macro actions with evidence.

## Agent Understanding Model

Use the Karpathy-style LLM OS mapping as the design baseline:

| OS concept | Agent workflow meaning |
|---|---|
| LLM = CPU | The model runs the next reasoning/action loop. |
| Context = RAM | Load only the state needed for the next step. |
| Wiki/logs = Disk | Durable knowledge lives outside chat memory. |
| Tools = System calls | Actions must have contracts, permissions, and evidence. |
| Loop = Scheduler | Plan -> Act -> Observe -> Iterate controls work. |

The agent is a process with state, tools, permissions, and write-back duties. Do not design skills as advice pages; design them as executable control loops.
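
As a rough illustration of "executable control loop", the mapping above can be sketched in Python. This is a minimal sketch, not the skill's implementation: `AgentState`, `run_loop`, the `echo` tool, and the step budget are all hypothetical names introduced here, and the in-memory `log` list only stands in for durable write-back.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    """Hypothetical process state: goal, progress, and a write-back log."""
    goal: str
    step: int = 0
    done: bool = False
    log: list[str] = field(default_factory=list)  # stand-in for wiki/logs on "disk"


def run_loop(
    state: AgentState,
    plan: Callable[[AgentState], tuple[str, str]],
    tools: dict[str, Callable[[str], str]],
    is_done: Callable[[AgentState, str], bool],
    max_steps: int = 5,
) -> AgentState:
    """Plan -> Act -> Observe -> Iterate, bounded by a step budget."""
    while not state.done and state.step < max_steps:
        tool_name, arg = plan(state)                  # Plan
        if tool_name not in tools:                    # permission check before acting
            state.log.append(f"denied: {tool_name}")
            break
        observation = tools[tool_name](arg)           # Act
        state.log.append(f"{tool_name}({arg!r}) -> {observation!r}")  # Observe, write back
        state.done = is_done(state, observation)      # Iterate or stop
        state.step += 1
    return state


# Usage with a toy tool; the planner and stop check are hard-coded for the demo.
final = run_loop(
    AgentState(goal="shout hello"),
    plan=lambda s: ("echo", "hello"),
    tools={"echo": lambda text: text.upper()},
    is_done=lambda s, obs: obs == "HELLO",
)
# final.done is True and final.log == ["echo('hello') -> 'HELLO'"]
```

The point of the sketch is the shape: every action passes a permission check, leaves evidence in the log, and the loop stops on an explicit condition or budget rather than on the model's own sense of completion.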

## Full-Stack Agent Pattern

Use the Google I/O '26 wiki update as the new maturity signal: useful agents are moving from chat boxes into full-stack product surfaces.

| Surface | Agentic engineering requirement |
|---|---|
| Developer IDE or CLI | subagents, hooks, async queues, tests, diffs, task state |
| Personal agent | user intent, memory boundary, tool allowlist, resumable tasks |
| Agentic search | source grounding, comparison criteria, reversible action preview |
| Agentic commerce | explicit mandate, budget, payment boundary, audit trail |
| Generative media | provenance, edit history, watermark or disclosure path |
| Ambient device | sensor boundary, privacy mode, interrupt and fallback controls |

Do not treat these as separate prompt styles. They are one full-stack agent design problem: model -> context -> tool calls -> product surface -> verification -> governance.

## Workflow Complexity Gate

Before adding orchestration, classify the work at the lowest sufficient level:

| Level | Use when | Gate before upgrading |
|---|---|---|
| Prompt | One-off answer, small edit, short analysis | Is the work repeating? |
| Skill | Reusable workflow or domain method | Does it need isolated execution? |
| Subagent | Independent side task or context isolation | Does it need communication or shared state? |
| Agent team | Multiple roles must coordinate or review each other | Can file ownership, IPC, and join gates be defined? |
| Long-running goal | Depth problem: iterate until objective criteria pass | Is `done` externally verifiable and budgeted? |
| Dynamic workflow | Width problem: many independent shards can run in parallel | Are script review, cost envelope, and runtime observability in place? |

Default to the lower level when value, independence, or verification is unclear. Higher orchestration increases token cost, permission surface, and audit burden.
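
One way to read the first four rows of the table is as a ladder: climb one level per gate answered "yes", and stop at the first "no" or unanswered gate. The sketch below assumes that reading. `Level`, `UPGRADE_GATES`, and `lowest_sufficient_level` are hypothetical names, and the long-running goal and dynamic workflow rows are left out because they describe depth and width rather than a further rung.

```python
from enum import IntEnum


class Level(IntEnum):
    PROMPT = 0
    SKILL = 1
    SUBAGENT = 2
    AGENT_TEAM = 3


# Gate questions copied from the table; each must be answered True to move past that level.
UPGRADE_GATES = {
    Level.PROMPT: "Is the work repeating?",
    Level.SKILL: "Does it need isolated execution?",
    Level.SUBAGENT: "Does it need communication or shared state?",
}


def lowest_sufficient_level(answers: dict[str, bool]) -> Level:
    """Climb one level per passed gate; a False or missing answer keeps the lower level."""
    level = Level.PROMPT
    for current in (Level.PROMPT, Level.SKILL, Level.SUBAGENT):
        if not answers.get(UPGRADE_GATES[current], False):
            break
        level = Level(current + 1)
    return level


# Usage: repeating work that does not need isolation stays a skill.
choice = lowest_sufficient_level({
    "Is the work repeating?": True,
    "Does it need isolated execution?": False,
})
# choice == Level.SKILL
```

Treating a missing answer as "no" encodes the default stated above: when value, independence, or verification is unclear, stay at the lower level.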

## Usage Template

**Prompt**
```text
Use agentic-engineering to revise this skill/workflow. Make it model-native, concise, autonomous, verifiable, and long-horizon capable.
```

**Use Case**
- Improving an existing skill, SOP, command, agent workflow, or multi-agent plan.

**Expected Result**
- A compact agentic workflow with clear defaults, state checkpoints, verification gates, and escalation rules.

**Output Example**
- A revised `SKILL.md` with model assumptions, autonomous execution loop, quality gates, and anti-patterns.

**Verification Case**
- A fresh agent can run the workflow without repeated clarification and can prove completion with evidence.

**Verified Effect**
- Human coordination load drops; the LLM spends fewer tokens asking for obvious decisions and more tokens executing, checking, and learning.

## Success Metrics

- Revised workflow has one trigger, one bounded macro action, one state checkpoint, one verification gate, and one write-back target.
- The skill or workflow can be executed without extra clarification for its primary use case.
- Residual human judgment points are explicit rather than hidden in vague prose.
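
The first metric can be checked mechanically if the revised workflow is summarized as a small mapping. The sketch below assumes such a summary exists; the key names and `metric_violations` are hypothetical and not part of the skill.

```python
# Hypothetical keys, one per element named in the first success metric.
REQUIRED_SINGLETONS = (
    "trigger",
    "macro_action",
    "state_checkpoint",
    "verification_gate",
    "write_back_target",
)


def metric_violations(workflow: dict[str, list[str]]) -> list[str]:
    """Report every element that does not appear exactly once."""
    problems = []
    for key in REQUIRED_SINGLETONS:
        count = len(workflow.get(key, []))
        if count != 1:
            problems.append(f"{key}: expected 1, found {count}")
    return problems


# Usage: two triggers and no checkpoint both count as violations.
report = metric_violations({
    "trigger": ["new PR opened", "nightly cron"],
    "macro_action": ["review diff"],
    "verification_gate": ["tests pass"],
    "write_back_target": ["docs/reviews.md"],
})
# report == ["trigger: expected 1, found 2", "state_checkpoint: expected 1, found 0"]
```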

## Model Meta-Properties

Design around the model as it is, not around a human-assistant metaphor.

| Meta-property | Skill design response |
|---|---|
| Context is scarce working memory | Keep `SKILL.md` short; move examples/details to references; load only what is needed. |
| Output is probabilistic | Require tests, citations, diffs, screenshots, or link checks before claims. |
| Tool use is the action layer | Name allowed tools, denied actions, and idempotent retries. |
| Long-horizon drift is normal | Add checkpoints, state files, stop criteria, and recovery paths. |
| Durable knowledge is external | Persist reusable results into wiki, docs, logs, or state files. |
| The model is strong at synthesis | Do not over-explain generic concepts; specify local constraints and unusual rules. |
| The model can over-ask | Provide safe defaults and ask only for irreversible, high-risk, or genuinely ambiguous decisions. |
| Cost and latency matter | Route simple work to thin loops; reserve deep context for high-uncertainty decisions. |
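
The "tool use is the action layer" row can be made concrete with a small wrapper. This is a sketch under assumptions: the tool names, the `completed` ledger, and the choice to treat `RuntimeError` as a transient failure are all hypothetical, chosen only to show an allowlist, a deny list, and an idempotent retry in one place.

```python
from typing import Callable

# Hypothetical tool names; a real skill would list its own.
ALLOWED_TOOLS = {"read_file", "run_tests"}
DENIED_ACTIONS = {"force_push", "delete_branch"}


def call_tool(
    name: str,
    action: Callable[[], str],
    completed: dict[str, str],
    key: str,
    max_attempts: int = 3,
) -> str:
    """Run an allowed tool at most once per idempotency key, retrying on failure."""
    if name in DENIED_ACTIONS or name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not permitted: {name}")
    if key in completed:               # already done: reuse the recorded result
        return completed[key]
    last_error = None
    for _ in range(max_attempts):
        try:
            completed[key] = action()  # record the result under its key
            return completed[key]
        except RuntimeError as exc:    # assumption: RuntimeError means "transient"
            last_error = exc
    raise RuntimeError(f"{name} failed after {max_attempts} attempts") from last_error
```

Because the result is stored under `key`, repeating the same call after a crash or a drifted loop returns the recorded result instead of running the action again. Persisting `completed` to a state file would extend the same idea to the checkpoint row.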

## Workflow

### 1. Define the Quality Ceiling

Name the standard that must not drop:

```text
Quality ceiling:
User-visible risk:
Security risk:
Verification evidence:
Human judgment required:
```

If quality cannot be measured or inspected, do not delegate broad autonomous work yet.

### 2. Compress the Contract

State the workflow in one sentence:

```text
Input -> transformation -> durable output -> verification evidence
```

If this sentence is vague, fix it before adding steps.

### 3. Write the Macro Action Spec

Before assigning a large unit of work, define:

```text
Objective:
Scope:
Non-goals:
Inputs:
Owned files or territory:
Expected output:
Verification evidence:
Security/risk review:
Write-back destination:
Stop condition:
```
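
If the spec is kept as structured data rather than free text, blank fields can block delegation automatically. The sketch below mirrors the template fields one to one; `MacroActionSpec`, `missing`, and `ready` are hypothetical names, and plain strings are assumed to be enough for each field.

```python
from dataclasses import dataclass, fields


@dataclass
class MacroActionSpec:
    """One field per line of the template above; empty string means 'not yet defined'."""
    objective: str = ""
    scope: str = ""
    non_goals: str = ""
    inputs: str = ""
    owned_files: str = ""
    expected_output: str = ""
    verification_evidence: str = ""
    risk_review: str = ""
    write_back_destination: str = ""
    stop_condition: str = ""

    def missing(self) -> list[str]:
        """Names of fields that are empty or whitespace only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def ready(self) -> bool:
        """Assign the work only when every field is filled in."""
        return not self.missing()


# Usage: a spec with only an objective is not ready to assign.
draft = MacroActionSpec(objective="Migrate config loader to TOML")
# draft.ready() is False; draft.missing() starts with "scope"
```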

Treat features, research, plans, and verification as macro actions