Skill · 4.3k repo stars · updated yesterday

world-model-diagnostic

The world-model-diagnostic skill audits how a company routes information, identifies where human judgment still operates behind processes that appear automated, and recommends a prioritized build sequence. Use it when designing or troubleshooting a system's data architecture, particularly to expose gaps between how information actually flows and how it is documented to flow, and to clarify what to build first to establish a proper boundary between data and interpretation.

Repository: OB1

Install in Claude Code

git clone --depth 1 https://github.com/NateBJones-Projects/OB1 /tmp/world-model-diagnostic && cp -r /tmp/world-model-diagnostic/skills/world-model-diagnostic ~/.claude/skills/world-model-diagnostic

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# World Model Diagnostic

## Purpose

Your job is not to hand back a polished readiness score. Your job is to expose
where information routing ends and editorial judgment begins, then recommend the
smallest credible starting sequence.

This diagnostic exists to answer five questions:

1. Where does reality leave the clearest fingerprint in this business?
2. Which world-model paradigm fits the company right now?
3. Does the company have an explicit boundary layer?
4. Where is it most exposed to simulated judgment?
5. What should it build first, second, and third?

## Modes

Run in one of two modes:

- `OB1-connected mode`: if Open Brain search/capture tools are available, use
  them to look for prior context and persist the intake, boundary audit, and
  final assessment.
- `Direct-chat mode`: if those tools are missing, run the exact same diagnostic
  and clearly say that the output is not being persisted.

## Preferred Tools

Before starting, identify whether the current client exposes:

- a base Open Brain search tool, usually `search_thoughts`
- a base Open Brain capture tool, usually `capture_thought`

Do not assume exact prefixes. Use the tool names visible in the client.

If search is available:

- run 2-4 narrow queries such as `world model`, `boundary layer`, the company
  name, or obvious strategic context
- treat every result as a hint, not as a confirmed fact

If capture is available:

- tell the user you will persist three lean artifacts unless they ask you not to:
  - intake summary
  - boundary audit summary
  - final assessment
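
The mode and tool checks above can be sketched as a small lookup. This is a minimal illustration, not part of the skill: the suffix-matching rule, the prefixed tool names in the usage note, and the choice to treat either tool being present as `OB1-connected mode` are all assumptions.

```python
from typing import Iterable, Optional

# Base tool names as given in the skill text; clients may add their own prefixes.
SEARCH_BASE = "search_thoughts"
CAPTURE_BASE = "capture_thought"


def find_tool(visible_tools: Iterable[str], base_name: str) -> Optional[str]:
    """Return the first visible tool whose name ends with base_name, else None."""
    for name in visible_tools:
        if name.endswith(base_name):
            return name
    return None


def select_mode(visible_tools: Iterable[str]) -> dict:
    """Pick a mode from the tool names the client actually exposes."""
    tools = list(visible_tools)
    search = find_tool(tools, SEARCH_BASE)
    capture = find_tool(tools, CAPTURE_BASE)
    # Assumption: either tool being present counts as connected mode.
    connected = search is not None or capture is not None
    return {
        "mode": "OB1-connected mode" if connected else "Direct-chat mode",
        "search_tool": search,
        "capture_tool": capture,
        # Output is only persisted when a capture tool exists.
        "persisted": capture is not None,
    }
```

For example, `select_mode(["mcp__open_brain__search_thoughts"])` (a hypothetical prefixed name) would report connected mode with `persisted` set to `False`, which is the cue to tell the user that nothing is being saved.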

## Non-Negotiable Rules

1. Do not give a numeric readiness score.
2. Label every conclusion as one of:
   - `Firm finding`: directly supported by the user's answer or a confirmed prior record
   - `Inference`: your best synthesis from the available evidence
   - `Open question`: unresolved or missing evidence that materially affects the recommendation
3. Keep the boundary layer central. Database choice is downstream of boundary clarity.
4. Start concrete, not abstract. Ask about recent information flows, recent decisions, and recent misses.
5. Force ranking when discussing signal. Ask the user to rank the top 3-5 sources by fidelity.
6. Audit actual flows, not aspirational diagrams.
7. Do not let the model pretend judgment has been automated when the evidence shows interpretation still lives in people.
8. The final recommendation must include:
   - paradigm fit
   - boundary-layer status
   - top 3 simulated-judgment exposures
   - first, second, and third build steps
9. The diagnostic's own output must model the thesis. Facts and interpretations cannot be presented with the same voice.
10. Stay lightweight. Batch questions so the session can finish in about 20 minutes.
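
Rules 2 and 8 lend themselves to a small data structure that keeps facts and interpretations visibly separate. The sketch below is one possible shape, not something the skill prescribes: the class names, field names, and the `problems` check are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Label(Enum):
    """The three conclusion labels required by rule 2."""
    FIRM_FINDING = "Firm finding"
    INFERENCE = "Inference"
    OPEN_QUESTION = "Open question"


@dataclass
class Conclusion:
    label: Label
    text: str

    def render(self) -> str:
        # Every conclusion is printed with its label in front (rule 9).
        return f"{self.label.value}: {self.text}"


@dataclass
class FinalRecommendation:
    """Holds the four required parts of the final recommendation (rule 8)."""
    paradigm_fit: Conclusion
    boundary_layer_status: Conclusion
    simulated_judgment_exposures: List[Conclusion]
    build_steps: List[str]

    def problems(self) -> List[str]:
        """Return a list of missing-piece descriptions; empty means complete."""
        issues = []
        if len(self.simulated_judgment_exposures) != 3:
            issues.append("expected exactly 3 simulated-judgment exposures")
        if len(self.build_steps) != 3:
            issues.append("expected first, second, and third build steps")
        return issues
```

There is deliberately no numeric score field, in line with rule 1.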

## Paradigm Mapping Contract

Map the company using these rules:

- Under 100 people plus a strong senior team:
  - default to `vector database`
  - reason: senior people can temporarily act as the human boundary layer
- Enterprise, regulated, or operationally complex:
  - default to `structured ontology`
  - reason: the boundary has to be architectural because errors are expensive
- Platform business with genuinely high-fidelity signal such as transactions,
  telemetry, or operational exhaust:
  - default to `signal-fidelity`
  - reason: the business already emits machine-readable truth with a higher ceiling
- Knowledge-work company running mostly on conversations, docs, and soft context:
  - treat as the hardest and most common case
  - still map to `vector database`
  - pair it with aggressive boundary-layer work first and explicit outcome encoding from day one

When cues conflict, use this priority:

1. highest-fidelity signal
2. cost of a bad interpretive decision
3. amount of senior human judgment still available to absorb errors
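
One way to read the mapping rules and the conflict priority together is as an ordered series of checks. The sketch below is an interpretation under assumptions: `CompanyProfile` and its boolean fields are hypothetical simplifications, and a real diagnostic would weigh these cues with judgment rather than flags.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CompanyProfile:
    headcount: int
    strong_senior_team: bool
    regulated_or_complex: bool           # enterprise, regulated, or operationally complex
    high_fidelity_platform_signal: bool  # transactions, telemetry, operational exhaust


def map_paradigm(p: CompanyProfile) -> Tuple[str, str]:
    """Return (paradigm, reason), checking cues in the stated priority order."""
    # Priority 1: highest-fidelity signal wins when cues conflict.
    if p.high_fidelity_platform_signal:
        return ("signal-fidelity",
                "the business already emits machine-readable truth")
    # Priority 2: cost of a bad interpretive decision.
    if p.regulated_or_complex:
        return ("structured ontology",
                "the boundary has to be architectural because errors are expensive")
    # Priority 3: senior human judgment available to absorb errors.
    if p.headcount < 100 and p.strong_senior_team:
        return ("vector database",
                "senior people can temporarily act as the human boundary layer")
    # Remaining case: knowledge-work company on conversations, docs, soft context.
    return ("vector database",
            "pair with boundary-layer work first and outcome encoding from day one")
```

Under this reading, a regulated platform business with strong telemetry maps to `signal-fidelity`, because signal fidelity outranks cost of error in the priority list.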

## Five-Principle Evaluation

Evaluate the company against these principles without scoring them numerically:

- `signal fidelity`
  - Where does reality leave the clearest fingerprint?
  - Classify as `clear`, `mixed`, or `low`
- `earned structure`
  - Are they letting structure emerge from observed work, or forcing a schema too early?
  - Classify as `earned`, `partially earned`, or `imposed`
- `outcome encoding`
  - Do they close the loop between action and result in a machine-readable way?
  - Classify as `present`, `partial`, or `missing`
- `organizational resistance`
  - Does the system capture signal as a byproduct of work or require extra documentation?
  - Classify as `byproduct`, `mixed`, or `manual`
- `time in system`
  - How long has relevant data been flowing through anything durable?
  - Classify as `running`, `starting`, or `not started`
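
The five principles and their allowed classifications can be written down as a lookup table, which makes it easy to reject numeric scores or off-list values. This is a minimal sketch; the `validate_evaluation` helper and its message wording are hypothetical.

```python
# Allowed classifications per principle, copied from the list above.
ALLOWED = {
    "signal fidelity": ("clear", "mixed", "low"),
    "earned structure": ("earned", "partially earned", "imposed"),
    "outcome encoding": ("present", "partial", "missing"),
    "organizational resistance": ("byproduct", "mixed", "manual"),
    "time in system": ("running", "starting", "not started"),
}


def validate_evaluation(evaluation: dict) -> list:
    """Return a list of problems; an empty list means the evaluation is well-formed."""
    problems = []
    for principle, allowed in ALLOWED.items():
        value = evaluation.get(principle)
        if value is None:
            problems.append(f"missing: {principle}")
        elif value not in allowed:
            # Catches numeric scores as well as misspelled categories.
            problems.append(
                f"{principle}: {value!r} is not one of {', '.join(allowed)}"
            )
    for key in evaluation:
        if key not in ALLOWED:
            problems.append(f"unknown principle: {key}")
    return problems
```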

## Workflow

### Phase 1: Orientation

1. If search is available, pull only enough context to avoid asking already-answered basics.
2. Tell the user what the diagnostic will do:
   - intake on signal, data, and decision flow
   - paradigm classification
   - boundary audit on the highest-value information flows
   - final assessment with fact-vs-inference labels
3. If capture is available, state that the session will be persisted into Open Brain.

### Phase 2: Intake

Keep the intake to 2-3 batches, not a long list of isolated questions.

Required coverage:

- company size, industry, and business model
- whether the environment is regulated, safety-critical, or high-cost-of-error
- top 3-5 data sources ranked by fidelity
- where decisions currently get made
- where editorial judgment currently lives
- which management or synthesis layers have already been removed or thinned out
- how outcomes are recorded today
- whether data capture is a byproduct of work or a separate burden
- how long any durable system has been running
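
To keep the intake within 2-3 batches while still covering every required topic, the coverage list can be split mechanically and checked afterwards. The sketch below assumes contiguous, near-equal batches, which is only one possible grouping; the topic strings are shortened paraphrases of the list above, and both helper names are hypothetical.

```python
REQUIRED_COVERAGE = [
    "company size, industry, and business model",
    "regulated, safety-critical, or high-cost-of-error environment",
    "top 3-5 data sources ranked by fidelity",
    "where decisions currently get made",
    "where editorial judgment currently lives",
    "management or synthesis layers removed or thinned out",
    "how outcomes are recorded today",
    "data capture as byproduct or separate burden",
    "how long any durable system has been running",
]


def batch_topics(topics: list, batches: int = 3) -> list:
    """Split topics into contiguous, near-equal batches (2 or 3 only)."""
    if batches not in (2, 3):
        raise ValueError("the intake should use 2-3 batches")
    size, extra = divmod(len(topics), batches)
    result, start = [], 0
    for i in range(batches):
        # Earlier batches absorb the remainder, one topic each.
        end = start + size + (1 if i < extra else 0)
        result.append(topics[start:end])
        start = end
    return result


def uncovered(answers: dict) -> list:
    """Return required topics that still have no non-empty answer."""
    return [t for t in REQUIRED_COVERAGE if not str(answers.get(t, "")).strip()]
```

Anything returned by `uncovered` at the end of intake is a candidate `Open question` rather than something to guess at.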

Strong prompt patterns:

- "What are the three places reality leaves the cleanest fingerprint in this business?"
- "Which decisions still depend on someone saying, 'ignore that, that's normal'?"
- "Where did you remove a human layer and keep the information flow, but lose the int