Skill6.4k repo starsupdated today

deep-research

Deep-research structures multi-round investigations into a topic by explicitly planning sub-questions, iteratively gathering evidence across sources, and compiling a synthesized report with per-claim citations. Use it when responding to requests for "research reports," "literature reviews," or "deep dives" where a single-pass summary would lose critical nuance or when cross-source verification is essential. The skill maintains a persistent state file across iterations, tracks coverage progress, and coordinates web fetches through the host agent without performing them directly.

View source Repository: opensquilla

Install in Claude Code

Copy

git clone --depth 1 https://github.com/opensquilla/opensquilla /tmp/deep-research && cp -r /tmp/deep-research/src/opensquilla/skills/bundled/deep-research ~/.claude/skills/deep-research

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# deep-research

Investigate a question by walking it through three explicit stages with a
persisted state file. Use this when a single-pass `summarize` would lose too
much, or when the user asks for a "research report" / "literature review".
The host agent does the web fetching; this skill structures the work and
keeps a paper trail.

## Decide if this is the right tool

| Need | Use |
|---|---|
| One-line summary of an article | `summarize` |
| Multi-round investigation with citations | this skill |
| Quick lookup, single source | direct web search |
| Continuous monitoring of a topic | a digest/cron skill |

## Stages

```
Scope → Plan → Iterate (×N) → Compile → Deliver
```

State persists in a single JSON file you pass between stages. The file is
the contract; if you can describe the file, you can resume the research at
any point.

---

## Stage 1: Plan

```bash
python {baseDir}/scripts/plan.py \
    --question "How did Manus differentiate from competing AI agents in 2025?" \
    --depth thorough \
    --out plan.json
```

`--depth` choices:

- `overview` — 3-5 sub-questions, target 1 source per sub-question
- `thorough` — 6-10 sub-questions, target 2-3 sources per sub-question
- `exhaustive` — 12-20 sub-questions, target 5+ sources per sub-question

The plan is a `pydantic` model serialized to JSON; see
[references/methodology.md](references/methodology.md) for the schema and
the system-review approach the depth choices implement.

---

## Stage 2: Iterate

Each round: read the plan, decide which sub-questions need attention,
print the fetch list for the host agent to execute, and (after the agent
returns results) record evidence back into the plan.

```bash
# Show the host what to fetch this round
python {baseDir}/scripts/iterate.py --plan plan.json --round 1 --print-fetches

# After the host fetches, record results back
python {baseDir}/scripts/iterate.py --plan plan.json --round 1 \
    --record evidence_round_1.json
```

`evidence_round_1.json`:

```json
[
  {
    "subquestion_id": "sq-002",
    "url": "https://...",
    "title": "...",
    "excerpt": "...",
    "relevance": 0.85,
    "fetched_at": "2026-05-06T10:14:00Z"
  }
]
```

The script updates per-sub-question coverage estimates. When all
sub-questions reach the depth-target coverage, the plan's `done` flag flips
to `true` and the iteration loop terminates.

See [references/sources.md](references/sources.md) for the 5-axis source
evaluation (Authority, Recency, Evidence, Bias, Corroboration) you should
apply when judging relevance.

---

## Stage 3: Compile

```bash
python {baseDir}/scripts/compile.py --plan plan.json --out report.md
```

Output is markdown with:

1. Executive summary (5-8 lines)
2. Methodology block (depth, rounds, source count)
3. Per-sub-question section with embedded citations `[^N]`
4. References block listing every source with URL + fetched_at + relevance
5. "What this report does not cover" — explicit gaps from low-coverage
   sub-questions

Citations link to the references block. The compile step never invents
sources — every `[^N]` in the body must correspond to an entry recorded in
stage 2.

---

## Boundaries

- This skill does not fetch the web itself. It is a methodology + state
  manager. Pair it with the host agent's web search/fetch tools.
- It does not resolve contradictions among sources automatically. The
  compile step will note conflicting evidence in the report; the user
  decides which side wins.
- It is not a fact-checker. Source quality scoring is heuristic; treat the
  output as a starting point, not a verdict.
- For ongoing monitoring (daily digests, RSS-style updates) build a cron
  skill that calls this one with a fresh question each cycle.

---

## Differentiation from `summarize`

`summarize` takes one document and produces a shorter version. This skill
takes one question and produces a researched report drawing on many
documents, with explicit evidence tracking. They share no trigger words by
design — `summarize` triggers on "summarize", "shorten", "tl;dr"; this
skill triggers on "research", "investigate", "literature review", "deep
dive".