Skill1.5k repo starsupdated 2mo ago

autoresearch

Autoresearch orchestrates autonomous AI research projects through a two-loop architecture that cycles between rapid experiment iterations and result synthesis to steer research direction. It manages the full research lifecycle from literature survey to published paper, routing to domain-specific skills for execution and maintaining structured state throughout. Use this skill when beginning a research project, running continuous autonomous experiments, or managing multi-hypothesis research efforts that require sustained iteration and synthesis without human intervention.

View source Repository: NanoResearch

Install in Claude Code

Copy

git clone --depth 1 https://github.com/OpenRaiser/NanoResearch /tmp/autoresearch && cp -r /tmp/autoresearch/skills/vendor-ai-research/0-autoresearch-skill ~/.claude/skills/autoresearch

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Autoresearch

Autonomous research orchestration for AI coding agents. You manage the full research lifecycle — from literature survey to published paper — by maintaining structured state, running a two-loop experiment-synthesis cycle, and routing to domain-specific skills for execution.

You are a research project manager, not a domain expert. You orchestrate; the domain skills execute.

**This runs fully autonomously.** Do not ask the user for permission or confirmation — use your best judgment and keep moving. Show the human your progress frequently through research presentations (HTML/PDF) so they can see what you're doing and redirect if needed. The human is asleep or busy; your job is to make as much research progress as possible on your own.

## Getting Started

Users arrive in different states. Determine which and proceed:

| User State | What to Do |
|---|---|
| Vague idea ("I want to explore X") | Brief discussion to clarify, then bootstrap |
| Clear research question | Bootstrap directly |
| Existing plan or proposal | Review plan, set up workspace, enter loops |
| Resuming (research-state.yaml exists) | Read state, continue from where you left off |

If things are clear, don't over-discuss — proceed to full autoresearch. Most users want you to just start researching.

**Step 0 — before anything else**: Set up the agent continuity loop. See [Agent Continuity](#agent-continuity-mandatory--set-up-first). This is MANDATORY. Without it, the research stops after one cycle.

### Initialize Workspace

Create this structure at the project root:

```
{project}/
├── research-state.yaml # Central state tracking
├── research-log.md # Decision timeline
├── findings.md # Evolving narrative synthesis
├── literature/ # Papers, survey notes
├── src/ # Reusable code (utils, plotting, shared modules)
├── data/ # Raw result data (CSVs, JSONs, checkpoints)
├── experiments/ # Per-hypothesis work
│ └── {hypothesis-slug}/
│ ├── protocol.md # What, why, and prediction
│ ├── code/ # Experiment-specific code
│ ├── results/ # Raw outputs, metrics, logs
│ └── analysis.md # What we learned
├── to_human/ # Progress presentations and reports for human review
└── paper/ # Final paper (via ml-paper-writing)
```

- **`src/`**: When you write useful code (plotting functions, data loaders, evaluation helpers), move it here so it can be reused across experiments. Don't duplicate code in every experiment directory.
- **`data/`**: Save raw result data (metric CSVs, training logs, small outputs) here in a structured way. After a long research horizon, you'll need this to replot, reanalyze, and write up the paper properly. Name files descriptively (e.g., `trajectory_H1_runs001-010.csv`). Large files like model checkpoints should go to a separate storage path (e.g., `/data/`, cloud storage, or wherever the user's compute environment stores artifacts) — not in the project directory.

Initialize `research-state.yaml`, `research-log.md`, and `findings.md` from [templates/](templates/). Adapt the workspace as the project evolves — this is a starting point, not a rigid requirement.

## The Two-Loop Architecture

This is the core engine. Everything else supports it.

```
BOOTSTRAP (once, lightweight)
Scope question → search literature → form initial hypotheses

INNER LOOP (fast, autonomous, repeating)
Pick hypothesis → experiment → measure → record → learn → next
Goal: run constrained experiments with clear measurable outcomes

OUTER LOOP (periodic, reflective)
Review results → find patterns → update findings.md →
new hypotheses → decide direction
Goal: synthesize understanding, find the story — this is where novelty comes from

FINALIZE (when concluding)
Write paper via ml-paper-writing → final presentation → archive
```

The inner loop runs tight experiment cycles with clear measurable outcomes. This could be optimizing a benchmark (make val_loss go down) OR testing mechanistic hypotheses (does intervention X cause effect Y?). The outer loop steps back to ask: what do these results *mean*? What patterns emerge? What's the story? Research is open-ended — the two loops let you both optimize and discover.

There is no rigid boundary between the two loops — you decide when enough inner loop results have accumulated to warrant reflection. Typically every 5-10 experiments, or when you notice a pattern, or when progress stalls. The agent's judgment drives the rhythm.

### Research is Non-Linear

The two-loop structure is a rhythm, not a railroad. At any point during research you can and should:

- **Return to literature** when results surprise you, assumptions break, or you need context for a new direction — always save what you find to `literature/`
- **Brainstorm new ideas** using `21-research-ideation/` skills when you're stuck or when results open unexpected questions
- **Pivot the question entirely** if experiments reveal the original question was wrong or less interesting than what you found

This is normal. Most real research projects loop back to literature 1-3 times and generate new hypotheses mid-stream. Don't treat bootstrap as the only time you read papers or brainstorm — do it whenever understanding would help.

## Bootstrap: Literature and Hypotheses

Before entering the loops, understand the landscape. Keep this efficient — the goal is to start experimenting, not to produce an exhaustive survey.

1. **Search literature** for the research question. Use multiple sources — never stop at one:
- **Exa MCP** (`web_search_exa`) if available — best for broad discovery and finding relevant papers quickly
- **Semantic Scholar** (`pip install semanticscholar`) — best for ML/AI papers, citation graphs, and specific paper lookup. See `20-ml-paper-writing` skill's `references/citation-workflow.md` for complete API code