research
The research skill reads a materialized research source from a specified experiment directory and extracts actionable implementation details into structured JSON format. Use this skill in Phase 2, after analyzing the current implementation, to synthesize findings from papers, datasets, notebooks, or user ideas into a concrete method plan with identified pros, cons, and implementation requirements.
git clone --depth 1 https://github.com/Upsonic/Upsonic /tmp/research && cp -r /tmp/research/src/upsonic/prebuilt/applied_scientist/template/skills/research ~/.claude/skills/researchSKILL.md
# Research Skill
## Purpose
Read the materialized research source and extract actionable information needed to implement the proposed method. Record the findings as a structured JSON entry.
## When to Use
Phase 2 — after the current implementation has been analyzed.
## Input
| Parameter | Type | Description |
|-----------|------|-------------|
| experiment_path | path | `experiments/{research_name}/` |
The research source was materialized into `{experiment_path}` during Phase 0. Its local path is recorded in `{experiment_path}/log.json` under `metadata.research_source`, and a short descriptive label Phase 0 chose is under `metadata.research_source_kind`. The label is free-form (common values: `pdf`, `file`, `git`, `kaggle_notebook`, `kaggle_dataset`, `arxiv`, `huggingface_model`, `html`, `idea`, `other`), but treat it as a hint only — always follow the actual path in `metadata.research_source`.
Inspect that path and read whatever is there:
- A single file (PDF, Markdown, HTML, `.ipynb`, text, …) → read it directly.
- A directory → read the obvious entry points first (`README*`, `*.ipynb`, top-level notebooks or code, `docs/`, dataset descriptions), then skim the rest as needed.
- A text **idea** (`research_source_kind == "idea"`, typically a short `research_source.md`) → read the user's description carefully and turn it into a concrete method plan. Pick a specific algorithm / library that matches the description, define the hyperparameters you will use, and document your interpretation explicitly in the Phase 2 log entry. If the idea is ambiguous, commit to a reasonable default and note the trade-off — do not invent a citation or claim the idea came from a paper.
Do not try to re-fetch the source. If the content is insufficient, note what is missing in the Phase 2 log entry and proceed with the best analysis you can.
## Actions
1. **Read the materialized research source** at `metadata.research_source` (falling back to `research.pdf` for legacy experiments) and extract:
- **Method Summary:** 2-3 short paragraphs describing what the paper proposes, what problem it solves, and how it differs from traditional approaches.
- **Pros:** each advantage the paper claims or demonstrates.
- **Cons:** stated or inferred limitations, assumptions, or weaknesses.
- **Implementation Requirements:**
- Required libraries/packages (with versions if specified)
- Required data format or preprocessing
- Required compute resources (GPU, memory, etc.)
- Key hyperparameters to set
- **Compatibility Analysis:**
- Can the method use the same data as the current baseline?
- Does it need different preprocessing?
- Does it output comparable predictions (same format)?
- Can the same metrics be used for comparison?
2. **Append a Phase 2 entry to `{experiment_path}/log.json`** under `phases`:
```json
{
"name": "Phase 2: Research",
"completed_at": "2026-04-17T10:30:00Z",
"paper": {
"title": "CatBoost: Unbiased Boosting with Categorical Features",
"authors": ["Prokhorenkova et al."],
"method_summary": "CatBoost is a gradient-boosting framework that handles categorical features natively via ordered target statistics and uses oblivious decision trees to reduce overfitting."
},
"pros": [
"Native categorical handling — no manual encoding needed",
"Reduces target leakage with ordered boosting",
"Strong out-of-the-box performance"
],
"cons": [
"Training slower than XGBoost for small data",
"More memory intensive"
],
"requirements": {
"new_dependencies": ["catboost>=1.2"],
"data_format": "pandas.DataFrame with categorical columns marked",
"compute": "CPU is sufficient; GPU optional"
},
"compatibility": {
"same_data": true,
"same_metrics": true,
"preprocessing_notes": "CatBoost takes raw categorical columns; do NOT pre-encode them for the new notebook."
}
}
```
Do not overwrite earlier entries; append to the `phases` array.
## Output
- `{experiment_path}/log.json` — updated with Phase 2 research entry
- No other files created or modifiedUse this agent when you need to create unit tests for your code in unittest.TestCase format, organized in a tests folder with concept-based subfolders. Examples: <example>Context: User has just written a new authentication module and needs comprehensive unit tests. user: 'I just finished writing my user authentication functions in auth.py. Can you help me create unit tests for them?' assistant: 'I'll use the unittest-generator agent to create comprehensive unit tests for your authentication module.' <commentary>Since the user needs unit tests created for their authentication code, use the unittest-generator agent to create properly structured tests in the tests folder with appropriate subfolder organization.</commentary></example> <example>Context: User has implemented new data validation functions and wants to ensure they're properly tested. user: 'I've added several validation functions to my utils.py file. I need unit tests to make sure they handle edge cases correctly.' assistant: 'Let me use the unittest-generator agent to create thorough unit tests for your validation functions.' <commentary>The user needs unit tests for their validation functions, so use the unittest-generator agent to create comprehensive tests with edge case coverage.</commentary></example>
Perform structured code reviews with actionable feedback. Use when a user asks to review code, check code quality, find bugs, audit security, improve performance, or assess maintainability. Trigger when user says things like "review this code", "check for bugs", "is this code secure", "any issues with this", "code quality check", or pastes code asking for feedback. Also trigger for pull request reviews and pre-merge code checks. Do NOT trigger for writing new code from scratch, refactoring requests without review context, or general programming questions.