Skill4.9k estrellas del repoactualizado 12d ago

ml-paper-writing

This Claude Code skill guides researchers in writing publication-ready machine learning and AI papers for top-tier venues including NeurIPS, ICML, ICLR, ACL, AAAI, and COLM. It provides LaTeX templates, citation verification workflows, claim validation gates, and conference-specific checklists while drawing on best practices from leading researchers. Use this skill when drafting papers from research repositories, conducting literature reviews, verifying citations against canonical sources like Semantic Scholar and CrossRef, preparing camera-ready submissions, or refining manuscripts through collaborative feedback cycles.

Ver fuente Repositorio: claude-scholar

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/Galaxy-Dawn/claude-scholar /tmp/ml-paper-writing && cp -r /tmp/ml-paper-writing/skills/ml-paper-writing ~/.claude/skills/ml-paper-writing

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# ML Paper Writing for Top AI Conferences

Expert-level guidance for writing publication-ready papers targeting **NeurIPS, ICML, ICLR, ACL, AAAI, and COLM**. This skill combines writing philosophy from top researchers (Nanda, Farquhar, Karpathy, Lipton, Steinhardt) with practical tools: LaTeX templates, citation verification APIs, and conference checklists.

## Default operating order

Use this skill in the following order unless the task is unusually narrow:
1. lock the operating mode from `references/OPERATING-MODES.md`,
2. understand the repo or draft context,
3. use `references/citation-workflow.md` as the **canonical citation authority**,
4. load venue- or template-specific references only after the main writing path is clear.

Google Scholar may still help with manual discovery, but it is **not** the canonical verification authority in this skill. Default verification should use programmatic sources such as Semantic Scholar, CrossRef, and arXiv.

## Claim ledger gate

Before a project plan, experiment note, or literature summary becomes manuscript prose:
- identify the Claim Candidate or Evidence Record that supports the sentence,
- preserve allowed wording and forbidden stronger wording,
- keep project plans as hypotheses unless experiment artifacts or verified papers support them,
- do not turn related-work motivation into evidence for the paper's own result,
- mark unsupported claims as `[CLAIM NEEDS EVIDENCE]` instead of polishing them.

If the repo context is clear enough for a first draft, still apply this gate before stating contributions, results, related-work contrasts, or rebuttal-facing claims.

## Core Philosophy: Collaborative Writing

**Paper writing is collaborative, but Claude should be proactive in delivering drafts.**

The typical workflow starts with a research repository containing code, results, and experimental artifacts. Claude's role is to:

1. **Understand the project** by exploring the repo, results, and existing documentation
2. **Deliver a complete first draft** when confident about the contribution
3. **Search literature** using web search and APIs to find relevant citations
4. **Refine through feedback cycles** when the scientist provides input
5. **Ask for clarification** only when genuinely uncertain about key decisions

**Key Principle**: Be proactive. If the repo and results are clear, deliver a full draft. Don't block waiting for feedback on every section—scientists are busy. Produce something concrete they can react to, then iterate based on their response.

---

## ⚠️ CRITICAL: Never Hallucinate Citations

**This is the most important rule in academic writing with AI assistance.**

### The Problem
AI-generated citations have a **~40% error rate**. Hallucinated references—papers that don't exist, wrong authors, incorrect years, fabricated DOIs—are a serious form of academic misconduct that can result in desk rejection or retraction.

### The Rule
**NEVER generate BibTeX entries from memory. ALWAYS fetch programmatically.**

| Action | ✅ Correct | ❌ Wrong |
|--------|-----------|----------|
| Adding a citation | Search API → verify → fetch BibTeX | Write BibTeX from memory |
| Uncertain about a paper | Mark as `[CITATION NEEDED]` | Guess the reference |
| Can't find exact paper | Note: "placeholder - verify" | Invent similar-sounding paper |

### When You Can't Verify a Citation

If you cannot programmatically verify a citation, you MUST:

```latex
% EXPLICIT PLACEHOLDER - requires human verification
\cite{PLACEHOLDER_author2024_verify_this} % TODO: Verify this citation exists
```

**Always tell the scientist**: "I've marked [X] citations as placeholders that need verification. I could not confirm these papers exist."

### Recommended: Install Exa MCP for Paper Search

For the best paper search experience, install **Exa MCP** which provides real-time academic search:

**Claude Code:**
```bash
claude mcp add exa -- npx -y mcp-remote "https://mcp.exa.ai/mcp"
```

**Cursor / VS Code** (add to MCP settings):
```json
{
"mcpServers": {
"exa": {
"type": "http",
"url": "https://mcp.exa.ai/mcp"
}
}
}
```

Exa MCP enables searches like:
- "Find papers on RLHF for language models published after 2023"
- "Search for transformer architecture papers by Vaswani"
- "Get recent work on sparse autoencoders for interpretability"

Then verify results with Semantic Scholar API and fetch BibTeX via DOI.

---

## Workflow 0: Starting from a Research Repository

When beginning paper writing, start by understanding the project:

```
Project Understanding:
- [ ] Step 1: Explore the repository structure
- [ ] Step 2: Read README, existing docs, and key results
- [ ] Step 3: Identify the main contribution with the scientist
- [ ] Step 4: Find papers already cited in the codebase
- [ ] Step 5: Search for additional relevant literature
- [ ] Step 6: Outline the paper structure together
- [ ] Step 7: Draft sections iteratively with feedback
```

**Step 1: Explore the Repository**

```bash
# Understand project structure
ls -la
find . -name "*.py" | head -20
find . -name "*.md" -o -name "*.txt" | xargs grep -l -i "result\|conclusion\|finding"
```

Look for:
- `README.md` - Project overview and claims
- `results/`, `outputs/`, `experiments/` - Key findings
- `configs/` - Experimental settings
- Existing `.bib` files or citation references
- Any draft documents or notes

**Step 2: Identify Existing Citations**

Check for papers already referenced in the codebase:

```bash
# Find existing citations
grep -r "arxiv\|doi\|cite" --include="*.md" --include="*.bib" --include="*.py"
find . -name "*.bib"
```

These are high-signal starting points for Related Work—the scientist has already deemed them relevant.

**Step 3: Clarify the Contribution**

Before writing, explicitly confirm with the scientist:

> "Based on my understanding of the repo, the main contribution appears to be [X].
> The key results show [Y]. Is this the framing you want for the paper,
> or shou

Del mismo repositorio

code-reviewerSubagent

Expert code review specialist. Proactively reviews code for quality, security, and maintainability. Use immediately after writing or modifying code. MUST BE USED for all code changes.

kaggle-minerSubagent

Use this agent when the user provides a Kaggle competition URL or asks to learn from Kaggle winning solutions. Examples:

literature-reviewerSubagent

Use this agent when the user asks to "conduct literature review", "search for papers", "analyze research papers", "identify research gaps", "review related work", or mentions starting a research project. This agent integrates with Zotero for automated paper collection, organization, and full-text analysis. Examples:

paper-minerSubagent

Use this agent when the user provides a research paper (PDF/DOCX/arXiv link) or asks to learn writing patterns from papers, extract venue-specific writing signals, study paper structure, or mine rebuttal strategies. The agent writes extracted knowledge into the active installed paper-miner writing memory for ml-paper-writing. It does not maintain project-specific writing memory.

rebuttal-writerSubagent

Use this agent when the user asks to "write rebuttal", "respond to reviewers", "analyze review comments", or needs help with academic paper review response. This agent specializes in systematic rebuttal writing with professional tone and structured responses.

tdd-guideSubagent

Test-driven development guide for writing tests first, implementing the smallest passing change, and keeping verification tight. Use when the user explicitly wants TDD or when a task should be driven by failing tests before code.

analyze-resultsSlash Command

Run a blocker-first post-experiment workflow: validate evidence, produce strict statistical analysis when possible, and generate a decision-oriented results report only when the analysis bundle is sufficient. Uses results-analysis + results-report as a gated two-stage workflow.

commitSlash Command

Commit changes following Conventional Commits format (local only, no push).