Skill · 4.9k repo stars · updated 12d ago

kaggle-learner

The kaggle-learner skill provides access to a curated knowledge base of techniques, code patterns, and best practices extracted from winning Kaggle competition solutions across NLP, computer vision, time series, tabular data, and multimodal domains. Use this skill when studying for Kaggle competitions, seeking proven techniques for specific machine learning tasks, or needing code templates and implementation strategies from top competitors.

Repository: claude-scholar

Install in Claude Code

git clone --depth 1 https://github.com/Galaxy-Dawn/claude-scholar /tmp/kaggle-learner && cp -r /tmp/kaggle-learner/skills/kaggle-learner ~/.claude/skills/kaggle-learner

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Kaggle Learner

Extract and apply knowledge from Kaggle competition winning solutions. This skill provides access to a continuously updated knowledge base of techniques, code patterns, and best practices from top Kaggle competitors.

## Overview

Kaggle competitions are at the forefront of practical machine learning. Winning solutions often innovate with novel techniques, clever feature engineering, and optimized pipelines. This skill captures that knowledge and makes it accessible for your projects.

## When to Use

Use this skill when you:
- Are studying for a Kaggle competition
- Are looking for proven techniques in a specific domain (NLP, CV, etc.)
- Need code templates for common ML tasks
- Want to learn from competition winners

## Knowledge Categories

| Category | Focus | Directory |
|----------|-------|-----------|
| **NLP** | Text classification, NER, translation, LLM applications | `references/knowledge/nlp/` |
| **CV** | Image classification, detection, segmentation, generation | `references/knowledge/cv/` |
| **Time Series** | Forecasting, anomaly detection, sequence modeling | `references/knowledge/time-series/` |
| **Tabular** | Feature engineering, traditional ML, structured data | `references/knowledge/tabular/` |
| **Multimodal** | Cross-modal tasks, vision-language models | `references/knowledge/multimodal/` |

**File organization**: each competition gets its own markdown file, grouped by domain into the corresponding directory.

Examples:
- `time-series/birdclef-plus-2025.md`
- `nlp/aimo-2-2025.md`
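
The one-file-per-competition layout makes the knowledge base easy to index. The sketch below is illustrative only: `list_competitions` is a hypothetical helper, not part of the skill, and it assumes the domain directories sit directly under a `references/knowledge/` root as the table above lists them.

```python
from pathlib import Path

# Domain directory names as listed in the Knowledge Categories table.
DOMAINS = ["nlp", "cv", "time-series", "tabular", "multimodal"]


def list_competitions(knowledge_root):
    """Map each domain to the sorted competition file stems found under it."""
    root = Path(knowledge_root)
    index = {}
    for domain in DOMAINS:
        domain_dir = root / domain
        # A domain with no extracted competitions yet may not exist on disk.
        files = sorted(domain_dir.glob("*.md")) if domain_dir.is_dir() else []
        index[domain] = [f.stem for f in files]
    return index
```

With the install command above, the root would presumably be `~/.claude/skills/kaggle-learner/references/knowledge`; that path is an assumption based on the directory names in this document.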

## Quick Reference

**To learn from a competition:**
1. Provide the Kaggle competition URL
2. The kaggle-miner agent will extract the winning solution
3. Knowledge is automatically added to the relevant category
4. A **Detailed Technical Analysis of Top Solutions** is automatically included

**To browse existing knowledge:**
- Browse the relevant domain directory: `references/knowledge/[domain]/`
- Each competition has its own file, containing:
  - Competition Brief
  - **Detailed Technical Analysis of Top Solutions** ⭐
  - Code Templates
  - Best Practices

## Self-Evolving

This skill automatically updates its knowledge base when the kaggle-miner agent processes new competitions, so its coverage grows with every competition you feed it.

## Knowledge Extraction Standard

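Every time knowledge is extracted from a Kaggle competition, the output **must** include the following standard sections:

### Required Content Checklist

| Section | Description | Required |
|---------|-------------|----------|
| **Competition Brief** | Competition background, task description, data scale, evaluation metric | ✅ Required |
| **Original Summaries** | Brief overview of the top solutions | ✅ Required |
| **Detailed Technical Analysis of Top Solutions** | Core techniques and implementation details of the Top 20 solutions | ✅ **Required** ⭐ |
| **Code Templates** | Reusable code templates | ✅ Required |
| **Best Practices** | Best practices and common pitfalls | ✅ Required |
| **Metadata** | Data source tags and date | ✅ Required |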
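
A checklist like this can be verified mechanically. The following is a minimal sketch, assuming each required section appears as a markdown heading whose text contains the section name; the skill does not specify how sections are marked up, so `missing_sections` and its matching rule are hypothetical.

```python
import re

# Section names from the checklist. Each entry lists accepted heading spellings;
# matching on heading text is an assumption, not something the skill specifies.
REQUIRED_SECTIONS = [
    ("Competition Brief",),
    ("Original Summaries",),
    ("前排方案详细技术分析", "Detailed Technical Analysis"),
    ("Code Templates",),
    ("Best Practices",),
    ("Metadata",),
]

# ATX-style markdown headings: one to six '#' characters, then the title.
HEADING = re.compile(r"^#{1,6}\s+(.*)$", re.MULTILINE)


def missing_sections(markdown_text):
    """Return the first spelling of every required section with no matching heading."""
    headings = [h.strip().lower() for h in HEADING.findall(markdown_text)]
    missing = []
    for spellings in REQUIRED_SECTIONS:
        found = any(s.lower() in h for s in spellings for h in headings)
        if not found:
            missing.append(spellings[0])
    return missing
```

One known limitation of this sketch: lines starting with `#` inside fenced code blocks (for example Python comments in a code template) are also treated as headings, so a stricter check would skip fenced regions first.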

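### Format for the Detailed Technical Analysis of Top Solutions

Each top solution should include:
- **Rank and team/author**
- **List of core techniques** (3-6 key technical points)
- **Implementation details** (specific parameters, configuration, data)

Example format: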
```markdown
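**[Rank] Place - Core Technique Name (Author)**

Key techniques:
- **Technique 1**: Brief description
- **Technique 2**: Brief description

Implementation details:
- Specific parameters, models, configuration
- Data and experimental results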
```

**Aim to cover the Top 20 solutions to capture more of the top competitors' innovative techniques.**
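
To keep entries uniform, the format above can be generated from structured data. This is a sketch under stated assumptions: `SolutionEntry` and `render_entry` are hypothetical names, and the labels "Key techniques:" and "Implementation details:" are English renderings of the template's labels rather than strings the skill mandates.

```python
from dataclasses import dataclass, field


@dataclass
class SolutionEntry:
    place: str        # rank label, e.g. "1st"
    technique: str    # headline technique name
    author: str       # team or author
    tips: dict = field(default_factory=dict)     # technique name -> brief description
    details: list = field(default_factory=list)  # parameters, models, configs, results


def render_entry(entry):
    """Render one solution entry in the template's layout."""
    lines = [f"**{entry.place} Place - {entry.technique} ({entry.author})**", ""]
    lines.append("Key techniques:")
    lines += [f"- **{name}**: {text}" for name, text in entry.tips.items()]
    lines += ["", "Implementation details:"]
    lines += [f"- {detail}" for detail in entry.details]
    return "\n".join(lines)


def has_recommended_tip_count(entry):
    """True when the entry lists the 3-6 key techniques the standard asks for."""
    return 3 <= len(entry.tips) <= 6
```

Storing entries as data and rendering them on demand keeps the rank, techniques, and details separately queryable, for instance to compare which techniques recur across the top solutions of one competition.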

## Additional Resources

### Knowledge Directories
- **`references/knowledge/nlp/`** - NLP competition techniques
- **`references/knowledge/cv/`** - Computer vision techniques
- **`references/knowledge/time-series/`** - Time series methods
- **`references/knowledge/tabular/`** - Tabular data approaches
- **`references/knowledge/multimodal/`** - Multimodal solutions

### Competition Examples
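- **BirdCLEF+ 2025** (`time-series/birdclef-plus-2025.md`) - Contains a complete detailed technical analysis of the Top 14 solutions
- **BirdCLEF 2024** (`time-series/birdclef-2024.md`) - Contains a detailed technical analysis of the Top 3 solutions
- **AIMO-2** (`nlp/aimo-2-2025.md`) - Contains technical summaries of the Top 12+ solutions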

From the same repository

code-reviewer (Subagent)

Expert code review specialist. Proactively reviews code for quality, security, and maintainability. Use immediately after writing or modifying code. MUST BE USED for all code changes.

kaggle-miner (Subagent)

Use this agent when the user provides a Kaggle competition URL or asks to learn from Kaggle winning solutions. Examples:

literature-reviewer (Subagent)

Use this agent when the user asks to "conduct literature review", "search for papers", "analyze research papers", "identify research gaps", "review related work", or mentions starting a research project. This agent integrates with Zotero for automated paper collection, organization, and full-text analysis. Examples:

paper-miner (Subagent)

Use this agent when the user provides a research paper (PDF/DOCX/arXiv link) or asks to learn writing patterns from papers, extract venue-specific writing signals, study paper structure, or mine rebuttal strategies. The agent writes extracted knowledge into the active installed paper-miner writing memory for ml-paper-writing. It does not maintain project-specific writing memory.

rebuttal-writer (Subagent)

Use this agent when the user asks to "write rebuttal", "respond to reviewers", "analyze review comments", or needs help with academic paper review response. This agent specializes in systematic rebuttal writing with professional tone and structured responses.

tdd-guide (Subagent)

Test-driven development guide for writing tests first, implementing the smallest passing change, and keeping verification tight. Use when the user explicitly wants TDD or when a task should be driven by failing tests before code.

analyze-results (Slash Command)

Run a blocker-first post-experiment workflow: validate evidence, produce strict statistical analysis when possible, and generate a decision-oriented results report only when the analysis bundle is sufficient. Uses results-analysis + results-report as a gated two-stage workflow.

commit (Slash Command)

Commit changes following Conventional Commits format (local only, no push).