Label, clean and enrich text datasets with LLMs.
Autolabel is a Python library that automates the labeling, cleaning, and enrichment of text datasets by routing them through large language models via a JSON configuration file. Users define a task type (classification, named entity recognition, question answering, and others), specify labeling guidelines and few-shot examples, select a model provider, then call a LabelingAgent to run the labeling pipeline against a CSV dataset. It connects to Claude through Anthropic's API, supporting models such as claude-3-opus-20240229 alongside OpenAI, Google, and HuggingFace-hosted models. A built-in plan step previews the final prompt and estimates cost before any labels are generated, a practical safeguard for large datasets. The library also includes a benchmarking suite that runs identical prompts across all supported models and outputs a results.csv for direct comparison. The primary audience is machine learning engineers and data scientists who need annotated training data at lower cost and faster turnaround than manual labeling workflows.
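The configuration-driven workflow described above can be sketched as follows. This is a minimal illustration, not the library's own code: the field names (`task_name`, `task_type`, `dataset`, `model`, `prompt`) are assumed from the general shape of Autolabel's README examples and should be checked against the current documentation. The `render_preview` helper is hypothetical and only mimics the idea of previewing a prompt before labeling.

```python
import json

# Hypothetical task configuration. Key names are assumptions modeled on
# Autolabel's published examples; verify them against the project docs.
config = {
    "task_name": "ToxicCommentClassification",
    "task_type": "classification",
    "dataset": {"label_column": "label", "delimiter": ","},
    "model": {"provider": "anthropic", "name": "claude-3-opus-20240229"},
    "prompt": {
        "task_guidelines": "Classify the comment as one of: {labels}.",
        "labels": ["toxic", "not toxic"],
        "few_shot_examples": [
            {"example": "Thanks for the help!", "label": "not toxic"},
            {"example": "You are an idiot.", "label": "toxic"},
        ],
        "example_template": "Input: {example}\nOutput: {label}",
    },
}


def render_preview(cfg: dict, text: str) -> str:
    """Hypothetical helper: assemble guidelines, few-shot examples, and the query."""
    prompt = cfg["prompt"]
    guidelines = prompt["task_guidelines"].format(labels=", ".join(prompt["labels"]))
    shots = [prompt["example_template"].format(**ex) for ex in prompt["few_shot_examples"]]
    # The query reuses the example template with the label left blank.
    query = prompt["example_template"].format(example=text, label="")
    return "\n\n".join([guidelines, *shots, query]).rstrip()


# Serialize the config as it would be stored in a JSON configuration file.
config_json = json.dumps(config, indent=2)
preview = render_preview(config, "What a lovely day.")
print(preview)

# With the library installed and an API key set, the run would look roughly
# like this (call names assumed from the README; verify before use):
#   from autolabel import LabelingAgent, AutolabelDataset
#   agent = LabelingAgent(config)
#   ds = AutolabelDataset("data.csv", config=config)
#   agent.plan(ds)      # preview the prompt and estimated cost
#   ds = agent.run(ds)  # generate labels
```

Keeping the configuration as plain data means the same file can be reused when switching model providers, which is the comparison the benchmarking suite is built around.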
- ✓ Open-source license (MIT)
- ✓ Healthy fork ratio
- ✓ Clear description
- ✓ Topics declared
- ✓ Mature repo (>1y old)
- ✓ Documented (README)
- ! Stale (last commit >463d ago)
git clone https://github.com/refuel-ai/autolabel
What people ask about autolabel
What is refuel-ai/autolabel?
refuel-ai/autolabel is a tool in the Claude AI ecosystem that labels, cleans, and enriches text datasets with LLMs. It has 2.3k GitHub stars and was last updated about a year ago.
How do I install autolabel?
You can install autolabel by cloning the repository (https://github.com/refuel-ai/autolabel) or following the README instructions on GitHub. ClaudeWave also provides quick install blocks on this page.
Is refuel-ai/autolabel safe to use?
Our security agent has analyzed refuel-ai/autolabel and assigned a Trust Score of 80/100 (tier: Trusted). See the full breakdown of passed checks and flags on this page.
Who maintains refuel-ai/autolabel?
refuel-ai/autolabel is maintained by refuel-ai. The last recorded GitHub activity is from 1y ago, with 81 open issues.
Are there alternatives to autolabel?
Yes. On ClaudeWave you can browse similar tools at /categories/tools, sorted by popularity or recent activity.
Maintain this repo? Add a badge to your README
Drop the badge into your GitHub README to show it's tracked on ClaudeWave. Each badge links back to this page and reflects the live Trust Score.
<a href="https://claudewave.com/repo/refuel-ai-autolabel"><img src="https://claudewave.com/api/badge/refuel-ai-autolabel" alt="Featured on ClaudeWave: refuel-ai/autolabel" width="320" height="64" /></a>
More Tools
A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.
An AI skill that provides design intelligence for building professional UI/UX across multiple platforms.
🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman
AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a queryable knowledge graph. App code + database schema + infrastructure in one graph.
A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code by TÂCHES.
CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies