
ocr

This skill extracts text from images in multiple formats (PNG, JPG, TIFF, etc.) using the Tesseract OCR engine, supporting over 100 languages with optional preprocessing and JSON output with confidence scores. Use it when you need to digitize printed or handwritten text from images, process multilingual documents, or convert image-based content into machine-readable text for further processing.

Repository: trpc-agent-go

Install in Claude Code

git clone --depth 1 https://github.com/trpc-group/trpc-agent-go /tmp/ocr && cp -r /tmp/ocr/examples/skill/skills/ocr ~/.claude/skills/ocr

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# OCR Image Text Extraction Skill

Extract text from images using the Tesseract OCR engine.

## Capabilities

- Extract text from image files (PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP)
- Support for 100+ languages
- Optional image preprocessing for better accuracy
- Output in plain text or JSON format with confidence scores
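
The preprocessing step is described elsewhere in this document only as "grayscale, thresholding". As a rough sketch of what such a step could look like with Pillow, assuming a fixed global threshold (the name `preprocess_for_ocr` and the value 128 are illustrative assumptions, not taken from `scripts/ocr.py`):

```python
def preprocess_for_ocr(image, threshold=128):
    """Return a black-and-white copy of a Pillow image.

    Hypothetical helper: converts to 8-bit grayscale, then maps every
    pixel brighter than `threshold` to white (255) and the rest to black (0).
    """
    gray = image.convert("L")  # "L" is Pillow's 8-bit grayscale mode
    return gray.point(lambda p: 255 if p > threshold else 0)


# Usage, assuming Pillow is installed:
#   from PIL import Image
#   with Image.open("image.png") as img:
#       cleaned = preprocess_for_ocr(img)
```

A fixed threshold works best on evenly lit scans; unevenly lit photos may need an adaptive method instead.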

## Usage

### Basic OCR

```bash
python3 scripts/ocr.py <image_file> <output_file>
```
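
The script itself is not reproduced in this document. A minimal sketch of the image-to-file flow, assuming it wraps `pytesseract.image_to_string`, might look like the following; `run_ocr`, `tesseract_engine`, and the `engine` parameter are hypothetical names used only for illustration.

```python
from pathlib import Path


def tesseract_engine(image_path, lang):
    """Default engine: OCR one image with pytesseract (needs Tesseract installed)."""
    # Imported lazily so this module loads even where the OCR stack is absent.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)


def run_ocr(image_path, output_path, lang="eng", engine=tesseract_engine):
    """Extract text from `image_path` and write it to `output_path` as UTF-8."""
    text = engine(image_path, lang)
    Path(output_path).write_text(text, encoding="utf-8")
    return text
```

Passing the engine as a parameter keeps the file handling testable without a Tesseract installation.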

### With Options

```bash
# Specify language (default: eng)
python3 scripts/ocr.py image.png text.txt --lang eng

# Chinese text
python3 scripts/ocr.py image.png text.txt --lang chi_sim

# Multiple languages
python3 scripts/ocr.py image.png text.txt --lang eng+chi_sim

# With image preprocessing (improves accuracy)
python3 scripts/ocr.py image.png text.txt --preprocess

# JSON output with confidence scores
python3 scripts/ocr.py image.png output.json --format json
```
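
The JSON schema written by `--format json` is not specified here. One plausible way to derive per-word confidence scores is from `pytesseract.image_to_data(..., output_type=Output.DICT)`, which returns parallel `text` and `conf` lists. The sketch below assumes that input shape; the output field names (`text`, `mean_confidence`, `words`) are illustrative, not the skill's actual format.

```python
import json


def words_with_confidence(data):
    """Turn an `image_to_data` dict into a JSON-ready structure.

    Rows with empty text or a confidence of -1 are layout entries
    (blocks, lines), not recognized words, so they are skipped.
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # some pytesseract versions return strings
        if text.strip() and conf >= 0:
            words.append({"text": text, "confidence": conf})
    mean = sum(w["confidence"] for w in words) / len(words) if words else 0.0
    return {
        "text": " ".join(w["text"] for w in words),
        "mean_confidence": round(mean, 2),
        "words": words,
    }


# Hand-written sample in the shape pytesseract produces.
sample = {"text": ["", "Hello", "world", " "], "conf": ["-1", "96", "88", "-1"]}
print(json.dumps(words_with_confidence(sample), indent=2))
```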

### Download and OCR from URL

```bash
# OCR from remote image
python3 scripts/ocr_url.py <image_url> <output_file>

# With options
python3 scripts/ocr_url.py https://example.com/image.jpg text.txt --lang eng --preprocess
```
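
How `scripts/ocr_url.py` fetches the image is not shown. A minimal standard-library sketch of the download step, assuming the file is saved locally before OCR, could look like this; `download_image` and its parameters are hypothetical.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_image(url, dest_dir, allowed_schemes=("http", "https"), timeout=30):
    """Download `url` into `dest_dir` and return the local path."""
    parsed = urlparse(url)
    if parsed.scheme not in allowed_schemes:
        raise ValueError(f"unsupported URL scheme: {parsed.scheme!r}")
    # Keep the original file name (and extension) when the URL has one.
    name = Path(parsed.path).name or "image"
    target = Path(dest_dir) / name
    with urllib.request.urlopen(url, timeout=timeout) as response:
        target.write_bytes(response.read())
    return target
```

Restricting the scheme by default avoids reading local files through `file://` URLs when the input comes from an untrusted source.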

## Parameters

- `image_file` / `image_url` (required): Path to local image or image URL
- `output_file` (required): Path to output text/JSON file
- `--lang`: Language code (e.g., eng, chi_sim, jpn, fra, deu). Default: eng
- `--preprocess`: Apply image preprocessing (grayscale, thresholding) for better accuracy
- `--format`: Output format, `text` or `json`. Default: text
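
For illustration, the parameters above map naturally onto an `argparse` definition. This is a sketch of one way to declare them, not the parser actually used in `scripts/ocr.py`; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Declare the positional arguments and options listed above."""
    parser = argparse.ArgumentParser(
        description="Extract text from an image with Tesseract."
    )
    parser.add_argument("image_file", help="path to the local image")
    parser.add_argument("output_file", help="path to the output text/JSON file")
    parser.add_argument("--lang", default="eng",
                        help="language code, e.g. eng or eng+chi_sim")
    parser.add_argument("--preprocess", action="store_true",
                        help="apply grayscale and thresholding first")
    parser.add_argument("--format", choices=["text", "json"], default="text",
                        help="output format")
    return parser


args = build_parser().parse_args(["image.png", "output.json", "--format", "json"])
print(args.lang, args.preprocess, args.format)  # eng False json
```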

## Common Languages

| Language | Code |
|----------|------|
| English | eng |
| Chinese (Simplified) | chi_sim |
| Chinese (Traditional) | chi_tra |
| Japanese | jpn |
| Korean | kor |
| French | fra |
| German | deu |
| Spanish | spa |
| Russian | rus |
| Arabic | ara |
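
Each code only works if the matching Tesseract language data is installed. A small check like the one below can catch a missing language before OCR runs; `missing_languages` is a hypothetical helper, and in practice the installed set would come from `tesseract --list-langs` or `pytesseract.get_languages()`.

```python
def missing_languages(lang, installed):
    """Return the codes in a `--lang` value that are not in `installed`."""
    requested = [code for code in lang.split("+") if code]
    return [code for code in requested if code not in installed]


# A literal set keeps this example standalone.
installed = {"eng", "osd", "chi_sim"}
print(missing_languages("eng+chi_sim", installed))  # []
print(missing_languages("eng+jpn", installed))      # ['jpn']
```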

## Supported Image Formats

PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP
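
A caller may want to reject unsupported files before invoking the script. The sketch below checks the file extension against the list above; `is_supported_image` is a hypothetical helper, and matching on extension alone is an assumption (the script may instead rely on Pillow to detect the format from file contents).

```python
from pathlib import Path

# Extensions corresponding to the formats listed above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"}


def is_supported_image(path):
    """Return True when the file extension matches a listed format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_image("scan.PNG"))   # True
print(is_supported_image("notes.pdf"))  # False
```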

## Dependencies

- Python 3.8+
- pytesseract
- Pillow (PIL)
- tesseract-ocr (system package)

## Installation

```bash
# Python packages
pip install pytesseract Pillow

# Tesseract OCR engine
sudo apt-get install tesseract-ocr  # Ubuntu/Debian
sudo yum install tesseract           # CentOS/RHEL (requires the EPEL repository)
brew install tesseract               # macOS
```
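
After installing, it can help to confirm that every dependency is visible from Python. The following is a sketch using only the standard library; `check_dependencies` is a hypothetical helper and is not part of the skill.

```python
import importlib.util
import shutil
import sys


def check_dependencies():
    """Report which of the listed dependencies look available."""
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "Pillow": importlib.util.find_spec("PIL") is not None,
        # The Tesseract binary must be on PATH for pytesseract to find it.
        "tesseract": shutil.which("tesseract") is not None,
    }


for name, ok in check_dependencies().items():
    print(f"{name}: {'ok' if ok else 'MISSING'}")
```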

From the same repository

Demo skill that writes an output file and persists it as an artifact.

news-query-agent (Subagent)

Answer news queries with a fixed demo response.

weather-query (Skill)

Answer weather queries with a fixed demo response.

contact-lookup-agent (Subagent)

Look up contact phone numbers with fixed demo data.

write-ok (Skill)

Write a deterministic OK file to out/ok.txt.