Skill1.3k repo starsupdated today
ocr
This skill extracts text from images in multiple formats (PNG, JPG, TIFF, etc.) using the Tesseract OCR engine, supporting over 100 languages with optional preprocessing and JSON output with confidence scores. Use it when you need to digitize printed or handwritten text from images, process multilingual documents, or convert image-based content into machine-readable text for further processing.
Install in Claude Code
Copygit clone --depth 1 https://github.com/trpc-group/trpc-agent-go /tmp/ocr && cp -r /tmp/ocr/examples/skill/skills/ocr ~/.claude/skills/ocrThen start a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# OCR Image Text Extraction Skill Extract text from images using Tesseract OCR engine. ## Capabilities - Extract text from image files (PNG, JPG, JPEG, GIF, BMP, TIFF) - Support for 100+ languages - Optional image preprocessing for better accuracy - Output in plain text or JSON format with confidence scores ## Usage ### Basic OCR ```bash python3 scripts/ocr.py <image_file> <output_file> ``` ### With Options ```bash # Specify language (default: eng) python3 scripts/ocr.py image.png text.txt --lang eng # Chinese text python3 scripts/ocr.py image.png text.txt --lang chi_sim # Multiple languages python3 scripts/ocr.py image.png text.txt --lang eng+chi_sim # With image preprocessing (improves accuracy) python3 scripts/ocr.py image.png text.txt --preprocess # JSON output with confidence scores python3 scripts/ocr.py image.png output.json --format json ``` ### Download and OCR from URL ```bash # OCR from remote image python3 scripts/ocr_url.py <image_url> <output_file> # With options python3 scripts/ocr_url.py https://example.com/image.jpg text.txt --lang eng --preprocess ``` ## Parameters - `image_file` / `image_url` (required): Path to local image or image URL - `output_file` (required): Path to output text/JSON file - `--lang`: Language code (e.g., eng, chi_sim, jpn, fra, deu). Default: eng - `--preprocess`: Apply image preprocessing (grayscale, thresholding) for better accuracy - `--format`: Output format (text/json, default: text) ## Common Languages | Language | Code | |----------|------| | English | eng | | Chinese (Simplified) | chi_sim | | Chinese (Traditional) | chi_tra | | Japanese | jpn | | Korean | kor | | French | fra | | German | deu | | Spanish | spa | | Russian | rus | | Arabic | ara | ## Supported Image Formats PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP ## Dependencies - Python 3.8+ - pytesseract - Pillow (PIL) - tesseract-ocr (system package) ## Installation ```bash # Python packages pip install pytesseract Pillow # Tesseract OCR engine sudo apt-get install tesseract-ocr # Ubuntu/Debian sudo yum install tesseract # CentOS/RHEL brew install tesseract # macOS ```
More from this repository
blogSkill
enSkill
zhSkill
artifact_demoSkill
Demo skill that writes an output file and persists it as an artifact.
news-query-agentSubagent
Answer news queries with a fixed demo response.
weather-querySkill
Answer weather queries with a fixed demo response.
contact-lookup-agentSubagent
Look up contact phone numbers with fixed demo data.
write-okSkill
Write a deterministic OK file to out/ok.txt.