ClaudeWave
Skill · 228 repo stars · updated yesterday

processing-pdfs

Processes PDF files. Extracts text and tables, fills forms, merges and splits documents, batch-processes files, converts to images, and generates PDFs programmatically. Use when working with .pdf files. Do NOT use for Word documents, spreadsheets, or presentations.

Install in Claude Code
git clone --depth 1 https://github.com/telagod/code-abyss /tmp/processing-pdfs && cp -r /tmp/processing-pdfs/skills/processing-pdfs ~/.claude/skills/processing-pdfs
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# PDF Processing

Essential PDF operations using Python libraries and CLI tools.

## Decision Matrix

| Task | Best Tool | Reference |
|------|-----------|-----------|
| Merge / split / metadata / rotate | pypdf | [recipes.md](references/recipes.md) |
| Extract text (layout preserved) | pdfplumber | [recipes.md](references/recipes.md) |
| Extract tables | pdfplumber | [recipes.md](references/recipes.md) |
| Create new PDF | reportlab | [recipes.md](references/recipes.md) |
| Batch CLI ops | qpdf / pdftk | [recipes.md](references/recipes.md) |
| OCR scanned PDFs | pytesseract + pdf2image | [advanced.md](references/advanced.md) |
| Add watermark / extract images / encrypt | pypdf / pdfimages | [advanced.md](references/advanced.md) |
| Fill PDF forms | pdf-lib / pypdf | [FORMS.md](FORMS.md) |
| Advanced pypdfium2 / pdf-lib JS | — | [REFERENCE.md](REFERENCE.md) |
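
As an illustration of the "Merge / split" row, here is a minimal sketch of page extraction with pypdf. The helper names `parse_page_ranges` and `extract_pages`, the `"1-3,5"` range syntax, and the output name `excerpt.pdf` are assumptions made for this example, not part of the skill's recipes; `document.pdf` is the placeholder from the Quick Start.

```python
from pathlib import Path


def parse_page_ranges(spec: str, total_pages: int) -> list[int]:
    """Turn a 1-based spec such as "1-3,5" into zero-based page indices."""
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        # A bare number ("5") is a single page; "1-3" is an inclusive range.
        end = int(end_text) if sep else start
        if not 1 <= start <= end <= total_pages:
            raise ValueError(f"range {part!r} is outside 1..{total_pages}")
        indices.extend(range(start - 1, end))
    return indices


def extract_pages(src: str, dst: str, spec: str) -> int:
    """Copy the pages named by `spec` from `src` into a new file `dst`."""
    # Imported here so the range parser above stays usable without pypdf.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    writer = PdfWriter()
    indices = parse_page_ranges(spec, len(reader.pages))
    for index in indices:
        writer.add_page(reader.pages[index])
    with open(dst, "wb") as handle:
        writer.write(handle)
    return len(indices)


if __name__ == "__main__":
    # Only runs when the placeholder input file is actually present.
    if Path("document.pdf").exists():
        print(extract_pages("document.pdf", "excerpt.pdf", "1"))
```

Validating the range before touching the writer means a bad spec fails with a readable message instead of an `IndexError` partway through the copy.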

## Quick Start

```python
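from pypdf import PdfReader

reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")
# Join pages with newlines so words at page boundaries do not run together;
# `or ""` guards against pages that yield no text (e.g. scanned images).
text = "\n".join(page.extract_text() or "" for page in reader.pages)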
```

## Workflow

1. **Identify task**: text extraction, tables, creation, or forms? Pick the matching row from the Decision Matrix above.
2. **Load reference**: recipes.md covers most tasks; use advanced.md for OCR and encryption, and FORMS.md for forms.
3. **Implement**: copy and adapt the recipe, then verify the output.
4. **Validate**: open the result in a viewer or grep the extracted text.
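
The "grep extracted text" check in the last step can be scripted. The sketch below is one possible approach, not the skill's own validator: the function name `missing_phrases` and the sample strings are hypothetical. It normalizes whitespace and case first, because extracted text often breaks lines in the middle of a phrase.

```python
import re


def missing_phrases(extracted_text: str, expected: list[str]) -> list[str]:
    """Return the expected phrases that do not occur in the extracted text."""

    def normalize(value: str) -> str:
        # Collapse every run of whitespace (including newlines) to one space.
        return re.sub(r"\s+", " ", value).strip().lower()

    haystack = normalize(extracted_text)
    return [phrase for phrase in expected if normalize(phrase) not in haystack]


# Hypothetical sample standing in for text pulled out of a PDF.
sample = "Invoice  No. 42\nTotal due:\n  100.00 EUR"
print(missing_phrases(sample, ["invoice no. 42", "Total due: 100.00", "Paid"]))
# → ['Paid']
```

In practice the `text` variable from the Quick Start could be passed in place of `sample`; an empty return value means every expected phrase was found.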

## Library Selection

| Library | Use for |
|---------|---------|
| pypdf | Merge, split, metadata, encryption, rotation |
| pdfplumber | Text extraction with layout, tables |
| reportlab | Generate PDFs programmatically |
| pdf2image + pytesseract | OCR scanned documents |
| qpdf / pdftk (CLI) | Batch ops, no Python needed |
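
For the qpdf row, a batch merge can also be driven from a script. This is a rough sketch that assumes qpdf is installed and on `PATH`; the helper names `qpdf_merge_command` and `merge_with_qpdf` and the file names are placeholders. It relies on qpdf's `--empty --pages ... --` form for concatenating inputs.

```python
import shutil
import subprocess


def qpdf_merge_command(inputs: list[str], output: str) -> list[str]:
    """Build the argument list for `qpdf --empty --pages <inputs> -- <output>`."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return ["qpdf", "--empty", "--pages", *inputs, "--", output]


def merge_with_qpdf(inputs: list[str], output: str) -> None:
    """Run the merge, failing early with a clear message if qpdf is absent."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf was not found on PATH")
    # check=True raises CalledProcessError on any non-zero exit status.
    subprocess.run(qpdf_merge_command(inputs, output), check=True)


# Placeholder file names; printing the command does not require qpdf.
print(qpdf_merge_command(["a.pdf", "b.pdf"], "merged.pdf"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. Note that qpdf reports "succeeded with warnings" through a non-zero exit status, so a stricter or looser policy than `check=True` may be appropriate depending on the batch.
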

analyzing-changes (Skill)

Analyzes code changes, detects documentation drift, and evaluates change impact scope. Use when reviewing diffs, checking doc sync, or running pre-commit analysis. Automatically triggered after design-level changes or refactoring.

analyzing-security (Skill)

Scans code for security vulnerabilities, detects dangerous patterns, and ensures security decisions are documented. Use when running security scans, auditing code, or checking for OWASP issues, injection risks, or sensitive data leaks. Automatically triggered on new modules, security-related changes, or post-refactor.

analyzing-spreadsheets (Skill)

Processes Excel spreadsheet files (.xlsx, .xlsm, .csv). Creates workbooks, builds formulas, preserves formatting, analyzes tabular data, and validates financial models with zero-formula-error delivery. Use when working with spreadsheet files or tabular data analysis. Do NOT use for Word documents, PDFs, presentations, or database pipelines.

applying-ui-design-system (Skill)

Frontend UI design system selector and implementation guide covering Glassmorphism, Liquid Glass (Apple-style), Neubrutalism, and Claymorphism. Use when building UI components, choosing a visual aesthetic, implementing design tokens, or auditing accessibility/contrast on themed surfaces. Provides per-style tokens, component patterns, dark mode, and a11y constraints.

architecting-security (Skill)

Security architecture and governance: threat modeling (STRIDE/PASTA/LINDDUN), zero-trust identity architecture, IAM/SSO/MFA/PAM, compliance frameworks (SOC2/PCI/HIPAA/GDPR), DLP, privacy engineering, and security control design. Use when designing security architecture, threat modeling new systems, implementing zero-trust identity, designing IAM/SSO/PAM, building compliance evidence chains, or planning privacy-by-design.

automating-devops (Skill)

DevOps knowledge reference covering Git workflows, testing strategies, DevSecOps, release pipeline orchestration (release.yml, multi-arch images, cosign integration), CI/CD pipelines, database management, observability, and performance optimization. Use when working with Git, CI/CD, release pipelines, ghcr image publishing, testing, monitoring, or infrastructure automation.

building-agent-systems (Skill)

AI agent and LLM system engineering reference covering single-agent dev (ReAct, tool calling, plan-execute), multi-agent coordination (swarm, role decomposition, file locking), LLM security (prompt injection, jailbreak defense, output filtering), RAG architecture (chunking, hybrid retrieval, rerank), and prompt engineering / evaluation (RAGAS, LLM-as-Judge). Use when building AI agents, designing RAG pipelines, orchestrating multi-agent workflows, hardening LLM apps, or writing prompts.

checking-code-quality (Skill)

Checks code quality metrics including complexity, duplication, naming conventions, and function length. Use when running quality gates, reviewing code smells, or checking lint rules. Automatically triggered on complex modules or post-refactor.