pdf-to-word-docx

This Claude Code skill wraps the ComPDFKit conversion SDK to transform PDF and image files into ten output formats including Word, Excel, PowerPoint, HTML, Markdown, JSON, and CSV. Use it when you need to extract structured data from PDFs for LLM processing, convert documents to editable formats, or perform OCR on image files with AI-powered layout analysis.

Repository: compdf-skills

Install in Claude Code

git clone --depth 1 https://github.com/ComPDFKit/compdf-skills /tmp/pdf-to-word-docx && cp -r /tmp/pdf-to-word-docx/skills/pdf-to-word-docx ~/.claude/skills/pdf-to-word-docx

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# PDF to Word Converter

## Purpose
- Wraps the `ComPDFKitConversion` Python SDK into a reusable local conversion workflow, supporting PDF / image to Word, PPT, Excel, HTML, RTF, Image, TXT, JSON, Markdown, and CSV (10 output formats in total).

## Agent Skills Standard Compatibility
- This Skill uses an Anthropic Agent Skills-compatible directory structure: `pdf-to-word-docx/`.
- The entry point is `SKILL.md`; helper scripts are placed in `scripts/`.
- The document uses `$ARGUMENTS` and `${CLAUDE_SKILL_DIR}` conventions for distribution and execution in Claude Code / Agent Skills-compatible environments.

## Input / Output
- Input: The target format (`word`/`excel`/`ppt`/`html`/`rtf`/`image`/`txt`/`json`/`markdown`/`csv`), the PDF or image path, and the output path are passed via Skill arguments or the command line. An optional PDF password and conversion parameters may also be provided.
- Supported input file types:
  - PDF files (`.pdf`)
  - Image files (`.jpg`/`.jpeg`/`.png`/`.bmp`/`.tif`/`.tiff`/`.webp`/`.jp2`/`.gif`/`.tga`)
- Output: A file in the corresponding format (`.docx`, `.pptx`, `.xlsx`, `.html`, `.rtf`, image, `.txt`, `.json`, `.md`, `.csv`), or a clear error message.

## Prerequisites
- Supports Windows and macOS.
- The conversion SDK must be installed first:
  ```bash
  pip install ComPDFKitConversion
  ```
- On first run, the script automatically downloads `license.xml` from the ComPDF server and caches it in the `scripts/` directory:
  ```text
  https://download.compdf.com/skills/license/license.xml
  ```
- The script reads the `<key>...</key>` field from `license.xml` and passes that key to `LibraryManager.license_verify(...)` for authentication; it does not pass the XML file path directly to the SDK.
- To use a custom license, place your own `license.xml` in the `scripts/` directory; the script will use it directly without downloading.
- During SDK initialization, the `resource` directory is always set to the directory containing `pdf-to-word-docx.py`, i.e., the `scripts/` directory itself.
- When `--enable-ocr` is passed or `--enable-ai-layout` is active (the latter is enabled by default), the Skill also requires `scripts/documentai.model`. If the file does not exist, the script automatically downloads it from:
  ```text
  https://download.compdf.com/skills/model/documentai.model
  ```
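- To reuse an existing model file, override the default model path via an environment variable (POSIX shell syntax shown; in Windows PowerShell, use `$env:COMPDF_DOCUMENT_AI_MODEL = "C:\path\to\documentai.model"`):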
  ```bash
  export COMPDF_DOCUMENT_AI_MODEL="/path/to/documentai.model"
  ```
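
The license handling described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper name `read_license_key` is hypothetical, and it assumes `license.xml` is plain XML without namespaces, with the key stored as the text of a `<key>` element.

```python
import xml.etree.ElementTree as ET


def read_license_key(license_path):
    """Return the stripped text of the first <key> element in the file.

    Hypothetical helper: assumes the XML uses no namespaces.
    """
    root = ET.parse(license_path).getroot()
    # The <key> element may be the root itself or nested at any depth.
    node = root if root.tag == "key" else root.find(".//key")
    text = (node.text or "").strip() if node is not None else ""
    if not text:
        raise ValueError(f"no <key> value found in {license_path}")
    return text
```

The returned string is what would be handed to `LibraryManager.license_verify(...)`; the exact arguments of that call are not shown in this document, so they are left out here.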

## Workflow
1. Confirm the Python package is installed:
   ```bash
   python -m pip show ComPDFKitConversion
   ```
2. The script automatically downloads `license.xml` on first run; the `scripts/` directory is used directly as the SDK `resource` path.
3. In Agent Skills / Claude Code environments, prefer using the Skill's built-in script path variable:
   ```bash
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" ppt input.pdf output.pptx
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" excel input.pdf output.xlsx
   ```
4. For more control, append common parameters:
   ```bash
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" excel input.pdf output.xlsx --page-ranges "1-3,5" --excel-all-content --excel-worksheet-option for-page
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx --enable-ocr --page-layout-mode flow
   ```
5. On startup, the script ensures `scripts/license.xml` exists (downloading it automatically from the ComPDF server if missing), reads the `<key>` field for SDK authentication, and uses the `scripts/` directory as the `resource` path.
6. If `--enable-ocr` is passed or `--enable-ai-layout` is active (the latter is enabled by default), the script checks whether `scripts/documentai.model` exists; if not, it downloads the file automatically before initializing the Document AI model.
7. Check the return code; if it is not `SUCCESS`, handle license, password, resource, model, or input file issues according to the error name.
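
When driving the workflow from another Python program, the command lines in steps 3 and 4 could be assembled and run roughly like this. The helper names `build_command` and `run_conversion` are hypothetical, and the sketch assumes the script signals failure through a non-zero exit status and prints the error name; the document does not confirm either detail.

```python
import subprocess
import sys
from pathlib import Path

# Target formats listed in the Input / Output section.
FORMATS = {"word", "excel", "ppt", "html", "rtf",
           "image", "txt", "json", "markdown", "csv"}


def build_command(skill_dir, target_format, src, dst, extra_args=()):
    """Build the argument list for one conversion (hypothetical helper)."""
    if target_format not in FORMATS:
        raise ValueError(f"unsupported target format: {target_format!r}")
    script = Path(skill_dir) / "scripts" / "pdf-to-word-docx.py"
    return [sys.executable, str(script), target_format,
            str(src), str(dst), *extra_args]


def run_conversion(cmd):
    """Run the command; return (exit_status, combined stdout and stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # Assumption: a non-zero status means the conversion did not succeed.
    return proc.returncode, proc.stdout + proc.stderr
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.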

## documentai.model Download Optimization
- The script preferentially uses the model file pointed to by `COMPDF_DOCUMENT_AI_MODEL`.
- The default model path is `scripts/documentai.model`.
- During automatic download, the file is first written to `documentai.model.part` and atomically renamed to the final name only after the download succeeds, so an interrupted download never leaves a truncated model file in place.
- On download failure, the script retries automatically with back-off intervals of `2s / 5s / 10s`.
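
The download behavior above could look roughly like the following. This is a sketch under stated assumptions, not the script's implementation: the function names are hypothetical, and it assumes one initial attempt followed by one retry per back-off interval (four attempts in total), which the document does not spell out.

```python
import os
import shutil
import time
import urllib.request
from pathlib import Path

BACKOFF_SECONDS = (2, 5, 10)  # intervals taken from the list above


def resolve_model_path(scripts_dir, env=None):
    """Prefer COMPDF_DOCUMENT_AI_MODEL, else scripts/documentai.model."""
    env = os.environ if env is None else env
    override = env.get("COMPDF_DOCUMENT_AI_MODEL")
    return Path(override) if override else Path(scripts_dir) / "documentai.model"


def fetch_to(url, part_path):
    """Stream the response body of `url` into `part_path`."""
    with urllib.request.urlopen(url) as resp, open(part_path, "wb") as out:
        shutil.copyfileobj(resp, out)


def download_atomically(url, dest, fetch=fetch_to, sleep=time.sleep,
                        delays=BACKOFF_SECONDS):
    """Download to `<dest>.part`, then rename onto `dest` on success."""
    dest = Path(dest)
    part = dest.with_name(dest.name + ".part")
    for attempt in range(len(delays) + 1):
        try:
            fetch(url, part)
            # os.replace is atomic when both paths are on one filesystem.
            os.replace(part, dest)
            return dest
        except OSError:  # urllib's URLError and HTTPError subclass OSError
            part.unlink(missing_ok=True)
            if attempt == len(delays):
                raise  # back-off schedule exhausted
            sleep(delays[attempt])
```

The `fetch` and `sleep` parameters exist only so the retry logic can be exercised without network access or real waiting.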

## Invoking Directly as a Skill
- In environments that support Agent Skills, the Skill can be called directly:
  ```text
  /pdf-to-word-docx word input.pdf output.docx
  /pdf-to-word-docx excel input.pdf output.xlsx --excel-worksheet-option for-page
  ```
- When the Skill receives arguments, it passes them through to the script as-is:
  ```bash
  python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" $ARGUMENTS
  ```
- If the environment does not support direct Skill invocation, fall back to a regular command-line call.
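
To illustrate what "as-is" pass-through means for the script, the sketch below tokenizes a raw argument string the way a POSIX shell would. The helper name `split_skill_arguments` is hypothetical, and how a given Agent Skills environment actually expands `$ARGUMENTS` is an assumption here.

```python
import shlex


def split_skill_arguments(raw):
    """Split a raw argument string into (format, input, output, options)."""
    tokens = shlex.split(raw)  # honors quotes, so quoted paths stay whole
    if len(tokens) < 3:
        raise ValueError("expected: <format> <input> <output> [options...]")
    target_format, src, dst, *options = tokens
    return target_format, src, dst, options
```

Paths containing spaces need to be quoted in the invocation for the three positional arguments to line up.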

## Supported Output Formats
- `word` → calls `CPDFConversion.start_pdf_to_word`
- `excel` → calls `CPDFConversion.start_pdf_to_excel`
- `ppt` → calls `CPDFConversion.start_pdf_to_ppt`
- `html` → calls `CPDFConversion.start_pdf_to_html`
- `rtf` → calls `CPDFConversion.start_pdf_to_rtf`
- `image` → calls `CPDFConversion.start_pdf_to_image`
- `txt` → calls `CPDFConversion.start_pdf_to_txt`
- `json` → calls `CPDFConversion.start_pdf_to_json`
- `markdown` → calls `CPDFConversion.start_pdf_to_markdown`
- `csv` → reuses `CPDFConversion.start_pdf_to_excel` with table/Excel parameters to produce CSV-friendly output
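
One way to express this mapping is a lookup table keyed by target format. The sketch below is illustrative only: `METHOD_BY_FORMAT` and `resolve_converter` are hypothetical names, the method names are copied from the list above, and no SDK call signatures are assumed.

```python
# Target format -> name of the CPDFConversion method listed above.
METHOD_BY_FORMAT = {
    "word": "start_pdf_to_word",
    "excel": "start_pdf_to_excel",
    "ppt": "start_pdf_to_ppt",
    "html": "start_pdf_to_html",
    "rtf": "start_pdf_to_rtf",
    "image": "start_pdf_to_image",
    "txt": "start_pdf_to_txt",
    "json": "start_pdf_to_json",
    "markdown": "start_pdf_to_markdown",
    "csv": "start_pdf_to_excel",  # CSV reuses the Excel entry point
}


def resolve_converter(conversion_cls, target_format):
    """Look up the conversion method for a format on the given class."""
    try:
        method_name = METHOD_BY_FORMAT[target_format]
    except KeyError:
        raise ValueError(f"unsupported target format: {target_format!r}") from None
    return getattr(conversion_cls, method_name)
```

In the real script the first argument would be the SDK's `CPDFConversion` class; taking it as a parameter keeps the sketch importable without the SDK installed.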

## Input Source Types
- The script supports **PDF and image** as input sources. The SDK's `start_pdf_to_*` interfaces natively accept image files with no pre-processing required.
- By default, the script auto-detects the input type from the file extension:
  - `.pdf` → `pdf`
  - `.png/.jpg/