pdf-to-word-docx

PDF conversion toolkit featuring AI layout analysis and OCR. Converts PDFs to Word/Docx, Markdown, JSON, PPT, CSV, HTML, and XML for seamless LLM data processing.

Install in Claude Code
git clone --depth 1 https://github.com/ComPDFKit/compdf-skills /tmp/pdf-to-word-docx && cp -r /tmp/pdf-to-word-docx/skills/pdf-to-word-docx ~/.claude/skills/pdf-to-word-docx
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# pdf to word

## Purpose
- Wraps the `ComPDFKitConversion` Python SDK in a reusable local conversion workflow that converts PDF and image files to Word, PPT, Excel, HTML, RTF, Image, TXT, JSON, Markdown, and CSV (10 output formats in total).

## Agent Skills Standard Compatibility
- This Skill uses an Anthropic Agent Skills-compatible directory structure: `pdf-to-word-docx/`.
- The entry point is `SKILL.md`; helper scripts are placed in `scripts/`.
- This document uses the `$ARGUMENTS` and `${CLAUDE_SKILL_DIR}` conventions so that it can be distributed and executed in Claude Code / Agent Skills-compatible environments.

## Input / Output
- Input: The target format (`word`/`excel`/`ppt`/`html`/`rtf`/`image`/`txt`/`json`/`markdown`/`csv`), the PDF or image path, and the output path are passed via Skill arguments or the command line. An optional PDF password and conversion parameters may also be provided.
- Supported input file types:
  - PDF files (`.pdf`)
  - Image files (`.jpg`/`.jpeg`/`.png`/`.bmp`/`.tif`/`.tiff`/`.webp`/`.jp2`/`.gif`/`.tga`)
- Output: A file in the corresponding format (`.docx`, `.pptx`, `.xlsx`, `.html`, `.rtf`, image, `.txt`, `.json`, `.md`, `.csv`), or a clear error message.
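
The supported input types listed above can be checked before any conversion starts. The following is a minimal sketch, not the script's actual code: the function name `detect_input_type` and the `"image"` label are assumptions made for illustration, while the extensions are the ones listed above.

```python
from pathlib import Path

# Image extensions copied from the supported input list above.
IMAGE_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".bmp", ".tif",
    ".tiff", ".webp", ".jp2", ".gif", ".tga",
}


def detect_input_type(path: str) -> str:
    """Classify an input path as "pdf" or "image" by its extension.

    Hypothetical helper: the name and the "image" label are assumptions.
    """
    suffix = Path(path).suffix.lower()  # case-insensitive match
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    raise ValueError(f"unsupported input file type: {suffix or '(no extension)'}")
```

Rejecting unknown extensions early gives a clear error message before the SDK is initialized or any download is attempted.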

## Prerequisites
- Supports Windows and macOS.
- The conversion SDK must be installed first:
  ```bash
  pip install ComPDFKitConversion
  ```
- On first run, the script automatically downloads `license.xml` from the ComPDF server and caches it in the `scripts/` directory:
  ```text
  https://download.compdf.com/skills/license/license.xml
  ```
- The script reads the `<key>...</key>` field from `license.xml` and passes that key to `LibraryManager.license_verify(...)` for authentication; it does not pass the XML file path directly to the SDK.
- To use a custom license, place your own `license.xml` in the `scripts/` directory; the script will use it directly without downloading.
- During SDK initialization, the `resource` directory is always set to the directory containing `pdf-to-word-docx.py`, i.e., the `scripts/` directory itself.
- When `--enable-ocr` or `--enable-ai-layout` (enabled by default) is used, the Skill also requires `scripts/documentai.model`. If the file does not exist, the script will automatically download it from:
  ```text
  https://download.compdf.com/skills/model/documentai.model
  ```
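- To reuse an existing model file, override the default model path via an environment variable (the example below is for macOS shells; in Windows PowerShell, set `$env:COMPDF_DOCUMENT_AI_MODEL` instead):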
  ```bash
  export COMPDF_DOCUMENT_AI_MODEL="/path/to/documentai.model"
  ```
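
Reading the `<key>` field described above might look like the sketch below. It uses only the standard library. The exact layout of `license.xml` is not documented here, so the code searches the whole tree for the first non-empty `<key>` element; the function name `read_license_key` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_license_key(xml_text: str) -> str:
    """Return the text of the first non-empty <key> element in the XML.

    Assumption: the schema is unknown, so the whole tree is searched
    (root included) instead of relying on a fixed element path.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter("key"):
        key = (element.text or "").strip()
        if key:
            return key
    raise ValueError("license.xml contains no non-empty <key> element")


# Hypothetical sample; the real file layout may differ.
sample = "<license><key> ABC123 </key></license>"
license_key = read_license_key(sample)
```

The extracted string, rather than the file path, is what would then be handed to `LibraryManager.license_verify(...)`; the remaining arguments of that call are not shown in this document.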

## Workflow
1. Confirm the Python package is installed:
   ```bash
   python -m pip show ComPDFKitConversion
   ```
2. The script automatically downloads `license.xml` on first run; the `scripts/` directory is used directly as the SDK `resource` path.
3. In Agent Skills / Claude Code environments, prefer using the Skill's built-in script path variable:
   ```bash
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" ppt input.pdf output.pptx
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" excel input.pdf output.xlsx
   ```
4. For more control, append common parameters:
   ```bash
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" excel input.pdf output.xlsx --page-ranges "1-3,5" --excel-all-content --excel-worksheet-option for-page
   python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx --enable-ocr --page-layout-mode flow
   ```
5. On startup, the script ensures `scripts/license.xml` exists (downloading it automatically from the ComPDF server if missing), reads the `<key>` field for SDK authentication, and uses the `scripts/` directory as the `resource` path.
6. If `--enable-ocr` or `--enable-ai-layout` (enabled by default) is active, the script checks whether `scripts/documentai.model` exists; if not, it downloads the file automatically before initializing the Document AI model.
7. Check the return code; if it is not `SUCCESS`, handle license, password, resource, model, or input file issues according to the error name.
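
When the script is driven from another Python program, the command lines above can be assembled as an argument list, which avoids shell quoting issues. This is a sketch under assumptions: `build_command` and `run_conversion` are hypothetical helpers, the skill directory is passed in explicitly, and whether the script reports failure through its process exit status is not confirmed by this document.

```python
import subprocess
import sys
from pathlib import Path
from typing import Sequence


def build_command(skill_dir, target_format, source, output,
                  extra_args: Sequence[str] = ()) -> list:
    """Build the argv list for scripts/pdf-to-word-docx.py (hypothetical helper)."""
    script = Path(skill_dir) / "scripts" / "pdf-to-word-docx.py"
    return [sys.executable, str(script), target_format,
            str(source), str(output), *extra_args]


def run_conversion(skill_dir, target_format, source, output,
                   extra_args: Sequence[str] = ()) -> subprocess.CompletedProcess:
    """Run the script and return the completed process without raising.

    The caller inspects returncode, stdout, and stderr; how the script
    surfaces SDK error names there is an assumption to verify.
    """
    command = build_command(skill_dir, target_format, source, output, extra_args)
    return subprocess.run(command, capture_output=True, text=True, check=False)
```

For example, `build_command(skill_dir, "excel", "input.pdf", "output.xlsx", ["--page-ranges", "1-3,5"])` mirrors the first command in step 4.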

## documentai.model Download Optimization
- The script preferentially uses the model file pointed to by `COMPDF_DOCUMENT_AI_MODEL`.
- The default model path is `scripts/documentai.model`.
- During automatic download, the file is first written to `documentai.model.part` and atomically renamed to the final name only after the download succeeds, so an interrupted download never leaves a truncated model at the final path.
- On download failure, the script retries automatically with back-off intervals of `2s / 5s / 10s`.
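
The behaviour described above (environment override, `.part` file, atomic rename, retries) can be sketched as follows. This is an illustration rather than the script's actual code: the function names are hypothetical, one initial attempt followed by three retries is an assumption, and the response is read into memory for brevity where a real implementation would stream a large model file.

```python
import os
import time
import urllib.request
from pathlib import Path

MODEL_URL = "https://download.compdf.com/skills/model/documentai.model"
RETRY_DELAYS = (2, 5, 10)  # seconds between attempts, as listed above


def resolve_model_path(scripts_dir: Path) -> Path:
    """Prefer COMPDF_DOCUMENT_AI_MODEL, else scripts/documentai.model."""
    override = os.environ.get("COMPDF_DOCUMENT_AI_MODEL")
    return Path(override) if override else scripts_dir / "documentai.model"


def fetch_bytes(url: str) -> bytes:
    """Default fetcher; loads the whole response into memory (simplification)."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def download_atomically(url: str, dest: Path,
                        fetch=fetch_bytes, sleep=time.sleep) -> Path:
    """Write to '<name>.part', then rename onto dest only after success."""
    part = dest.with_name(dest.name + ".part")
    last_error = None
    # Assumption: one initial attempt plus one retry per listed delay.
    for attempt in range(len(RETRY_DELAYS) + 1):
        try:
            part.write_bytes(fetch(url))
            os.replace(part, dest)  # atomic when both paths share a filesystem
            return dest
        except OSError as error:  # urllib.error.URLError subclasses OSError
            last_error = error
            part.unlink(missing_ok=True)  # never leave a partial file behind
            if attempt < len(RETRY_DELAYS):
                sleep(RETRY_DELAYS[attempt])
    raise RuntimeError(f"download failed after retries: {url}") from last_error
```

Passing `fetch` and `sleep` as parameters keeps the retry logic testable without network access or real delays.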

## Invoking Directly as a Skill
- In environments that support Agent Skills, the Skill can be called directly:
  ```text
  /pdf-to-word-docx word input.pdf output.docx
  /pdf-to-word-docx excel input.pdf output.xlsx --excel-worksheet-option for-page
  ```
- When the Skill receives arguments, it passes them through to the script as-is:
  ```bash
  python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" $ARGUMENTS
  ```
- If the environment does not support direct Skill invocation, fall back to a regular command-line call.

## Supported Output Formats
- `word` → calls `CPDFConversion.start_pdf_to_word`
- `excel` → calls `CPDFConversion.start_pdf_to_excel`
- `ppt` → calls `CPDFConversion.start_pdf_to_ppt`
- `html` → calls `CPDFConversion.start_pdf_to_html`
- `rtf` → calls `CPDFConversion.start_pdf_to_rtf`
- `image` → calls `CPDFConversion.start_pdf_to_image`
- `txt` → calls `CPDFConversion.start_pdf_to_txt`
- `json` → calls `CPDFConversion.start_pdf_to_json`
- `markdown` → calls `CPDFConversion.start_pdf_to_markdown`
- `csv` → reuses `CPDFConversion.start_pdf_to_excel` with table/Excel parameters to produce CSV-friendly output
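
The mapping above lends itself to a lookup table. The sketch below only resolves a format name to the corresponding method name from the list; it does not import the SDK, and the helper name `converter_method_name` is hypothetical. The arguments that each `start_pdf_to_*` call expects are not covered in this document and are left out.

```python
# Format name -> CPDFConversion method name, taken from the list above.
CONVERTER_METHODS = {
    "word": "start_pdf_to_word",
    "excel": "start_pdf_to_excel",
    "ppt": "start_pdf_to_ppt",
    "html": "start_pdf_to_html",
    "rtf": "start_pdf_to_rtf",
    "image": "start_pdf_to_image",
    "txt": "start_pdf_to_txt",
    "json": "start_pdf_to_json",
    "markdown": "start_pdf_to_markdown",
    "csv": "start_pdf_to_excel",  # CSV reuses the Excel converter
}


def converter_method_name(target_format: str) -> str:
    """Return the method name for a target format (hypothetical helper)."""
    try:
        return CONVERTER_METHODS[target_format.lower()]
    except KeyError:
        supported = ", ".join(sorted(CONVERTER_METHODS))
        raise ValueError(
            f"unsupported format {target_format!r}; expected one of: {supported}"
        ) from None
```

With the SDK installed, the resolved name could be looked up with `getattr(CPDFConversion, name)`; the table keeps the CSV special case visible in one place.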

## Input Source Types
- The script supports **PDF and image** as input sources. The SDK's `start_pdf_to_*` interfaces natively accept image files with no pre-processing required.
- By default, the script auto-detects the input type from the file extension:
  - `.pdf` → `pdf`
  - `.png/.jpg/.jpeg/.bmp