
oma-hwp

The oma-hwp skill converts Korean HWP-family documents (.hwp, .hwpx, .hwpml) into Markdown or JSON format while preserving structure like headings, tables, lists, and images. Use this skill when preparing Korean government documents, enterprise files, or Hangul word processor content for language model context, retrieval-augmented generation pipelines, or document review workflows.

Install in Claude Code
git clone --depth 1 https://github.com/first-fluke/oh-my-agent /tmp/oma-hwp && cp -r /tmp/oma-hwp/benchmarks/runs/oma/.agents/skills/oma-hwp ~/.claude/skills/oma-hwp
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# HWP Skill - HWP / HWPX / HWPML to Markdown Conversion

## Scheduling

### Goal
Convert Korean HWP-family documents into readable Markdown or structured JSON while preserving document structure for LLM context, RAG, government-document review, or enterprise document processing.

### Intent signature
- User asks to convert, parse, read, extract, or transform `.hwp`, `.hwpx`, or `.hwpml`.
- User mentions Korean word processor files, Hangul documents, government forms, or "한글 파일".
- User needs headings, tables, nested tables, lists, images, footnotes, or hyperlinks extracted from HWP-family files.

### When to use
- Converting Korean HWP documents (`.hwp`, `.hwpx`, `.hwpml`) to Markdown
- Preparing Korean government/enterprise documents for LLM context or RAG
- Extracting structured content (tables, headings, lists, images) from HWP
- User says "convert this HWP", "parse hwpx", "HWP to markdown", "한글 파일"

### When NOT to use
- PDF files -> use `oma-pdf` (OCR + Tagged PDF specialization)
- XLSX / DOCX files -> currently out of scope (may be covered by a future `oma-docs`)
- Generating or editing HWP documents -> out of scope
- Plain-text files -> use the Read tool directly

### Expected inputs
- `input_path`: `.hwp`, `.hwpx`, or `.hwpml` file path
- `output_path` or `output_dir`: optional explicit output target
- `format`: optional output format, default `markdown`
- `page_range`: optional page or section range
- `kordoc_version`: optional pinned kordoc version

### Expected outputs
- Markdown output next to the input file or in the requested directory
- Optional JSON output when requested
- Post-processed Markdown with flattened GFM tables and stripped Private Use Area glyphs by default
- A short report with output path, detected source format, and conversion issues
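The default output location described above could be resolved along these lines. This is a minimal sketch, not the skill's actual logic: `resolveOutputPath`, `OutputOptions`, and the `.md` / `.json` file extensions are assumptions made for illustration, and only POSIX-style `/` separators are handled.

```typescript
// Hypothetical helper: these names do not come from kordoc or the skill itself.
type OutputFormat = "markdown" | "json";

interface OutputOptions {
  outputPath?: string; // explicit file target; wins over everything else
  outputDir?: string; // otherwise place the file in this directory
  format?: OutputFormat; // defaults to "markdown", matching the inputs list
}

function resolveOutputPath(inputPath: string, opts: OutputOptions = {}): string {
  if (opts.outputPath) return opts.outputPath;

  const format = opts.format ?? "markdown";
  const ext = format === "json" ? ".json" : ".md"; // assumed file extensions

  // Split "dir/name.hwp" into a directory and a base name.
  const slash = inputPath.lastIndexOf("/");
  const dir = slash >= 0 ? inputPath.slice(0, slash) : ".";
  const file = inputPath.slice(slash + 1);
  const dot = file.lastIndexOf(".");
  const stem = dot > 0 ? file.slice(0, dot) : file;

  // Default to the input's own directory ("next to the input file").
  const targetDir = (opts.outputDir ?? dir).replace(/\/+$/, "");
  return `${targetDir}/${stem}${ext}`;
}
```

With no options, `resolveOutputPath("docs/report.hwp")` yields `docs/report.md`; passing `outputDir` moves the file while keeping the base name.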

### Dependencies
- `bun` and `bunx`
- `bunx kordoc@latest` or configured pinned kordoc version
- `resources/flatten-tables.ts` for Markdown cleanup
- Local filesystem access to input and output paths

### Control-flow features
- Branches by file extension, output target, format, page range, encryption/DRM state, and post-processing requirements
- Calls external CLI tools through `bunx` and `bun run`
- Reads local HWP-family files and writes local Markdown or JSON output
- Routes non-HWP inputs to other skills instead of stretching this skill's scope

## Structural Flow

### Entry
1. Confirm the input path exists.
2. Confirm the extension is `.hwp`, `.hwpx`, or `.hwpml`.
3. Resolve output path or directory and default filename.
4. Check that `bun` is available.

### Scenes
1. **PREPARE**: Validate path, extension, size, output target, and requested format.
2. **ACQUIRE**: Detect source format and runtime availability.
3. **ACT**: Run `kordoc` with explicit output target and requested options.
4. **VERIFY**: Post-process Markdown and inspect structure for headings, tables, lists, images, and footnotes.
5. **FINALIZE**: Report output path, source format, and any conversion limitations.

### Transitions
- If the input is `.pdf`, stop and route to `oma-pdf`.
- If the input is `.xlsx` or `.docx`, explain that those formats are out of scope for this skill.
- If `bun` is unavailable, stop and ask the user to install Bun.
- If Markdown is produced, run `resources/flatten-tables.ts` unless the caller explicitly needs HTML tables or PUA glyphs preserved.
- If output is empty or garbled, consult `resources/troubleshooting.md`.
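The extension-based transitions above can be sketched as a small routing function. The `Route` type and `routeByExtension` name are hypothetical; only the skill names and extensions are taken from this document.

```typescript
// Hypothetical routing sketch; the Route shape is an assumption for illustration.
const HWP_FORMATS = ["hwp", "hwpx", "hwpml"] as const;
type HwpFormat = (typeof HWP_FORMATS)[number];

type Route =
  | { kind: "convert"; sourceFormat: HwpFormat }
  | { kind: "handoff"; skill: "oma-pdf" }
  | { kind: "out_of_scope"; reason: string };

function routeByExtension(inputPath: string): Route {
  const dot = inputPath.lastIndexOf(".");
  const ext = dot >= 0 ? inputPath.slice(dot + 1).toLowerCase() : "";

  // HWP-family inputs stay in this skill.
  const match = HWP_FORMATS.find((f) => f === ext);
  if (match) return { kind: "convert", sourceFormat: match };

  // PDFs are routed to the dedicated skill.
  if (ext === "pdf") return { kind: "handoff", skill: "oma-pdf" };

  // Everything else is declined rather than attempted.
  if (ext === "xlsx" || ext === "docx") {
    return { kind: "out_of_scope", reason: `.${ext} is not handled by oma-hwp` };
  }
  return { kind: "out_of_scope", reason: `unrecognized extension: ${ext || "(none)"}` };
}
```

Returning a tagged union keeps the "route or stop" decision explicit, so the caller cannot accidentally run the converter on a declined input.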

### Failure and recovery
| Failure | Recovery |
|---------|----------|
| `bun` or `bunx` unavailable | Ask user to install Bun |
| Unsupported or mismatched format | Check extension and magic bytes, then route or stop |
| Encrypted or DRM-locked document | Report limitation and request an accessible copy when needed |
| Empty Markdown output | Treat as possible scanned-image content and recommend OCR outside this skill |
| Complex merged tables | Accept flattened Markdown or HTML fallback as best effort |
| Stale kordoc cache | Use `bunx kordoc@latest` or configured pinned version |
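For the "check extension and magic bytes" recovery, one possible check is sketched below. It assumes that binary `.hwp` (5.x) files are OLE compound files, that `.hwpx` files are ZIP packages, and that `.hwpml` files are XML text; the function names are hypothetical, and how kordoc itself detects formats is not stated here. Because OLE and ZIP are generic containers, a match is a necessary condition and not proof that the file is an HWP document.

```typescript
// Hypothetical sniffing sketch; not taken from kordoc.
type SniffedFormat = "hwp-ole" | "zip-container" | "xml" | "unknown";

function startsWith(bytes: Uint8Array, prefix: number[]): boolean {
  if (bytes.length < prefix.length) return false;
  return prefix.every((b, i) => bytes[i] === b);
}

function sniffFormat(head: Uint8Array): SniffedFormat {
  // OLE compound file signature (assumed container for binary HWP 5.x).
  if (startsWith(head, [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1])) return "hwp-ole";
  // ZIP local file header "PK\x03\x04" (assumed container for HWPX).
  if (startsWith(head, [0x50, 0x4b, 0x03, 0x04])) return "zip-container";
  // XML text starts with '<', optionally after a UTF-8 byte order mark.
  const offset = startsWith(head, [0xef, 0xbb, 0xbf]) ? 3 : 0;
  if (head[offset] === 0x3c) return "xml";
  return "unknown";
}

function extensionMatches(ext: string, sniffed: SniffedFormat): boolean {
  const expected: Record<string, SniffedFormat> = {
    hwp: "hwp-ole",
    hwpx: "zip-container",
    hwpml: "xml",
  };
  return expected[ext.toLowerCase()] === sniffed;
}
```

A mismatch, such as a `.hwp` file that sniffs as a ZIP container, is the case the table calls "mismatched format" and is worth reporting before conversion is attempted.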

### Exit
- Success: output file exists and structure is readable after post-processing.
- Partial success: output exists with explicitly reported table, glyph, encryption, or fidelity limitations.
- Failure: no reliable output is produced and the blocking cause is reported.
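The three exit states above can be expressed as a small classifier. This is a sketch under assumptions: the `ConversionResult` fields are hypothetical, and treating empty output as a failure is one interpretation of "no reliable output".

```typescript
// Hypothetical result shape; the field names are assumptions for illustration.
type ExitStatus = "success" | "partial" | "failure";

interface ConversionResult {
  outputExists: boolean;
  outputLength: number; // characters in the post-processed output
  limitations: string[]; // e.g. "merged table flattened"
  blockingCause?: string; // e.g. "document is DRM-locked"
}

interface ExitReport {
  status: ExitStatus;
  summary: string;
}

function classifyExit(result: ConversionResult): ExitReport {
  // Assumption: missing or empty output counts as "no reliable output".
  if (!result.outputExists || result.outputLength === 0) {
    const cause = result.blockingCause ?? "no reliable output was produced";
    return { status: "failure", summary: `Conversion failed: ${cause}` };
  }
  // Output exists, but fidelity limitations must be reported explicitly.
  if (result.limitations.length > 0) {
    return {
      status: "partial",
      summary: `Converted with limitations: ${result.limitations.join("; ")}`,
    };
  }
  return { status: "success", summary: "Converted; structure readable after post-processing" };
}
```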

## Logical Operations

### Actions
| Action | SSL primitive | Evidence |
|--------|---------------|----------|
| Validate file path and extension | `VALIDATE` | Input preflight in execution protocol |
| Check runtime availability | `VALIDATE` | `bun --version` |
| Select output target and format | `SELECT` | Output behavior and config |
| Run converter | `CALL_TOOL` | `bunx kordoc@latest` |
| Write output artifact | `WRITE` | Markdown or JSON output |
| Flatten tables and strip PUA glyphs | `CALL_TOOL` | `resources/flatten-tables.ts` |
| Inspect extraction quality | `VALIDATE` | Verification step |
| Report result | `NOTIFY` | Final user-facing summary |

### Tools and instruments
- `kordoc`: primary HWP-family conversion CLI
- `flatten-tables.ts`: post-processing for GFM tables and Hancom PUA cleanup
- `bun` / `bunx`: runtime and CLI executor
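To illustrate the PUA cleanup step, the sketch below removes code points in the Basic Multilingual Plane's Private Use Area (U+E000 to U+F8FF). It is not the contents of `flatten-tables.ts`, which are not shown here; `stripPuaGlyphs` is a hypothetical name, and the supplementary private-use planes are deliberately left untouched.

```typescript
// Hypothetical sketch of PUA stripping; not the actual flatten-tables.ts logic.
// BMP Private Use Area: U+E000 through U+F8FF.
const BMP_PUA = /[\uE000-\uF8FF]/g;

function stripPuaGlyphs(markdown: string): { text: string; removed: number } {
  let removed = 0;
  const text = markdown.replace(BMP_PUA, () => {
    removed += 1; // count removals so the final report can mention them
    return "";
  });
  return { text, removed };
}
```

Returning the removal count alongside the cleaned text makes it straightforward to surface glyph loss as a reported limitation.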

### Canonical command path
```bash
bunx kordoc@latest "{input_path}" -o "{output_path}"
bun run ".agents/skills/oma-hwp/resources/flatten-tables.ts" "{output_path}"
```

For batch conversion, use an explicit output directory:
```bash
bunx kordoc@latest "{input_pattern}" -d "{output_dir}"
```
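The two command shapes above can be assembled programmatically as argument arrays, which avoids shell quoting problems with paths that contain spaces or Hangul. This sketch uses only the `-o` and `-d` flags shown in this document; `buildCommands` and its option names are hypothetical, and skipping post-processing in batch mode is an assumption, since only the single-file post-processing command is given.

```typescript
// Hypothetical command builder; only the -o and -d flags come from the document.
interface CommandPlan {
  convert: string[];
  postProcess: string[] | null;
}

interface BuildOptions {
  input: string; // a single file path, or a pattern in batch mode
  outputPath?: string; // single-file target, passed with -o
  outputDir?: string; // batch target, passed with -d
  kordocVersion?: string; // pinned version; defaults to "latest"
  flatten?: boolean; // set to false to keep HTML tables or PUA glyphs
}

const FLATTEN_SCRIPT = ".agents/skills/oma-hwp/resources/flatten-tables.ts";

function buildCommands(opts: BuildOptions): CommandPlan {
  const pkg = `kordoc@${opts.kordocVersion ?? "latest"}`;

  if (opts.outputDir) {
    // Batch mode. Assumption: no per-file post-processing is planned here.
    return { convert: ["bunx", pkg, opts.input, "-d", opts.outputDir], postProcess: null };
  }
  if (!opts.outputPath) {
    throw new Error("either outputPath or outputDir is required");
  }
  const postProcess =
    opts.flatten === false ? null : ["bun", "run", FLATTEN_SCRIPT, opts.outputPath];
  return { convert: ["bunx", pkg, opts.input, "-o", opts.outputPath], postProcess };
}
```

Each array can be handed to a process spawner as-is, for example with the first element as the executable and the rest as arguments.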

### Resource scope
| Scope | Resource target |
|-------|-----------------|
| `LOCAL_FS` | Input HWP-family files and generated outputs |
| `PROCESS` | `bunx kordoc` and `bun run` subprocesses |
| `MEMORY` | Format decisions, validation notes, and final report |

### Preconditions
- Input file exists and is readable.
- Output location is writable or can be created.
- `bun` is installed.
- `kordoc` can parse the document or fail with a reportable error.

### Effects and side effects
- Creates Markdown or JSON output files.
- May flatten merg