docx
The docx skill creates, reads, edits, and manipulates Word documents (.docx files). It uses pandoc for text extraction, the docx-js JavaScript library for generating new documents with formatting such as tables and headers, and XML unpacking for advanced edits. Use this skill when users request Word documents, reports, memos, letters, or templates with professional formatting, or when extracting, reorganizing, or converting content within .docx files.
Install: `git clone --depth 1 https://github.com/melandlabs/openloomi /tmp/docx && cp -r /tmp/docx/skills/docx ~/.claude/skills/docx`
# DOCX creation, editing, and analysis
## Overview
A .docx file is a ZIP archive containing XML files.
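Because the container is a ZIP archive, a quick sanity check is to look for the ZIP local-file-header signature (`PK\x03\x04`) at the start of the file. The following is a minimal sketch; `looksLikeZip` is a hypothetical helper name, and the check only confirms the container type, not that the XML inside is a valid Word document.

```javascript
// ZIP archives typically begin with the local file header signature "PK\x03\x04".
const ZIP_SIGNATURE = Buffer.from([0x50, 0x4b, 0x03, 0x04]);

// Hypothetical helper: returns true when the buffer starts with the ZIP signature.
function looksLikeZip(buffer) {
  return buffer.length >= 4 && buffer.subarray(0, 4).equals(ZIP_SIGNATURE);
}

// Usage (assumes a file named document.docx exists in the working directory):
// const fs = require("fs");
// console.log(looksLikeZip(fs.readFileSync("document.docx")));
```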
## Quick Reference
| Task | Approach |
| ---------------------- | ----------------------------------------------------------------- |
| Read/analyze content | `pandoc` or unpack for raw XML |
| Create new document | Use `docx-js` - see Creating New Documents below |
| Edit existing document | Unpack → edit XML → repack - see Editing Existing Documents below |
### Converting .doc to .docx
Legacy `.doc` files are **auto-converted** on read and write — no manual step needed:
```bash
# Reading .doc (auto-converted to .docx internally)
python scripts/office/unpack.py legacy.doc unpacked/
# Writing .doc (auto-converted from .docx)
python scripts/office/pack.py unpacked/ output.doc
```
LibreOffice is used for the conversion, with automatic sandbox-friendly socket handling.
### Reading Content
```bash
# Text extraction with tracked changes
pandoc --track-changes=all document.docx -o output.md
# Raw XML access
python scripts/office/unpack.py document.docx unpacked/
```
### Converting to Images
```bash
python scripts/office/soffice.py --headless --convert-to pdf document.docx
pdftoppm -jpeg -r 150 document.pdf page
```
### Accepting Tracked Changes
To produce a clean document with all tracked changes accepted (requires LibreOffice):
```bash
python scripts/accept_changes.py input.docx output.docx
```
---
## Creating New Documents
Generate .docx files with JavaScript, then validate. Install: `npm install -g docx`
### Setup
```javascript
const fs = require("fs"); // needed for writeFileSync below
const {
  Document,
  Packer,
  Paragraph,
  TextRun,
  Table,
  TableRow,
  TableCell,
  ImageRun,
  Header,
  Footer,
  AlignmentType,
  PageOrientation,
  LevelFormat,
  ExternalHyperlink,
  TableOfContents,
  HeadingLevel,
  BorderStyle,
  WidthType,
  ShadingType,
  VerticalAlign,
  PageNumber,
  PageBreak,
} = require("docx");

// Build a document with a single section; add content to `children`
const doc = new Document({
  sections: [
    {
      children: [
        /* content */
      ],
    },
  ],
});

// Serialize the document to a buffer and write it to disk
Packer.toBuffer(doc).then((buffer) => fs.writeFileSync("doc.docx", buffer));
```
### Validation
After creating the file, validate it. If validation fails, unpack, fix the XML, and repack.
```bash
python scripts/office/validate.py doc.docx
```
### Page Size
```javascript
// CRITICAL: docx-js defaults to A4, not US Letter
// Always set page size explicitly for consistent results
// Fragment: this property belongs inside new Document({ ... })
sections: [
  {
    properties: {
      page: {
        size: {
          width: 12240, // 8.5 inches in DXA
          height: 15840, // 11 inches in DXA
        },
        margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 }, // 1 inch margins
      },
    },
    children: [
      /* content */
    ],
  },
],
```
**Common page sizes (DXA units, 1440 DXA = 1 inch):**
| Paper | Width | Height | Content Width (1" margins) |
| ------------ | ------ | ------ | -------------------------- |
| US Letter | 12,240 | 15,840 | 9,360 |
| A4 (default) | 11,906 | 16,838 | 9,026 |
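The figures in the table follow directly from the 1440 DXA per inch conversion. A small sketch of that arithmetic, where `inchesToDxa` and `contentWidthDxa` are hypothetical helper names rather than docx-js functions:

```javascript
const DXA_PER_INCH = 1440;

// Convert inches to DXA (twentieths of a point), rounded to a whole unit.
function inchesToDxa(inches) {
  return Math.round(inches * DXA_PER_INCH);
}

// Usable width between the left and right margins, all values in DXA.
function contentWidthDxa(pageWidthDxa, leftMarginDxa, rightMarginDxa) {
  return pageWidthDxa - leftMarginDxa - rightMarginDxa;
}

const oneInch = inchesToDxa(1); // 1440
const letterWidth = inchesToDxa(8.5); // 12240

console.log(contentWidthDxa(letterWidth, oneInch, oneInch)); // 9360 (US Letter)
console.log(contentWidthDxa(11906, oneInch, oneInch)); // 9026 (A4)
```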
**Landscape orientation:** docx-js swaps width/height internally, so pass portrait dimensions and let it handle the swap:
```javascript
size: {
  width: 12240, // Pass SHORT edge as width
  height: 15840, // Pass LONG edge as height
  orientation: PageOrientation.LANDSCAPE // docx-js swaps them in the XML
},
// Content width = 15840 - left margin - right margin (uses the long edge)
```
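The landscape content-width rule in the comment above can be written as a small calculation. This is an illustrative sketch only; `landscapeContentWidth` is a hypothetical helper, and it takes the page size in the same portrait terms that are passed to docx-js.

```javascript
// Hypothetical helper: content width for a landscape page, in DXA.
// `size` uses portrait dimensions; the long edge runs horizontally in landscape.
function landscapeContentWidth(size, margin) {
  const longEdge = Math.max(size.width, size.height);
  return longEdge - margin.left - margin.right;
}

// US Letter with 1 inch side margins: 15840 - 1440 - 1440
const width = landscapeContentWidth(
  { width: 12240, height: 15840 },
  { left: 1440, right: 1440 }
);
console.log(width); // 12960
```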
### Styles (Override Built-in Headings)
Use Arial as the default font (universally supported). Keep titles black for readability.
```javascript
const doc = new Document({
  styles: {
    default: { document: { run: { font: "Arial", size: 24 } } }, // 12pt default
    paragraphStyles: [
      // IMPORTANT: Use exact IDs to override built-in styles
      {
        id: "Heading1",
        name: "Heading 1",
        basedOn: "Normal",
        next: "Normal",
        quickFormat: true,
        run: { size: 32, bold: true, font: "Arial" },
        paragraph: { spacing: { before: 240, after: 240 }, outlineLevel: 0 },
      }, // outlineLevel required for TOC
      {
        id: "Heading2",
        name: "Heading 2",
        basedOn: "Normal",
        next: "Normal",
        quickFormat: true,
        run: { size: 28, bold: true, font: "Arial" },
        paragraph: { spacing: { before: 180, after: 180 }, outlineLevel: 1 },
      },
    ],
  },
  sections: [
    {
      children: [
        new Paragraph({
          heading: HeadingLevel.HEADING_1,
          children: [new TextRun("Title")],
        }),
      ],
    },
  ],
});
```
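The heading entries above repeat the same shape with different numbers, so they can be generated. The sketch below assumes `size` is in half-points (as the `size: 24` / 12pt comment indicates) and that `spacing` values are in twentieths of a point; `headingStyle` is a hypothetical helper, not part of docx-js.

```javascript
// Hypothetical helper: build a paragraph style object that overrides a built-in heading.
// level: 1-based heading level; sizePt and spacingPt are given in points.
function headingStyle(level, sizePt, spacingPt) {
  return {
    id: `Heading${level}`, // exact ID needed to override the built-in style
    name: `Heading ${level}`,
    basedOn: "Normal",
    next: "Normal",
    quickFormat: true,
    run: { size: sizePt * 2, bold: true, font: "Arial" }, // half-points
    paragraph: {
      spacing: { before: spacingPt * 20, after: spacingPt * 20 }, // twentieths of a point
      outlineLevel: level - 1, // required for TOC
    },
  };
}

// Reproduces the two styles shown above: 16pt/12pt spacing and 14pt/9pt spacing.
const paragraphStyles = [headingStyle(1, 16, 12), headingStyle(2, 14, 9)];
console.log(paragraphStyles[0].run.size); // 32
```

The resulting array can be passed as `styles.paragraphStyles` when constructing the `Document`.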
### Lists (NEVER use unicode bullets)
```javascript
// ❌ WRONG - never manually insert bullet characters
new Paragraph({ children: [new TextRun("• Item")] }); // BAD
new Paragraph({ children: [new TextRun("\u2022 Item")] }); // BAD

// ✅ CORRECT - use numbering config with LevelFormat.BULLET
const doc = new Document({
  numbering: {
    config: [
      {
        reference: "bullets",
        levels: [
          {
            level: 0,
            format: LevelFormat.BULLET,
            text: "•",
            alignment: AlignmentType.LEFT,
            style: { paragraph: { indent: { left: 720, hanging: 360 } } },
          },
        ],
      },
      {
        reference: "numbers",
        levels: [
          {
            level: 0,
            format: LevelFormat.DECIMAL,
            text: "%1.",
            alignment: AlignmentType.LEFT,
            style: { paragraph: { indent: { left: 720, hanging: 360 } } },
          },
        ],
      },
    ],
  },
  sections: [
    {
      children: [
        new Paragraph({
          numbering: { reference: "bullets", level: 0 },
          children: [new TextRun("Bullet item")],
        }),
        new Paragraph({
          numbering: { reference: "numbers", level: 0 },
          children: [new TextRun("Numbered item")],
        }),
      ],
    },
  ],
});
```