Skip to main content
ClaudeWave
Skill313 repo starsupdated 2d ago

personal-knowledge-capture

personal-knowledge-capture maintains a local-first system for converting user-approved documents and URLs into cited Markdown notes. Register folders explicitly, scan them on demand to detect new or modified files, extract text from Markdown, TXT, PDF, and DOCX sources, and generate daily summaries grounded in source references stored in SQLite. Use this skill when a user wants to build a personal knowledge base from local research folders or specific web pages without uploading content to external services.

Install in Claude Code
Copy
git clone --depth 1 https://github.com/Prompthon-IO/agent-systems-handbook /tmp/personal-knowledge-capture && cp -r /tmp/personal-knowledge-capture/skills/personal-knowledge-capture ~/.claude/skills/personal-knowledge-capture
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Personal Knowledge Capture

Use this skill to maintain local-first personal knowledge notes from user-approved sources. Register folders only when the user names them, scan registered folders on demand, extract text from supported file types, and write cited Markdown notes.

For the human-facing overview, read `README.md`. For supported file type details, read `references/supported-file-types.md` when extraction behavior matters.

## Safety Rules

- Scan only folders the user explicitly names or previously registered with `add-watch`.
- Do not upload local files to external services unless the user explicitly asks and credentials are available.
- Preserve source files exactly; never move, rewrite, delete, or rename them.
- Keep runtime databases, scan artifacts, and generated notes outside git unless the user explicitly asks for sample artifacts.
- Ground generated notes in source references. Do not present uncited claims as coming from the captured files.
- Treat `capture-url` as opt-in only: fetch only URLs the user explicitly provides.

## Default Workflow

1. Register a watch path with `scripts/personal_knowledge_capture.py add-watch --path <folder>`.
2. Run `scripts/personal_knowledge_capture.py scan` when the user wants a detection preview only; this does not mark files as captured.
3. Run `scripts/personal_knowledge_capture.py summarize` when the user wants a dated Markdown note; this command performs its own scan.
4. Review the generated note and refine summaries with Codex when deeper synthesis is needed.
5. Report source paths, skipped files, and the note path to the user.

## Commands

Resolve `scripts/personal_knowledge_capture.py` relative to this skill directory. When running from an installed Codex copy, that is usually:

```bash
python3 scripts/personal_knowledge_capture.py add-watch --path "$HOME/research" --tags "ai,research"
```

Scan registered watch paths:

```bash
python3 scripts/personal_knowledge_capture.py scan
```

Scan and write today's Markdown summary:

```bash
python3 scripts/personal_knowledge_capture.py summarize
```

Capture an explicitly provided URL:

```bash
python3 scripts/personal_knowledge_capture.py capture-url --url "https://example.com/article"
```

Use `--state-dir <path>` only when a user asks to place runtime state somewhere specific.

## Persistence

The helper stores local runtime artifacts under:

```text
~/.codex/state/personal-knowledge-capture/
  knowledge_capture.db
  runs/
  knowledge-notes/
```

SQLite tables:

```sql
watch_paths(id, path, tags, created_at, last_scanned_at)
documents(id, path_or_url, title, hash, source_mtime, captured_at, summary_path)
```

`source_mtime` is an implementation addition beyond the base issue schema. It stores
the source file's last-modified timestamp so the scanner can skip files whose
content hash and mtime are both unchanged, reducing redundant re-extraction.

## Markdown Output

Daily summaries are written to:

```text
knowledge-notes/YYYY-MM-DD-summary.md
```

The generated note must keep this structure:

```md
# Summary

## New Files

## Key Insights

## Actionable Notes

## Open Questions

## Source References
```

If Codex rewrites or enriches the summary, preserve the section headings and keep every source-grounded point tied to `Source References`.

## Supported Sources

- Markdown: `.md`, `.markdown`
- Plain text: `.txt`
- DOCX: basic built-in text extraction from `word/document.xml`
- PDF: supported when the optional `pypdf` dependency is installed
- URLs: supported only through explicit `capture-url --url <url>`

For unsupported or failed extraction, report the skipped source and reason instead of silently dropping it.

## Response Pattern

When reporting results to the user, include:

- registered or scanned paths
- number of new, modified, and skipped sources
- generated note path
- skipped files and extraction reasons
- source-reference expectations for any summary text
customer-email-assistSkill

用 connector-first、最少 token 的方式审阅 Gmail 客户支持线程。适用于 Codex 需要通过 Codex Gmail connector 读取 Gmail、把清洗后的消息导入本地 SQLite 问题队列、先执行确定性的清洗和分类、只把模型调用保留给 JSON-only 的消息理解和草稿字段生成,并支持 dashboard 审阅、客户审批与排队回复处理的时候。

local-customer-email-replySkill

审阅客户来信,以本地政策或 FAQ 文档为依据起草安全的投诉和咨询回复。

agent-runtime-cache-benchmarkSkill

Compare two structured agent-run artifacts to estimate cache efficiency, explain likely cache breaks, and produce a local benchmark report. Use when a user wants to understand whether a prompt layout, tool manifest, or history shape is helping or hurting prompt-cache reuse.

daily-news-watcherSkill

Persistent daily news monitoring backed by local SQLite and Markdown reports. Use when a user asks Codex to track named publications, fetch the last N hours of news, summarize recent articles by topic, deduplicate articles across runs, or maintain a personal newsroom that survives across sessions.

garbage-collectorSkill

Scan local cleanup targets, apply readable cleanup rules, produce a preview report, and execute approved cleanup actions with logs. Use when a user asks Codex to clean up their computer, empty old Trash items, find duplicated Downloads files, review local storage clutter, or propose safe file cleanup actions before making changes.

local-document-organizerSkill

Preview-first local file organizer. Scan a user-named folder, classify files into category subfolders using readable rules, write a preview Markdown report and JSON plan, execute confirmed moves with persistent SQLite state, and reverse moves with undo. Use when a user asks Codex to organize Downloads, sort a messy folder into Invoices/Receipts/School/Images/Software/PDFs subfolders, propose a folder structure before moving anything, or undo a previous organization run.

price-watcherSkill

Persistent product price tracking for natural-language product requests. Use when a user asks to watch, track, monitor, compare, or report prices for a product, especially with a target price or threshold such as "Watch MacBook Pro M3 14-inch and tell me if it drops below $1200." Supports source discovery, Playwright/browser product checks, SQLite history, threshold comparison, and Markdown price reports.

prompt-cache-agent-harnessSkill

Plan and inspect prompt-cache behavior for long-running Claude agent loops. Use when a user wants to split stable tool, system, and history context into cacheable layers, compare captured cache metadata, estimate cost impact from supplied pricing inputs, or keep durable memory outside the cached prefix.