daily-news-watcher
Daily News Watcher maintains a persistent SQLite database of news sources and generates deduplicated Markdown summaries of recent articles from RSS feeds and web pages. Use it when tracking specific publications over time, fetching articles from the last N hours, organizing news by topic, or maintaining a searchable archive that persists across sessions.
git clone --depth 1 https://github.com/Prompthon-IO/agent-systems-handbook /tmp/daily-news-watcher && cp -r /tmp/daily-news-watcher/skills/daily-news-watcher ~/.claude/skills/daily-news-watcher
SKILL.md
# Daily News Watcher
For a student-facing explanation of why this package exists and how the
end-to-end workflow fits into the handbook, read `README.md` first. This file
is the invocation contract for Codex.
## Overview
Use this skill to operate a persistent personal news watcher. The skill keeps
a list of named sources in SQLite, fetches recent articles from RSS/Atom feeds
(with an optional Playwright fallback for pages without usable feeds),
deduplicates results by canonical URL and content hash, summarizes them, and
writes a Markdown report per run.

The skill has two distinct workflows: **add sources** and **fetch and
summarize**. Treat them as separate user intents and surface them as separate
commands.
## Safety Rules
- Only register and fetch public sources. Refuse `file://`, internal hostnames,
and anything that requires auth headers, cookies, or login flows.
- Never bypass paywalls, captchas, or other access controls.
- Never store browsing credentials, session cookies, or auth tokens.
- One failing source must not abort the whole run. Record the failure on the
`runs` row and continue with the next source.
- The runtime database, logs, and generated reports live under
`~/.codex/state/daily-news-watcher/` and stay out of git unless the user
explicitly asks to commit example artifacts.
- Honor the readable rules in `references/fetch-rules.md`.
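
The first safety rule can be sketched as a small URL check. This is a minimal illustration, not the script's actual validation: the helper name `is_public_http_url`, the blocked host list, and the suffix list are all assumptions, and the check does not resolve DNS, so a public-looking name that points at a private address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical block lists; the real rules live in references/fetch-rules.md.
BLOCKED_HOSTS = {"localhost"}
BLOCKED_SUFFIXES = (".local", ".internal", ".localhost")


def is_public_http_url(url: str) -> bool:
    """Return True only for http(s) URLs that do not obviously point inward."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ("http", "https"):
        return False  # rejects file://, ftp://, and similar schemes
    if parts.username or parts.password:
        return False  # embedded credentials imply an auth flow
    host = (parts.hostname or "").lower()
    if not host or host in BLOCKED_HOSTS or host.endswith(BLOCKED_SUFFIXES):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: require a dotted name so bare intranet
        # hostnames such as "wiki" are refused.
        return "." in host
    return ip.is_global  # refuses loopback, private, and link-local ranges


if __name__ == "__main__":
    for candidate in ("https://example.com/feed.xml", "file:///etc/passwd"):
        print(candidate, is_public_http_url(candidate))
```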
## Workflow A - Add Sources
Trigger when the user names publications to track ("add BBC AI, The Verge AI,
and OpenAI Blog to my daily news watcher").
1. For each name, resolve to a URL using `references/known-sources.csv`. If
the publication is not known, ask the user for an explicit `--url`.
2. Validate that the URL is a public `http(s)` URL.
3. Probe reachability with a short HTTP GET. Surface unreachable sources as a
warning but still allow registration if the user insists.
4. Insert the source into `sources` with tags. Existing rows with the same
name are updated rather than duplicated.
5. Echo the resolved URL, type, and tags so the user can confirm.
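
Steps 1 and 4 above can be sketched as follows. This is an illustration under stated assumptions: the CSV column layout (`name,url,type,tags`), the column types, and the helper names are hypothetical, and the real `references/known-sources.csv` and script may differ.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

# Assumed layout of references/known-sources.csv (inlined here for the demo).
KNOWN_SOURCES_CSV = """name,url,type,tags
Example AI Blog,https://example.com/feed.xml,rss,AI;research
"""


def resolve_known_source(name, csv_text=KNOWN_SOURCES_CSV):
    """Case-insensitive lookup by publication name; returns a row dict or None."""
    wanted = name.strip().lower()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["name"].strip().lower() == wanted:
            return row
    return None


def open_sources_db(path=":memory:"):
    """Open a database with a `sources` table (column types are assumptions)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sources ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, url TEXT NOT NULL, "
        "type TEXT, tags TEXT, created_at TEXT, last_checked_at TEXT)"
    )
    return conn


def upsert_source(conn, name, url, type_, tags):
    """Update the row with the same name if it exists, otherwise insert one."""
    cur = conn.execute(
        "UPDATE sources SET url = ?, type = ?, tags = ? WHERE name = ?",
        (url, type_, tags, name),
    )
    if cur.rowcount == 0:
        created = datetime.now(timezone.utc).isoformat()
        conn.execute(
            "INSERT INTO sources (name, url, type, tags, created_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (name, url, type_, tags, created),
        )
    conn.commit()
    return conn.execute(
        "SELECT id FROM sources WHERE name = ?", (name,)
    ).fetchone()[0]


if __name__ == "__main__":
    conn = open_sources_db()
    known = resolve_known_source("example ai blog")
    source_id = upsert_source(
        conn, known["name"], known["url"], known["type"], known["tags"]
    )
    print(source_id, known["url"], known["type"], known["tags"])
```

The update-then-insert pattern keeps one row per name without relying on a unique index on `name`, which the schema listing does not promise.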
## Workflow B - Fetch And Summarize
Trigger when the user asks for a daily digest ("fetch the last 24 hours of AI
news").
1. Read all rows from `sources`.
2. For each source, fetch via RSS/Atom first. If the response is not a feed
and `--use-playwright` is set, fall back to a Playwright render.
3. Normalize each article (canonical URL, stripped HTML summary, parsed
`published_at`) and skip duplicates by URL or content hash.
4. Apply `--hours` and `--topic` filters.
5. Insert kept articles into `articles` and stamp the source with
`last_checked_at`.
6. Write a Markdown report to
`~/.codex/state/daily-news-watcher/reports/daily-news/YYYY-MM-DD-<topic>.md`
that lists sources checked, articles included, summaries, links, and any
skipped or error notes.
7. Update the `runs` row with `finished_at` and a status of `ok`, `partial`,
`all_sources_failed`, or `no_sources`.
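
The normalization, filtering, and status steps above might look roughly like this. Everything here is a sketch: the tracking-parameter prefix, the hash inputs, the whole-word topic match, and the mapping from failure counts to status values are assumptions rather than the script's documented behavior.

```python
import hashlib
import re
from datetime import datetime, timedelta, timezone
from html import unescape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)  # assumed list of query parameters to drop


def canonical_url(url):
    """Lowercase scheme and host, drop tracking params, fragment, trailing slash."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def strip_html(text):
    """Remove tags, decode entities, and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", unescape(no_tags)).strip()


def content_hash(title, summary):
    """Hash of normalized title plus summary (the hashed fields are an assumption)."""
    payload = f"{title.strip().lower()}\n{summary.strip().lower()}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def keep_article(published_at, title, summary, *, hours, topic, now=None):
    """Apply the --hours window and a whole-word, case-insensitive --topic match."""
    now = now or datetime.now(timezone.utc)
    if published_at < now - timedelta(hours=hours):
        return False
    if topic:
        pattern = rf"\b{re.escape(topic)}\b"
        if not re.search(pattern, f"{title} {summary}", re.IGNORECASE):
            return False
    return True


def run_status(sources_checked, sources_failed):
    """Map failure counts to the four status values (the mapping is assumed)."""
    if sources_checked == 0:
        return "no_sources"
    if sources_failed == 0:
        return "ok"
    if sources_failed == sources_checked:
        return "all_sources_failed"
    return "partial"


if __name__ == "__main__":
    print(canonical_url("https://Example.com/post/?utm_source=feed&id=7#top"))
    print(run_status(sources_checked=3, sources_failed=1))
```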
## Commands
Resolve `scripts/daily_news_watcher.py` relative to this skill directory. When
running from an installed Codex copy, that is usually
`~/.codex/skills/daily-news-watcher/scripts/daily_news_watcher.py`.

Add a known publication:
```bash
python3 scripts/daily_news_watcher.py add-source --name "BBC AI"
```
Add a custom source by URL:
```bash
python3 scripts/daily_news_watcher.py add-source \
  --name "Example AI Blog" \
  --url "https://example.com/feed.xml" \
  --tags "AI;research"
```
List or remove sources:
```bash
python3 scripts/daily_news_watcher.py list-sources
python3 scripts/daily_news_watcher.py remove-source --id 3
```
Fetch the last 24 hours of AI news:
```bash
python3 scripts/daily_news_watcher.py fetch --hours 24 --topic AI
```
Fetch with the optional Playwright fallback enabled:
```bash
python3 scripts/daily_news_watcher.py fetch --hours 24 --topic AI --use-playwright
```
Show recent runs:
```bash
python3 scripts/daily_news_watcher.py runs --limit 10
```
## Persistence
The SQLite database lives at:
```text
~/.codex/state/daily-news-watcher/news.sqlite
```
Schema:
```sql
sources(id, name, url, type, tags, created_at, last_checked_at)
articles(id, source_id, title, url, published_at, fetched_at, summary, hash)
runs(id, topic, started_at, finished_at, status)
```
`articles.url` and `articles.hash` are unique, so reruns naturally deduplicate
across sessions.
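
The schema listing above is shorthand rather than runnable SQL. One way it could be expressed with Python's `sqlite3` module is shown below; the column types, `NOT NULL` choices, and helper names are assumptions, and only the column names and the uniqueness of `articles.url` and `articles.hash` come from this document.

```python
import sqlite3

# Assumed DDL; only column names and the two UNIQUE constraints are documented.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sources (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    url TEXT NOT NULL,
    type TEXT,
    tags TEXT,
    created_at TEXT,
    last_checked_at TEXT
);
CREATE TABLE IF NOT EXISTS articles (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(id),
    title TEXT NOT NULL,
    url TEXT NOT NULL UNIQUE,
    published_at TEXT,
    fetched_at TEXT NOT NULL,
    summary TEXT,
    hash TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS runs (
    id INTEGER PRIMARY KEY,
    topic TEXT,
    started_at TEXT,
    finished_at TEXT,
    status TEXT
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def insert_article(conn, article):
    """Insert one article; return False when url or hash is already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO articles "
        "(source_id, title, url, published_at, fetched_at, summary, hash) "
        "VALUES (:source_id, :title, :url, :published_at, :fetched_at, "
        ":summary, :hash)",
        article,
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = open_db()
    article = {
        "source_id": 1,
        "title": "Example headline",
        "url": "https://example.com/post",
        "published_at": "2024-05-01T08:00:00+00:00",
        "fetched_at": "2024-05-01T09:00:00+00:00",
        "summary": "Example summary.",
        "hash": "abc123",
    }
    print(insert_article(conn, article))  # first insert is kept
    print(insert_article(conn, article))  # rerun is ignored as a duplicate
```

With `INSERT OR IGNORE`, a row that collides on either unique column is silently skipped, which is one way the cross-session deduplication described above could work.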
## Outputs
```text
~/.codex/state/daily-news-watcher/
  news.sqlite
  reports/daily-news/YYYY-MM-DD-<topic>.md
  logs/<run_id>-fetch.json
```
Do not commit runtime databases, logs, or reports unless the user explicitly
asks for sample artifacts.
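
The output layout above can be built with `pathlib`. This is a sketch only: the helper names and the lowercase slug applied to the topic are assumptions, and the real script may keep the topic text as given.

```python
import re
from datetime import date
from pathlib import Path

# State directory as documented above.
STATE_DIR = Path.home() / ".codex" / "state" / "daily-news-watcher"


def report_path(topic, day=None, state_dir=STATE_DIR):
    """Build reports/daily-news/YYYY-MM-DD-<topic>.md under the state directory."""
    day = day or date.today()
    # Assumed slug rule: lowercase, non-alphanumerics collapsed to hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-") or "all"
    return state_dir / "reports" / "daily-news" / f"{day.isoformat()}-{slug}.md"


def log_path(run_id, state_dir=STATE_DIR):
    """Build logs/<run_id>-fetch.json under the state directory."""
    return state_dir / "logs" / f"{run_id}-fetch.json"


if __name__ == "__main__":
    print(report_path("AI"))
    print(log_path(12))
```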
## Response Pattern
When reporting fetch results to the user, include:
- run id and status
- number of sources checked, with how many had errors
- number of articles included after dedupe and filtering
- the report path
- a short rewritten summary of the most relevant articles (Codex should
rewrite the deterministic snippets into readable prose)
- any source-level errors that should be retried later

When reporting source-management results, include the resolved URL, the
inferred type, and any reachability warning.