ClaudeWave
Skill · 259 repo stars · updated 2d ago

# Page-monitoring

Page-monitoring is a Claude Code skill that provides methodologies and tools for detecting content changes, monitoring website availability, and preserving web pages before deletion. It compares services like Visualping, ChangeTower, Distill.io, and self-hosted solutions such as changedetection.io, offering guidance on free-tier limits, monitoring speeds, and retention windows. Use this skill when tracking content updates on specific pages, detecting when websites go offline, monitoring compliance pages, generating feeds for pages without RSS support, or preserving important content before removal.

Install in Claude Code
git clone --depth 1 https://github.com/jamditis/claude-skills-journalism /tmp/page-monitoring && cp -r /tmp/page-monitoring/research-toolkit/skills/page-monitoring ~/.claude/skills/page-monitoring
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Page monitoring methodology

Patterns for tracking web page changes, detecting content removal, and preserving important pages before they disappear.

## Monitoring service comparison

Free-tier limits and retention windows can shift from year to year, so
verify them on each service's pricing page before relying on a specific
number. The table below reflects a 2026 snapshot.

| Service | Free Tier | Best For | History | Alert Speed |
|---------|-----------|----------|---------|-------------|
| **Visualping** | A few daily checks (free plan tightened in recent years) | Visual changes | Standard | Minutes |
| **ChangeTower** | Yes (verify current limits) | Compliance, archiving | Multi-year on paid plans | Minutes |
| **Distill.io** | ~5 monitors with 7-day history | Element-level tracking | Limited on free tier | Seconds |
| **Wachete** | Limited | Login-protected pages | 12 months | Minutes |
| **UptimeRobot** | 50 monitors at 5-minute intervals (free SMS removed) | Uptime only | 60 days | 5-min checks |
| **changedetection.io** | Self-hosted; free | Privacy / DIY | Disk space | Configurable |
| **urlwatch** | Self-hosted; free | Cron-driven CLI | Configurable | Configurable |
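
For the uptime-only case in the table, or for spotting a page that has been taken down, a self-hosted check can be quite small. The sketch below is one possible approach using only the Python standard library. `check_availability`, its `state` labels, and the `opener` parameter are hypothetical names introduced for illustration, and treating 404/410 as "removed" is an assumption, not something any of the services above documents.

```python
import urllib.error
import urllib.request
from typing import Callable


def check_availability(
    url: str,
    opener: Callable = urllib.request.urlopen,  # injectable for offline testing
    timeout: float = 10.0,
) -> dict:
    """Classify a URL as ok, removed, error, or unreachable (illustrative labels)."""
    request = urllib.request.Request(
        url,
        method='HEAD',  # assumption: the server answers HEAD requests
        headers={'User-Agent': 'Mozilla/5.0 (PageMonitor/1.0)'},
    )
    try:
        with opener(request, timeout=timeout) as response:
            return {'url': url, 'state': 'ok', 'status': response.status}
    except urllib.error.HTTPError as e:
        # The server responded, but with a 4xx/5xx status code
        state = 'removed' if e.code in (404, 410) else 'error'
        return {'url': url, 'state': state, 'status': e.code}
    except OSError as e:
        # DNS failure, refused connection, timeout: no HTTP response at all
        return {'url': url, 'state': 'unreachable', 'status': None, 'error': str(e)}


# Usage (performs a real network request):
# print(check_availability('https://example.com/important-page'))
```

Some servers reject `HEAD` requests; switching the method to `GET` is a reasonable fallback in that case. The `opener` argument exists only so the function can be exercised without network access.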

## Quick-start: Monitor a page

### Distill.io element monitoring

```javascript
// Distill.io allows CSS/XPath selectors for precise monitoring
// Example selectors for common use cases:

// Monitor news article headlines
const newsSelector = '.article-headline, h1.title, .story-title';

// Monitor price changes
const priceSelector = '.price, .product-price, [data-price]';

// Monitor stock/availability
const availabilitySelector = '.in-stock, .availability, .stock-status';

// Monitor specific paragraph or section
const sectionSelector = '#main-content p:first-child';

// Monitor table data
const tableSelector = 'table.data-table tbody tr';
```
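
Before pasting a selector into a monitoring service, it can help to confirm locally what it actually matches. The following is a minimal sketch, assuming BeautifulSoup (`bs4`) is installed; `preview_selector` and `SAMPLE_HTML` are hypothetical names made up for this example, and the sample markup stands in for a saved copy of the real page.

```python
from bs4 import BeautifulSoup

# Stand-in markup; in practice, load a saved copy of the page being monitored
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Council approves budget</h1>
  <div id="main-content"><p>First paragraph.</p><p>Second paragraph.</p></div>
  <span class="price" data-price="19.99">$19.99</span>
</body></html>
"""


def preview_selector(html: str, selector: str) -> list[str]:
    """Return the stripped text of every element the CSS selector matches."""
    soup = BeautifulSoup(html, 'html.parser')
    return [element.get_text(strip=True) for element in soup.select(selector)]


# Try the selectors from the block above against the sample markup
for selector in (
    '.article-headline, h1.title, .story-title',
    '.price, .product-price, [data-price]',
    '#main-content p:first-child',
    'table.data-table tbody tr',
):
    matches = preview_selector(SAMPLE_HTML, selector)
    print(f"{selector!r}: {len(matches)} match(es) {matches}")
```

A selector that returns no matches here would most likely leave the monitor watching nothing, so an empty result is worth investigating before relying on alerts.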

### Python monitoring script

```python
import requests
import hashlib
import json
import smtplib
from email.mime.text import MIMEText
from datetime import datetime
from pathlib import Path
from typing import Optional
from bs4 import BeautifulSoup

class PageMonitor:
    """Simple page change monitor with local storage."""

    def __init__(self, storage_dir: Path):
        self.storage_dir = storage_dir
        self.storage_dir.mkdir(parents=True, exist_ok=True)
        self.state_file = storage_dir / 'monitor_state.json'
        self.state = self._load_state()

    def _load_state(self) -> dict:
        if self.state_file.exists():
            return json.loads(self.state_file.read_text())
        return {'pages': {}}

    def _save_state(self):
        self.state_file.write_text(json.dumps(self.state, indent=2))

    def _get_page_hash(self, url: str, selector: Optional[str] = None) -> tuple[str, str]:
        """Get content hash and content for a page or element."""

        response = requests.get(url, timeout=30, headers={
            'User-Agent': 'Mozilla/5.0 (PageMonitor/1.0)'
        })
        response.raise_for_status()

        if selector:
            soup = BeautifulSoup(response.text, 'html.parser')
            element = soup.select_one(selector)
            content = element.get_text(strip=True) if element else ''
        else:
            content = response.text

        content_hash = hashlib.sha256(content.encode()).hexdigest()
        return content_hash, content

    def add_page(self, url: str, name: str, selector: Optional[str] = None):
        """Add a page to monitor."""

        content_hash, content = self._get_page_hash(url, selector)

        self.state['pages'][url] = {
            'name': name,
            'selector': selector,
            'last_hash': content_hash,
            'last_check': datetime.now().isoformat(),
            'last_content': content[:1000],  # Store preview
            'change_count': 0
        }

        self._save_state()
        print(f"Added: {name} ({url})")

    def check_page(self, url: str) -> Optional[dict]:
        """Check single page for changes."""

        if url not in self.state['pages']:
            return None

        page = self.state['pages'][url]
        selector = page.get('selector')

        try:
            new_hash, new_content = self._get_page_hash(url, selector)
        except Exception as e:
            return {
                'url': url,
                'name': page['name'],
                'status': 'error',
                'error': str(e)
            }

        changed = new_hash != page['last_hash']

        result = {
            'url': url,
            'name': page['name'],
            'status': 'changed' if changed else 'unchanged',
            'previous_content': page['last_content'],
            'new_content': new_content[:1000] if changed else None
        }

        if changed:
            page['last_hash'] = new_hash
            page['last_content'] = new_content[:1000]
            page['change_count'] += 1

            # Archive the change
            archive_file = self.storage_dir / f"{hashlib.md5(url.encode()).hexdigest()}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt"
            archive_file.write_text(new_content)

        page['last_check'] = datetime.now().isoformat()
        self._save_state()

        return result

    def check_all(self) -> list[dict]:
        """Check all monitored pages."""
        results = []
        for url in self.state['pages']:
            result = self.check_page(url)
            if result:
                results.append(result)
        return results

# Usage
monitor = PageMonitor(Path('./page_monitor_data'))

# Add pages to monitor
monitor.add_page(
    'https://example.com/important-page',
    'Important Page',
    selector='.main-content'  # Optional: monitor specific element
)

# Check for changes
results = monitor.check_all()
for result in results:
    if result['status'] == 'changed':
        print(f"CHANGED: {result['name']}")
        print(f"  Previous: {result['previous_content'][:100]}...")
        print(f"  New: {result['new_content'][:100]}...")
```