ClaudeWave
Skill · 259 repo stars · updated 2d ago

data-journalism

This Claude Code skill provides data journalism frameworks and methodologies for journalists and researchers analyzing datasets. It includes Philip Meyer's scientific approach to journalism, a structured data story arc from hook through implications, and a comprehensive methodology documentation template for transparency. Use when building data-driven narratives, structuring analysis for publication, documenting analysis processes for reader accountability, or training newsrooms in quantitative storytelling approaches.

Install in Claude Code
git clone --depth 1 https://github.com/jamditis/claude-skills-journalism /tmp/data-journalism && cp -r /tmp/data-journalism/journalism-core/skills/data-journalism ~/.claude/skills/data-journalism
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Data journalism methodology

Systematic approaches for finding, analyzing and presenting data in journalism.

## Story structure for data journalism

### Data journalism framework

The framework for data journalism was established by Philip Meyer, a journalist for Knight-Ridder, Harvard Nieman Fellow and professor at UNC-Chapel Hill. In his book *The New Precision Journalism*, Meyer encourages journalists to treat journalism "as if it were a science" by adopting the scientific method:

- Make observations / formulate a question
- Research the question / collect, store, and retrieve data
- Formulate a hypothesis
- Test the hypothesis, using both qualitative (interviews, documents) and quantitative (data analysis) methods
- Analyze the results and reduce them to the most important findings
- Present them to the audience

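The process is iterative, not strictly sequential: a failed test or a surprising result often sends the reporter back to refine the question or collect more data.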
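
The quantitative half of the "test the hypothesis" step can be sketched with a permutation test, which asks how often a gap as large as the observed one would appear if group labels were assigned at random. This is only one possible technique, not one prescribed by Meyer's book; the function name, the neighborhood names, and the response-time figures below are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often a mean gap at least as large as the observed one
    appears when group labels are shuffled at random (two-sided)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate from ever being exactly 0.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical emergency response times (minutes) in two neighborhoods.
north = [7.1, 8.4, 6.9, 9.0, 7.7, 8.2]
south = [10.3, 9.8, 11.1, 9.5, 10.9, 10.0]
p = permutation_p_value(north, south)
print(f"estimated p-value: {p:.4f}")
```

A small p-value only says the gap is unlikely to be a labeling accident. It does not explain the gap, which is where the qualitative reporting comes in.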

### The data story arc

**1. The hook (nut graf)**
- What's the key finding?
- Why should readers care?
- What's the human impact?

**2. The evidence**
- Show the data
- Explain the methodology
- Acknowledge limitations

**3. The context**
- How does this compare to the past?
- How does this compare to elsewhere?
- What's the trend?

**4. The human element**
- Individual examples that illustrate the data
- Expert interpretation
- Affected voices

**5. The implications**
- What does this mean going forward?
- What questions remain?
- What actions could result?

**6. The methodology box**
- Where did the data come from?
- How was it analyzed?
- What are the limitations?
- How can readers explore further?
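
The comparisons asked for in the context step (past versus present, here versus elsewhere) usually reduce to two calculations: percent change and a per-capita rate. A minimal sketch follows; the function names and all figures are hypothetical.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("percent change is undefined when the baseline is 0")
    return (new - old) / old * 100


def rate_per_100k(count, population):
    """Events per 100,000 residents, for comparing places of different size."""
    return count / population * 100_000


# Hypothetical counts: one city over time, then two cities of different size.
change = round(percent_change(120, 150), 1)           # 25.0
small_city = round(rate_per_100k(150, 300_000), 1)    # 50.0
large_city = round(rate_per_100k(400, 1_000_000), 1)  # 40.0
print(change, small_city, large_city)
```

In this made-up case the larger city has more incidents but the lower rate, which is the kind of reversal that raw counts hide.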

### Methodology documentation template

```markdown
## How we did this analysis

### Data sources
[List all data sources with links and access dates]

### Time period
[Specify exactly what time period is covered]

### Definitions
[Define key terms and how you operationalized them]

### Analysis steps
1. [First step of analysis]
2. [Second step]
3. [Continue...]

### Limitations
- [Limitation 1]
- [Limitation 2]

### What we excluded and why
- [Excluded category]: [Reason]

### Verification
[How findings were verified/checked]

### Code and data availability
[Link to GitHub repo if sharing code/data]

### Contact
[How readers can reach you with questions]
```
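
Before publication, a filled-in copy of the template above can be checked mechanically for sections that were dropped or placeholders that were never replaced. The sketch below is one way to do that; `check_methodology` and `REQUIRED_SECTIONS` are hypothetical names, and the placeholder pattern assumes that any remaining bracketed text that is not a markdown link is unfinished.

```python
import re

# Section headings taken from the template above.
REQUIRED_SECTIONS = [
    "Data sources", "Time period", "Definitions", "Analysis steps",
    "Limitations", "What we excluded and why", "Verification",
    "Code and data availability", "Contact",
]

# Bracketed text such as "[Limitation 1]"; a link like "[text](url)" is skipped.
PLACEHOLDER = re.compile(r"\[[^\]]+\](?!\()")


def check_methodology(markdown_text):
    """Return a list of problems: missing sections and unfilled placeholders."""
    lines = markdown_text.splitlines()
    headings = {line[4:].strip() for line in lines if line.startswith("### ")}
    problems = [
        f"missing section: {section}"
        for section in REQUIRED_SECTIONS
        if section not in headings
    ]
    for number, line in enumerate(lines, start=1):
        if PLACEHOLDER.search(line):
            problems.append(f"line {number}: unfilled placeholder")
    return problems


draft = "### Data sources\n[List all data sources with links and access dates]\n"
for problem in check_methodology(draft):
    print(problem)
```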

## Data acquisition

### Public data sources

**Federal data sources**

*General:*
- **Data.gov** — Federal open data portal. Many datasets were removed between Feb 2025 and 2026; consult the [Harvard LIL Data.gov archive](https://lil.law.harvard.edu/blog/2025/02/06/announcing-data-gov-archive/) and the [Data Rescue Project](https://www.datarescueproject.org/) for preserved copies before assuming anything is still accessible.
- **Census Bureau** (census.gov) — Demographics, economic data. Many research pages were removed during the 2025 transition; the [End of Term Web Archive](https://eotarchive.org) holds snapshots.
- **BLS** (bls.gov) — Employment, inflation, wages. Following the 2025 funding lapse, the October 2025 Employment Situation release was canceled, and Current Population Survey (CPS) data for the October 2025 reference period were never collected and will not be backfilled. Check [revised release dates](https://www.bls.gov/bls/2025-lapse-revised-release-dates.htm) before relying on series continuity.
- **BEA** (bea.gov) — GDP, economic accounts.
- **FRED / Federal Reserve** (fred.stlouisfed.org) — Financial and macroeconomic data; expanded API access through 2026.
- **SEC EDGAR** — Corporate filings.
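
The series-continuity caution in the BLS entry above can be checked before any trend is computed. The sketch below lists the months absent from a monthly series; it assumes periods are labeled `YYYY-MM`, and the function name and the values are made up for illustration.

```python
def missing_months(periods):
    """Given "YYYY-MM" labels, return the months absent between first and last."""
    seen = set(periods)
    year, month = map(int, min(seen).split("-"))
    last = max(seen)
    gaps = []
    # Zero-padded labels sort chronologically, so string comparison is safe.
    while f"{year:04d}-{month:02d}" < last:
        label = f"{year:04d}-{month:02d}"
        if label not in seen:
            gaps.append(label)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps


# Made-up monthly values with one reference period absent.
series = {"2025-08": 10.0, "2025-09": 10.2, "2025-11": 10.1, "2025-12": 10.4}
print(missing_months(series))  # ['2025-10']
```

A month-over-month change computed across a gap like this actually spans two months, so it is worth flagging in the methodology box.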

*Specific domains:*
- **EPA** (epa.gov/data) — Environmental data. At least 80 climate webpages were removed in Dec 2025, the endangerment finding was repealed Feb 12, 2026, and the Climate Change Indicators site was largely gutted. The [Environmental Data & Governance Initiative](https://envirodatagov.org) maintains mirrors.
- **FDA / openFDA** (open.fda.gov) — Drug approvals, recalls, adverse events.
- **CDC WONDER** — Health statistics. Many datasets were removed from data.cdc.gov after Jan 2025 and partially restored under Doctors for America v. Office of Personnel Management (TRO Feb 11, 2025), though some restored pages use altered terminology. The volunteer-run [RestoredCDC.org](https://restoredcdc.org/wonder.cdc.gov/) mirrors removed content.
- **NHTSA FARS / vPIC APIs** — Vehicle safety data.
- **DOT** — Transportation statistics.
- **FEC** — Campaign finance; 2025-2026 cycle data live.
- **USASpending.gov** — Federal contracts and grants; API v2 operational.

*Court records:*
- **CourtListener / RECAP** (courtlistener.com) — Free PACER alternative covering federal court filings; RECAP Search Alerts launched June 2025 ("Google Alerts for federal courts").
- **PACER** — Federal court filings; $0.10 per page, with fees waived for any quarter in which an account accrues $30 or less in charges.
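
The PACER fee figures above make it easy to budget a records pull in advance. The sketch below estimates a quarterly bill from per-document page counts. The $3.00 per-document cap is an assumption that is not stated in the entry above, and it does not apply to every document type, so treat the result as a rough estimate. All names are hypothetical.

```python
PER_PAGE_CENTS = 10       # $0.10 per page, from the entry above
WAIVER_CENTS = 3000       # $30 quarterly threshold, from the entry above
PER_DOC_CAP_CENTS = 300   # assumption: $3.00 cap per document


def quarterly_pacer_bill(page_counts):
    """Estimate one quarter's bill in dollars from per-document page counts."""
    cents = sum(
        min(pages * PER_PAGE_CENTS, PER_DOC_CAP_CENTS) for pages in page_counts
    )
    # Charges at or below the threshold are waived for the quarter.
    if cents <= WAIVER_CENTS:
        return 0.0
    return cents / 100


print(quarterly_pacer_bill([12, 45, 200]))  # 0.0  ($7.20 accrued, waived)
print(quarterly_pacer_bill([40] * 11))      # 33.0 (eleven capped documents)
```

Working in integer cents avoids floating-point drift when many small charges are summed.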

*State and local:*
- State open data portals (search: "[state] open data")
- Tyler Data & Insights (formerly Socrata, rebranded May 2025) hosts many city and state portals
- OpenStreetMap, municipal GIS portals
- State comptroller and auditor reports

*International:*
- **Eurostat**, **OECD**, **World Bank Open Data**, **UN Data** — major comparative datasets, mostly stable through 2026.

*Specialized:*
- **NICAR Data Library** (IRE) — curated datasets, IRE members only.
- **IPUMS** (University of Minnesota) — free with account; canonical for harmonized microdata.
- **ICPSR** (University of Michigan) — social-science data archive.
- **ProPublica Data Store** — frozen; datasets only run through 2023.

*Federal-data preservation (use when source data has been removed):*
- [Data Rescue Project](https://www.datarescueproject.org) — citizen + library mirrors of removed federal data; more than 1,230 datasets across 85 offices as of Aug 2025.
- [End of Term Web Archive](https://eotarchive.org) — 500TB / 100M-page snapshot of federal sites at the 2024-2025 transition.
- Internet Archive Wayback Machine — useful for individual page-level recovery.
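
For page-level recovery, the Wayback Machine offers an availability endpoint that returns the snapshot closest to a requested date. The sketch below builds the query and parses the reply using only the standard library. The function names are hypothetical, and the reply shape (`archived_snapshots.closest`) follows the Internet Archive's published description, so verify it against a live reply before depending on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build a query for the snapshot closest to timestamp (YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload):
    """Pull the archived URL out of a parsed reply, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url, timestamp=None):
    """Network call, kept separate so the parsing can be tested offline."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))


# Example call (requires network access):
# print(lookup("epa.gov/data", "20250101"))
```

Record the snapshot URL and its capture date in the data-sources section of the methodology box, since an archived copy may differ from the page as it last appeared.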

### Data request strategies

**Public records requests for datasets**

For request mechanics (templates, fee-waiver language, NJ OPRA, appeals, FOIA Improvement Act statutor