Skill · 342 repo stars · updated 2d ago

data-journalism

This Claude Code skill provides data journalism frameworks and methodologies for journalists and researchers analyzing datasets. It includes Philip Meyer's scientific approach to journalism, a structured data story arc from hook through implications, and a comprehensive methodology documentation template for transparency. Use when building data-driven narratives, structuring analysis for publication, documenting analysis processes for reader accountability, or training newsrooms in quantitative storytelling approaches.

Repository: claude-skills-journalism

Install in Claude Code

git clone --depth 1 https://github.com/jamditis/claude-skills-journalism /tmp/data-journalism && cp -r /tmp/data-journalism/journalism-core/skills/data-journalism ~/.claude/skills/data-journalism

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Data journalism methodology

Systematic approaches for finding, analyzing and presenting data in journalism.

## Story structure for data journalism

### Data journalism framework

The framework for data journalism was established by Philip Meyer, a journalist for Knight-Ridder, Harvard Nieman Fellow and professor at UNC-Chapel Hill. In his book *The New Precision Journalism*, Meyer encourages journalists to treat journalism "as if it were a science" by adopting the scientific method:

- Make observations / formulate a question
- Research the question / collect, store, and retrieve data
- Formulate a hypothesis
- Test the hypothesis, using both qualitative (interviews, documents) and quantitative (data analysis) methods
- Analyze the results and reduce them to the most important findings
- Present them to the audience

The process is iterative rather than strictly sequential: results at any step can send you back to refine the question or the hypothesis.

### The data story arc

**1. The hook (nut graf)**
- What's the key finding?
- Why should readers care?
- What's the human impact?

**2. The evidence**
- Show the data
- Explain the methodology
- Acknowledge limitations

**3. The context**
- How does this compare to the past?
- How does this compare to elsewhere?
- What's the trend?

**4. The human element**
- Individual examples that illustrate the data
- Expert interpretation
- Affected voices

**5. The implications**
- What does this mean going forward?
- What questions remain?
- What actions could result?

**6. The methodology box**
- Where did the data come from?
- How was it analyzed?
- What are the limitations?
- How can readers explore further?

### Methodology documentation template

```markdown
## How we did this analysis

### Data sources
[List all data sources with links and access dates]

### Time period
[Specify exactly what time period is covered]

### Definitions
[Define key terms and how you operationalized them]

### Analysis steps
1. [First step of analysis]
2. [Second step]
3. [Continue...]

### Limitations
- [Limitation 1]
- [Limitation 2]

### What we excluded and why
- [Excluded category]: [Reason]

### Verification
[How findings were verified/checked]

### Code and data availability
[Link to GitHub repo if sharing code/data]

### Contact
[How readers can reach you with questions]
```
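
One way to keep the methodology box consistent across stories is to generate it from structured notes. The following is a minimal Python sketch, not part of the skill itself: the `Methodology` class, the `render_methodology` function, and the rule that data sources and limitations may not be empty are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Methodology:
    """Hypothetical container with one field per section of the template above."""
    data_sources: list[str]
    time_period: str
    definitions: dict[str, str]
    analysis_steps: list[str]
    limitations: list[str]
    exclusions: dict[str, str] = field(default_factory=dict)
    verification: str = "[How findings were verified/checked]"
    code_and_data: str = "[Link to GitHub repo if sharing code/data]"
    contact: str = "[How readers can reach you with questions]"


def render_methodology(m: Methodology) -> str:
    """Render the box as markdown; refuse to omit sources or limitations."""
    if not m.data_sources:
        raise ValueError("list at least one data source")
    if not m.limitations:
        raise ValueError("list at least one limitation")

    def section(title: str, body: list[str]) -> list[str]:
        # Each section is a heading, its lines, and a trailing blank line.
        return [f"### {title}", *(body or ["- None"]), ""]

    out = ["## How we did this analysis", ""]
    out += section("Data sources", [f"- {s}" for s in m.data_sources])
    out += section("Time period", [m.time_period])
    out += section("Definitions",
                   [f"- **{k}**: {v}" for k, v in m.definitions.items()])
    out += section("Analysis steps",
                   [f"{i}. {s}" for i, s in enumerate(m.analysis_steps, 1)])
    out += section("Limitations", [f"- {s}" for s in m.limitations])
    out += section("What we excluded and why",
                   [f"- {k}: {v}" for k, v in m.exclusions.items()])
    out += section("Verification", [m.verification])
    out += section("Code and data availability", [m.code_and_data])
    out += section("Contact", [m.contact])
    return "\n".join(out).rstrip() + "\n"
```

Failing loudly on an empty limitations list is a design choice: it makes "acknowledge limitations" a step that cannot be skipped by accident.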

## Data acquisition

### Public data sources

**Federal data sources**

*General:*
- **Data.gov** — Federal open data portal. Many datasets were removed between Feb 2025 and 2026; consult the [Harvard LIL Data.gov archive](https://lil.law.harvard.edu/blog/2025/02/06/announcing-data-gov-archive/) and the [Data Rescue Project](https://www.datarescueproject.org/) for preserved copies before assuming anything is still accessible.
- **Census Bureau** (census.gov) — Demographics, economic data. Many research pages were removed during the 2025 transition; the [End of Term Web Archive](https://eotarchive.org) holds snapshots.
- **BLS** (bls.gov) — Employment, inflation, wages. Following the 2025 funding lapse, the October 2025 Employment Situation release was canceled and the CPS October 2025 reference period is permanently uncollected. Check [revised release dates](https://www.bls.gov/bls/2025-lapse-revised-release-dates.htm) before relying on series continuity.
- **BEA** (bea.gov) — GDP, economic accounts.
- **FRED / Federal Reserve** (fred.stlouisfed.org) — Financial and macroeconomic data; expanded API access through 2026.
- **SEC EDGAR** — Corporate filings.
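
Gaps such as the uncollected October 2025 reference period break month-over-month comparisons, so it can help to check a monthly series for missing periods before computing changes. A minimal Python sketch, assuming periods arrive as zero-padded `YYYY-MM` strings; the function name `missing_months` and the sample labels are hypothetical and are not real BLS output:

```python
def missing_months(periods: list[str]) -> list[str]:
    """Return 'YYYY-MM' labels absent between the earliest and latest period."""
    if not periods:
        return []
    seen = set(periods)
    # Zero-padded YYYY-MM strings sort correctly as plain text.
    first, last = min(seen), max(seen)
    year, month = int(first[:4]), int(first[5:7])
    gaps = []
    while True:
        label = f"{year:04d}-{month:02d}"
        if label > last:
            break
        if label not in seen:
            gaps.append(label)
        month += 1
        if month == 13:  # roll over into the next year
            year, month = year + 1, 1
    return gaps


# Made-up period labels, for illustration only
print(missing_months(["2025-08", "2025-09", "2025-11", "2025-12"]))  # ['2025-10']
```

If the result is not empty, note the gap in the methodology box rather than interpolating silently.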

*Specific domains:*
- **EPA** (epa.gov/data) — Environmental data. At least 80 climate webpages were removed in Dec 2025, the endangerment finding was repealed Feb 12, 2026, and the Climate Change Indicators site was largely gutted. The [Environmental Data & Governance Initiative](https://envirodatagov.org) maintains mirrors.
- **FDA / openFDA** (open.fda.gov) — Drug approvals, recalls, adverse events.
- **CDC WONDER** — Health statistics. Many datasets were removed from data.cdc.gov after Jan 2025, partially restored under Doctors for America v. Trump (TRO Feb 11, 2025) but with altered terminology in some returns. The volunteer-run [RestoredCDC.org](https://restoredcdc.org/wonder.cdc.gov/) mirrors removed content.
- **NHTSA FARS / vPIC APIs** — Vehicle safety data.
- **DOT** — Transportation statistics.
- **FEC** — Campaign finance; 2025-2026 cycle data live.
- **USASpending.gov** — Federal contracts and grants; API v2 operational.
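
Several of these sources expose JSON APIs. As a rough illustration for openFDA, the Python sketch below builds a query URL for drug recall (enforcement) records and summarizes a response. The endpoint path, the `search` and `limit` parameters, the 1000-record limit, and the `meta.results.total` and `recalling_firm` fields reflect my understanding of the public openFDA API and should be checked against its documentation; the function names are hypothetical.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL and endpoint path as I understand the openFDA docs.
OPENFDA_BASE = "https://api.fda.gov"


def enforcement_url(search: str, limit: int = 10) -> str:
    """Build a drug-enforcement query URL without sending any request."""
    if not 1 <= limit <= 1000:  # assumed per-request maximum
        raise ValueError("limit must be between 1 and 1000")
    query = urlencode({"search": search, "limit": limit})
    return f"{OPENFDA_BASE}/drug/enforcement.json?{query}"


def summarize(payload: dict) -> tuple[int, list[str]]:
    """Return (total matching records, firms named in this page of results)."""
    total = payload["meta"]["results"]["total"]
    firms = [r.get("recalling_firm", "") for r in payload.get("results", [])]
    return total, firms


# Builds the URL only; fetch it with the HTTP client of your choice.
print(enforcement_url('classification:"Class I"', limit=5))
```

Keeping URL construction separate from fetching makes the exact query easy to log and cite in the methodology box.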

*Court records:*
- **CourtListener / RECAP** (courtlistener.com) — Free PACER alternative covering federal court filings; RECAP Search Alerts launched June 2025 ("Google Alerts for federal courts").
- **PACER** — Federal court filings; $0.10 per page, $30 per quarter waiver threshold.
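
To budget a document pull, the PACER figures above reduce to simple arithmetic. The Python sketch below works in integer cents and assumes the waiver applies when a quarter's charges total $30 or less. The $3.00 per-document cap is an assumption that does not appear in the list above; pass `cap_cents=None` to ignore it, and check the current PACER fee schedule before relying on either rule.

```python
from typing import Optional

PER_PAGE_CENTS = 10            # $0.10 per page, from the list above
WAIVER_THRESHOLD_CENTS = 3000  # $30 per quarter, from the list above
DOCUMENT_CAP_CENTS = 300       # ASSUMPTION: $3.00 per-document cap, not stated above


def quarterly_charge_cents(pages_per_document: list[int],
                           cap_cents: Optional[int] = DOCUMENT_CAP_CENTS) -> int:
    """Estimate one quarter's bill in cents for a list of document page counts."""
    accrued = 0
    for pages in pages_per_document:
        if pages < 0:
            raise ValueError("page counts cannot be negative")
        cost = pages * PER_PAGE_CENTS
        if cap_cents is not None:
            cost = min(cost, cap_cents)  # apply the assumed per-document cap
        accrued += cost
    # Charges at or under the threshold are treated as waived.
    return 0 if accrued <= WAIVER_THRESHOLD_CENTS else accrued


print(quarterly_charge_cents([12, 45, 8]))  # 0
```

Here the three documents accrue $1.20, $3.00 (capped) and $0.80, which is $5.00 in total and falls under the waiver.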

*State and local:*
- State open data portals (search: "[state] open data")
- Tyler Data & Insights (formerly Socrata, rebranded May 2025) hosts many city and state portals
- OpenStreetMap, municipal GIS portals
- State comptroller and auditor reports

*International:*
- **Eurostat**, **OECD**, **World Bank Open Data**, **UN Data** — major comparative datasets, mostly stable through 2026.

*Specialized:*
- **NICAR Data Library** (IRE) — curated datasets, IRE members only.
- **IPUMS** (University of Minnesota) — free with account; canonical for harmonized microdata.
- **ICPSR** (University of Michigan) — social-science data archive.
- **ProPublica Data Store** — frozen; datasets only run through 2023.

*Federal-data preservation (use when source data has been removed):*
- [Data Rescue Project](https://www.datarescueproject.org) — citizen + library mirrors of removed federal data; more than 1,230 datasets across 85 offices as of Aug 2025.
- [End of Term Web Archive](https://eotarchive.org) — 500TB / 100M-page snapshot of federal sites at the 2024-2025 transition.
- Internet Archive Wayback Machine — useful for individual page-level recovery.
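
For page-level recovery, the Wayback Machine offers an availability lookup that returns the snapshot closest to a given date. The Python sketch below builds the lookup URL and reads the response. The endpoint and the `archived_snapshots.closest` response shape reflect my understanding of the Internet Archive's availability API and should be confirmed against its documentation; the function names and the `agency.example` address are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# ASSUMPTION: availability endpoint as I understand the Internet Archive docs.
WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def availability_url(page_url: str, timestamp: str) -> str:
    """Build a lookup URL; timestamp is YYYYMMDD, optionally followed by hhmmss."""
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. 20250101")
    return WAYBACK_AVAILABLE + "?" + urlencode(
        {"url": page_url, "timestamp": timestamp})


def closest_snapshot(payload: dict) -> Optional[tuple[str, str]]:
    """Return (snapshot URL, snapshot timestamp), or None if nothing is archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


# Hypothetical page address; this builds the lookup URL without fetching it.
print(availability_url("https://agency.example/report.html", "20250101"))
```

Record both the snapshot URL and its timestamp in the methodology box, since the archived copy may differ from the page as it stood on the date you care about.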

### Data request strategies

**Public records requests for datasets**

For request mechanics (templates, fee-waiver language, NJ OPRA, appeals, FOIA Improvement Act statutor