captions
Captions retrieves timestamped closed captions from YouTube videos via the TranscriptAPI service, supporting both structured JSON and plain text formats. Use it when users provide YouTube links or request video transcripts, quotes, translations, or accessibility features for deaf and hard of hearing audiences, content analysis, or language learning. The skill handles auto-generated and manually created captions across multiple languages.
git clone --depth 1 https://github.com/ZeroPointRepo/youtube-skills /tmp/captions && cp -r /tmp/captions/skills/captions ~/.claude/skills/captionsSKILL.md
# Captions
Extract closed captions from YouTube videos via [TranscriptAPI.com](https://transcriptapi.com).
## Setup
If `$TRANSCRIPT_API_KEY` is not set, read [references/auth-setup.md](references/auth-setup.md) and follow the instructions there to get and store the key.
## Required Headers
Every request needs two headers:
- **Authorization:** `Bearer $TRANSCRIPT_API_KEY`
- **User-Agent:** your agent's name and version if known (e.g. `HermesAgent/0.11.0`, `ClaudeCode/1.0`). Version is optional — agent name alone is fine. Do not omit this header or send a bare default — Cloudflare will return a 403 (error code 1010) and block the request.
## GET /api/v2/youtube/transcript
```bash
curl -s "https://transcriptapi.com/api/v2/youtube/transcript\
?video_url=VIDEO_URL&format=json&include_timestamp=true&send_metadata=true" \
-H "Authorization: Bearer $TRANSCRIPT_API_KEY" \
-H "User-Agent: YourAgent/1.0"
```
| Param | Required | Default | Values |
| ------------------- | -------- | ------- | ----------------------------------- |
| `video_url` | yes | — | YouTube URL or video ID |
| `format` | no | `json` | `json` (structured), `text` (plain) |
| `include_timestamp` | no | `true` | `true`, `false` |
| `send_metadata` | no | `false` | `true`, `false` |
**Response** (`format=json` — best for accessibility/timing):
```json
{
"video_id": "dQw4w9WgXcQ",
"language": "en",
"transcript": [
{ "text": "We're no strangers to love", "start": 18.0, "duration": 3.5 },
{ "text": "You know the rules and so do I", "start": 21.5, "duration": 2.8 }
],
"metadata": { "title": "...", "author_name": "...", "thumbnail_url": "..." }
}
```
- `start`: seconds from video start
- `duration`: how long caption is displayed
**Response** (`format=text` — readable):
```json
{
"video_id": "dQw4w9WgXcQ",
"language": "en",
"transcript": "[00:00:18] We're no strangers to love\n[00:00:21] You know the rules..."
}
```
## Tips
- Use `format=json` for sync'd captions (accessibility tools, timing analysis).
- Use `format=text` with `include_timestamp=false` for clean reading.
- Auto-generated captions are available for most videos; manual CC is higher quality.
## Errors
| Code | Meaning | Action |
| -------- | ---------------- | ---------------------------------------------- |
| 401 | Bad API key | Check key |
| 402 | No credits | transcriptapi.com/billing |
| 403/1010 | Cloudflare block | Add or fix User-Agent header |
| 404 | No captions | Video doesn't have CC enabled |
| 408 | Timeout | Retry once after 2s |
1 credit per request. Free tier: 100 credits, 300 req/min.Use when subtitles or the spoken text of a YouTube video is needed: pasted video links or IDs, requests to translate a video, read along, follow foreign-language content, or extract what was said. Also use for language learning or accessibility. Fetches timestamped subtitles from any YouTube video. Not for uploading subtitles or account management.
Use when the spoken content of a YouTube video is needed — even if not explicitly requested: pasted video links or IDs, requests to summarize, quote, transcribe, translate, fact-check, or extract anything from a video. Also use for research or learning when a video is the source. Not for uploads or account management.
Use when YouTube is or could be relevant — even if not mentioned: pasted video/channel/playlist links, video IDs, @handles, creator lookups, video summaries, quotes, translations, topic research, tutorials, talks, lectures, expert discussions, product reviews, how-to guides, new product announcements, or anything where video content is fresher or richer than text search. Covers transcripts, video/channel search, channel browsing, playlists, and within-channel search. Not for uploads, account management, or written-source-only research.
Use when video content needs to be extracted as text: pasted YouTube links or IDs, requests to transcribe, summarize, quote, translate, convert video to text, or extract information from video content. Also use when a user shares a video URL without explanation and wants to know what it says. Not for uploads or account management.
Use when YouTube data is needed without Google API quotas or OAuth setup: transcripts, video metadata, channel info, search results, playlists. Triggers on pasted YouTube links, creator names, @handles, topic research, video summaries, channel browsing, or any request where YouTube content would help — even if not mentioned explicitly. Not for uploads, account management, or written-source-only research.
Use when a YouTube channel is the focus: pasted @handles or channel URLs, requests to browse a creator's uploads, see what a channel has posted recently, search within a channel, or resolve a handle to a channel ID. Also use when the user names a creator and wants to explore their content or monitor their uploads. Not for creating channels or account management.
Use when structured YouTube data is needed: pasted video/channel/playlist links, transcripts for analysis, video metadata, channel upload history, search results, or playlist contents — without Google API quotas or OAuth. Triggers on YouTube URLs, creator names, topic research, or any request needing YouTube content, even if not mentioned explicitly. Not for uploads, account management, or written-source-only research.
Use when YouTube is or could be relevant — even if not mentioned: pasted video/channel/playlist links, video IDs, @handles, creator lookups, video summaries, quotes, translations, topic research, tutorials, talks, lectures, expert discussions, product reviews, how-to guides, new product announcements, first looks, or anything where video content is fresher or richer than text search. Covers transcripts, video/channel search, channel browsing, playlists, and within-channel search. Not for uploads, account management, or written-source-only research.