ClaudeWave
Skill · 1.2k repo stars · updated yesterday

tiktok_download

**tiktok_download** downloads TikTok and Douyin videos, one at a time or in batches, via the TikHub API. It saves the MP4 files and reports traffic metrics using only httpx, writes the raw API responses to disk by default, and can generate structured CSV and JSON summaries through a separate post-processor. Use this skill to bulk-fetch video content and metadata from TikTok or Douyin share links with parallel processing; a valid TikHub API key is required.

Install in Claude Code
git clone --depth 1 https://github.com/inclusionAI/AWorld /tmp/tiktok_download && cp -r /tmp/tiktok_download/aworld-skills/tiktok_download ~/.claude/skills/tiktok_download
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# TikHub Download Independent

## What this skill gives you

**Two small artifacts** (both copy-pasteable from this file):

1. **`tikhub_independent.py`** (single file, **`httpx` only**)

   - **Single video** or **batch** (parallel, max **10** workers): download MP4 + print metrics
   - **`Raw API JSON`**: **by default** every successful API response is written to disk as a **full** JSON file (same shape as e.g. `raw_api_response.json` in this repo — top-level `code`, `request_id`, `params`, `data.aweme_detail`, etc.). Use `--no-save-raw` to skip.
2. **`postprocess_tikhub_raw.py`** (**stdlib only**: `json`, `csv`, `argparse`, `glob`)

   - Reads one raw file **or** a directory of raw JSON files
   - Writes into the **current working directory** (or `--out-dir`):  
     - **`tikhub_videos_summary.csv`**  
     - **`tikhub_videos_summary.json`** (list of simplified records; one object per video)
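
The batch mode's worker cap from item 1 can be pictured roughly as follows. This is a minimal sketch, not the script's actual code: `effective_workers`, `run_batch`, and the stub `fake_download` are hypothetical names, and only the cap of 10 comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS_CAP = 10  # the only value taken from the skill description


def effective_workers(requested: int, n_urls: int) -> int:
    """Clamp the requested worker count to [1, MAX_WORKERS_CAP] and to the job size."""
    return max(1, min(requested, MAX_WORKERS_CAP, n_urls))


def run_batch(urls, worker, requested_workers=4):
    """Run `worker` over `urls` in a thread pool; results keep input order."""
    if not urls:
        return []
    n = effective_workers(requested_workers, len(urls))
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(worker, urls))


def fake_download(url: str) -> str:
    # Hypothetical stand-in for the real download step (no network access).
    return f"done:{url}"


results = run_batch(["url-a", "url-b", "url-c"], fake_download, requested_workers=50)
```

Because `pool.map` yields results in input order, each result can be matched to its source URL without extra bookkeeping.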

**Structured-data reference** (field meanings, nesting): see project doc  
`TikHub_API_数据格式说明.md` (same repo, path from project root). Example raw payload: `raw_api_response.json`.

Does **not** import `TikHub-Multi-Functional-Downloader` or any other project package.
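
To make the raw-to-summary flow concrete, here is a small sketch of how one raw response might be reduced to a flat record and serialized as CSV and JSON. It is an illustration under assumptions, not the real `postprocess_tikhub_raw.py`: the helper names `simplify` and `to_csv` are hypothetical, and apart from `data.aweme_detail` and `aweme_id`, the field names (`desc`, `statistics`, `digg_count`) are assumed rather than taken from this file.

```python
import csv
import io
import json

FIELDS = ["aweme_id", "desc", "digg_count"]  # illustrative column choice


def simplify(raw: dict) -> dict:
    """Reduce one raw API response to a flat record."""
    data = raw.get("data") or {}
    detail = data.get("aweme_detail") or {}
    stats = detail.get("statistics") or {}  # assumed field name
    return {
        "aweme_id": detail.get("aweme_id", ""),
        "desc": detail.get("desc", ""),            # assumed field name
        "digg_count": stats.get("digg_count", 0),  # assumed field name
    }


def to_csv(records: list) -> str:
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Made-up sample payload, shaped like the top-level keys described above.
sample = {
    "code": 200,
    "data": {
        "aweme_detail": {
            "aweme_id": "123",
            "desc": "demo",
            "statistics": {"digg_count": 7},
        }
    },
}

records = [simplify(sample)]
summary_json = json.dumps(records, ensure_ascii=False, indent=2)
summary_csv = to_csv(records)
```

Keeping the simplification in one pure function means the same records can feed both output formats.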

## Install

```bash
pip install httpx
```

Post-processor: **no extra packages**.

## API (for reference)

- TikTok: `GET https://api.tikhub.io/api/v1/tiktok/app/v3/fetch_one_video_by_share_url?share_url=...`
- Douyin: `GET https://api.tikhub.io/api/v1/douyin/app/v3/fetch_one_video_by_share_url?share_url=...`
- Header: `Authorization: Bearer <API_KEY>`
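
The endpoints and header above can be assembled without any network access, which is handy for checking what will be sent. The sketch below uses only the standard library; `build_request` and `API_BASE` are hypothetical names introduced for illustration, and `YOUR_KEY` is a dummy value.

```python
from urllib.parse import urlencode

API_BASE = "https://api.tikhub.io/api/v1"


def build_request(platform: str, share_url: str, api_key: str):
    """Return (full_url, headers) for one share-URL lookup; performs no I/O."""
    if platform not in ("tiktok", "douyin"):
        raise ValueError(f"unsupported platform: {platform!r}")
    endpoint = f"{API_BASE}/{platform}/app/v3/fetch_one_video_by_share_url"
    query = urlencode({"share_url": share_url})  # percent-encodes the share link
    headers = {"Authorization": f"Bearer {api_key}"}
    return f"{endpoint}?{query}", headers


url, headers = build_request(
    "tiktok", "https://www.tiktok.com/@user/video/123", "YOUR_KEY"
)
```

The share link is percent-encoded because it is itself a URL carried inside a query parameter.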

## Mandatory prerequisite: valid TikHub API key

This skill may be used only when a **valid TikHub API key** is available.

Agent rule:

1. First check whether this skill file already contains a usable API key value in the `API_KEY` default section.
2. If the key is missing, empty, placeholder-only, or otherwise unavailable, **stop the current task immediately**.
3. The agent must then tell the human user:
   - they need to go to [https://tikhub.io/](https://tikhub.io/) and apply for their own **paid personal API key**
   - then fill that key into this file: `/AWorld/aworld-skills/tikhub_download/SKILL.md`
4. Until the human has provided and filled in a valid key, do **not** continue with download, data fetching, retry, workaround, or fabricated-output behavior.

Concrete example:

- If this file no longer contains a real key in the relevant `API_KEY` location, treat the key as unavailable and follow the stop-and-notify rule above.
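
The stop-and-notify rule could be enforced in code along these lines. This is a sketch under assumptions: `usable_api_key`, `require_api_key`, and the `PLACEHOLDERS` set are hypothetical (the skill file lists no placeholder spellings), and the check only detects missing or placeholder values; it cannot confirm that the TikHub service accepts the key.

```python
import os

# Assumed placeholder spellings; the skill file does not list any.
PLACEHOLDERS = {"", "your_api_key", "your-api-key", "<api_key>", "xxx", "changeme"}


def usable_api_key(value):
    """Return the stripped key, or None if it is missing, empty, or a placeholder."""
    key = (value or "").strip()
    if key.lower() in PLACEHOLDERS:
        return None
    return key


def require_api_key(env=None):
    """Return a usable key, or exit with instructions for the human user."""
    env = os.environ if env is None else env
    key = usable_api_key(env.get("TIKHUB_API_KEY"))
    if key is None:
        raise SystemExit(
            "No valid TikHub API key found. Apply for your own key at "
            "https://tikhub.io/ and fill it in before retrying."
        )
    return key
```

Calling `require_api_key()` before any fetch stops the run before a request is made without a key.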

---

## Part A — `tikhub_independent.py` (download + optional raw JSON)

Save the following as `tikhub_independent.py`.

**Behavior note:** After each `fetch_video_info` call, if saving is enabled (default), the **entire** parsed JSON object is written with `json.dump(..., indent=2, ensure_ascii=False)` — this is the **audit / replay** artifact for downstream tooling, not the simplified extract.

```python
#!/usr/bin/env python3
"""
TikTok/Douyin: download MP4 + metrics via TikHub API. Optional: save full raw API JSON per request.

Requires: pip install httpx

Usage:
  python tikhub_independent.py one "https://www.tiktok.com/@user/video/123"
  python tikhub_independent.py one "URL" --no-save-raw
  python tikhub_independent.py batch urls.txt
  python tikhub_independent.py batch urls.txt --raw-dir my_raw_dir --max-workers 4

Raw JSON default directory (relative to current working directory): ./tikhub_api_raw
"""
from __future__ import annotations

import argparse
import concurrent.futures
import hashlib
import json
import os
import re
import sys
from typing import Any, Dict, List
from urllib.parse import urlparse

import httpx

API_KEY = os.getenv(
    "TIKHUB_API_KEY",
    "",
).strip()

MAX_WORKERS_CAP = 10
DEFAULT_OUT = os.path.expanduser("~/Downloads/tikhub_independent")
DEFAULT_RAW_DIR = "tikhub_api_raw"


def clean_name(name: str, max_len: int = 60) -> str:
    name = re.sub(r'[\\/:*?"<>|]+', "_", (name or "").strip())
    name = re.sub(r"\s+", " ", name).strip()
    return (name[:max_len] or "video").strip(" ._")


def platform_from_url(url: str) -> str:
    host = (urlparse(url).netloc or "").lower()
    if "douyin.com" in host:
        return "douyin"
    return "tiktok"


def fetch_video_info(api_key: str, share_url: str) -> dict:
    platform = platform_from_url(share_url)
    endpoint = f"https://api.tikhub.io/api/v1/{platform}/app/v3/fetch_one_video_by_share_url"
    headers = {"Authorization": f"Bearer {api_key}", "Accept": "*/*"}
    params = {"share_url": share_url}
    with httpx.Client(timeout=30.0, follow_redirects=True) as client:
        resp = client.get(endpoint, headers=headers, params=params)
        resp.raise_for_status()
        return resp.json()


def safe_raw_filename(raw: dict, share_url: str) -> str:
    data = raw.get("data") or {}
    detail = data.get("aweme_detail")
    if not detail and data.get("aweme_details"):
        detail = (data.get("aweme_details") or [None])[0]
    aid = (detail or {}).get("aweme_id") or "unknown"
    # Slicing is a no-op on strings shorter than 16 characters.
    rid = (raw.get("request_id") or "noreq").replace("-", "")[:16]
    if aid == "unknown":
        h = hashlib.sha256(share_url.encode("utf-8")).hexdigest()[:10]
        return f"raw_unknown_{h}_{rid}.json"
    return f"raw_{aid}_{rid}.json"


def save_raw_json(raw: dict, share_url: str, raw_dir: str) -> str:
    os.makedirs(raw_dir, exist_ok=True)
    path = os.path.join(raw_dir, safe_raw_filename(raw, share_url))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(raw, f, ensure_ascii=False, indent=2)
    return path


def extract_clean_data(raw: dict) -> dict:
    # Use `or {}` so a present-but-null value does not break the .get() chain.
    data = raw.get("data") or {}
    detail = data.get("aweme_detail")
    if not detail and data.get("aweme_details"):
        detail = data["aweme_details"][0]
    if not detail:
        return {}

    video = detail.get("video") or {}
    play = video.get("play_addr") or {}
    url_list = play.get("url_list") or []
    video_url = url_list[0] if url_list else ""

    author = detail.
```