
tikhub-xiaohongshu-search

This skill provides lightweight Xiaohongshu image search via TikHub's API, optimized for single-request usage with curl or minimal Python. It saves raw API JSON responses and includes an optional post-processor to convert results into CSV or simplified JSON formats. Use this when you need straightforward keyword-based image searches, page-based pagination, or structured note and image metadata without heavy wrapper dependencies.

Install in Claude Code
git clone --depth 1 https://github.com/inclusionAI/AWorld /tmp/tikhub-xiaohongshu-search && cp -r /tmp/tikhub-xiaohongshu-search/aworld-skills/xiaohongshu_search ~/.claude/skills/tikhub-xiaohongshu-search
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# TikHub Xiaohongshu Search

## What this skill gives you

This skill is optimized for the **common case: one keyword search request**.

It provides:

1. **Minimal request patterns**
   - `curl` for quickest validation
   - tiny `httpx` example for people who prefer Python

2. **Raw JSON saving**
   - save the full TikHub response after each request
   - useful for audit, replay, and later post-processing

3. **One optional post-processor**
   - `postprocess_xiaohongshu_raw.py`
   - reads one raw file or a directory of raw files
   - writes `xiaohongshu_search_summary.csv` and `xiaohongshu_search_summary.json`

4. **Optional pagination guidance**
   - enough information to fetch later pages
   - intentionally brief, not the main path

This skill does **not** import `TikHub-Multi-Functional-Downloader` or any other project package.

## API key requirement

This skill intentionally does **not** contain any API key.

Use one of these:

- environment variable: `TIKHUB_API_KEY`
- ask the user to provide an API key explicitly

If the key is missing, stop and ask for it instead of hardcoding one into scripts.

## Install

```bash
pip install httpx
```

Post-processor: **no extra packages**.

## API (for reference)

- Image search:
  `GET https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=...&page=1&source=explore_feed`
- Header:
  `Authorization: Bearer <API_KEY>`

## Notes from real requests

- In `curl`, Chinese keywords should be URL-encoded. Directly putting `壁纸` into the query caused `400`, while `%E5%A3%81%E7%BA%B8` succeeded.
- A working minimal first-page request was:
  `keyword=%E5%A3%81%E7%BA%B8&page=1&source=explore_feed`
- The first-page response returns pagination context:
  `search_id`, `search_session_id`, `word_request_id`, and `next_page`
- Search results are in:
  `data.data.items`
- Useful nested sections include:
  `image_info`, `note_info`, `share_info`, and `user_info`
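
The URL-encoding note above can be reproduced with the Python standard library. This is a minimal sketch: the helper name `build_search_url` is hypothetical, and only the endpoint and the `keyword`, `page`, and `source` parameters come from the notes above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images"


def build_search_url(keyword: str, page: int = 1, source: str = "explore_feed") -> str:
    """Return the search URL with the keyword percent-encoded as UTF-8.

    Hypothetical helper for illustration; it is not part of the skill itself.
    """
    # quote_via=quote encodes spaces as %20 instead of '+'
    query = urlencode(
        {"keyword": keyword, "page": page, "source": source},
        quote_via=quote,
    )
    return f"{BASE_URL}?{query}"


print(quote("壁纸", safe=""))  # %E5%A3%81%E7%BA%B8
print(build_search_url("壁纸"))
```

The resulting string can be passed directly to `curl` in place of a hand-encoded URL.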

---

## Preferred path: single request

### 1. Quickest: `curl`

First page:

```bash
curl --location --request GET "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=%E5%A3%81%E7%BA%B8&page=1&source=explore_feed" \
--header "Authorization: Bearer $TIKHUB_API_KEY"
```

Another keyword example:

```bash
curl --location --request GET "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=%E6%B2%BB%E6%84%88%E7%B3%BB&page=1&source=explore_feed" \
--header "Authorization: Bearer $TIKHUB_API_KEY"
```

### 2. Preferred Python pattern: tiny `httpx`

If the user wants Python, prefer a **small request snippet**, not a framework.

Search and save raw JSON:

```python
import json
import os
import urllib.parse
import httpx

api_key = os.getenv("TIKHUB_API_KEY", "").strip()
if not api_key:
    raise SystemExit("Missing TIKHUB_API_KEY")

keyword = "壁纸"
url = "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images"
params = {
    "keyword": keyword,
    "page": 1,
    "source": "explore_feed",
}
headers = {"Authorization": f"Bearer {api_key}", "Accept": "*/*"}

with httpx.Client(timeout=30.0, follow_redirects=True) as client:
    raw = client.get(url, params=params, headers=headers).json()

safe_keyword = urllib.parse.quote(keyword, safe="")
with open(f"xiaohongshu_search_{safe_keyword}.json", "w", encoding="utf-8") as f:
    json.dump(raw, f, ensure_ascii=False, indent=2)

items = raw.get("data", {}).get("data", {}).get("items", [])
for item in items[:5]:
    note = item.get("note_info", {})
    share = item.get("share_info", {})
    user = item.get("user_info", {})
    print(note.get("title", ""))
    print(share.get("link", ""))
    print(user.get("nickname", ""))
```

---

## Save raw JSON by default

For this workflow, the recommended default is:

1. request the API
2. save the **full** raw JSON immediately
3. print only a few useful fields for quick inspection
4. optionally run the post-processor later

Suggested file naming:

- first page raw: `search_<keyword>_page1_<request_id>.json`
- next page raw: `search_<keyword>_page2_<request_id>.json`

If `request_id` is unavailable, hash the keyword plus page number.
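
One way to implement this naming rule is sketched below. The function name `raw_filename`, the use of SHA-256, and the 12-character hash length are assumptions chosen for illustration; the keyword is percent-encoded so the file name stays ASCII, mirroring the `httpx` example above. Where the request ID comes from is left to the caller.

```python
import hashlib
import re
from typing import Optional
from urllib.parse import quote


def raw_filename(keyword: str, page: int, request_id: Optional[str] = None) -> str:
    """Build search_<keyword>_page<N>_<id>.json for one raw response.

    Hypothetical helper; the hash fallback is one possible reading of
    "hash the keyword plus page number".
    """
    safe_keyword = quote(keyword, safe="")
    # Keep only filename-safe characters from the request id, if one was given.
    suffix = re.sub(r"[^A-Za-z0-9_-]", "", request_id or "")
    if not suffix:
        # Fallback: short, deterministic hash of keyword + page.
        digest = hashlib.sha256(f"{keyword}|{page}".encode("utf-8")).hexdigest()
        suffix = digest[:12]
    return f"search_{safe_keyword}_page{page}_{suffix}.json"


print(raw_filename("壁纸", 1, "abc-123"))
print(raw_filename("壁纸", 2))
```

Because the fallback hash is deterministic, repeating the same keyword and page overwrites the earlier file rather than creating a duplicate.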

---

## Pagination

Only care about this if the user wants page 2 or beyond.

From the first response, keep these fields:

- `search_id`
- `search_session_id`
- `word_request_id`
- `next_page`

Then use them in the next request:

```bash
curl --location --request GET "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=%E5%A3%81%E7%BA%B8&page=2&search_id=<search_id>&search_session_id=<search_session_id>&word_request_id=<word_request_id>&source=explore_feed" \
--header "Authorization: Bearer $TIKHUB_API_KEY"
```

If the endpoint behavior changes, trust the latest response fields over assumptions.
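
For Python users, the same step could look like the sketch below. The helper names `find_key` and `next_page_params` are hypothetical. Because this document does not state where the pagination fields sit inside the response, the sketch searches the parsed JSON recursively. It also assumes `next_page` holds an integer page number and falls back to `2` otherwise. The sample payload is made up purely to exercise the code.

```python
from typing import Any, Dict, Optional

CONTEXT_KEYS = ("search_id", "search_session_id", "word_request_id")


def find_key(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first non-empty value stored under `key`."""
    if isinstance(node, dict):
        if node.get(key) not in (None, ""):
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def next_page_params(first_response: dict, keyword: str,
                     source: str = "explore_feed") -> Dict[str, Any]:
    """Build query params for the follow-up request from a first-page response."""
    next_page = find_key(first_response, "next_page")
    # Assumption: next_page is an integer page number; otherwise request page 2.
    page = next_page if type(next_page) is int else 2
    params: Dict[str, Any] = {"keyword": keyword, "page": page, "source": source}
    for key in CONTEXT_KEYS:
        value = find_key(first_response, key)
        if value is not None:
            params[key] = value
    return params


# Made-up stub, not a real TikHub response; real nesting may differ.
sample = {"data": {"data": {"search_id": "abc", "search_session_id": "def",
                            "word_request_id": "ghi", "next_page": 2, "items": []}}}
print(next_page_params(sample, "壁纸"))
```

The returned dictionary can be passed as `params=` to the `httpx` request shown earlier, which handles URL encoding of the keyword.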

---

## Post-process raw JSON

Save as `postprocess_xiaohongshu_raw.py` (**stdlib only**).

Input:

- one raw search JSON file
- or a directory containing multiple raw JSON files

Output:

- `xiaohongshu_search_summary.csv`
- `xiaohongshu_search_summary.json`

```python
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import csv
import json
import os
import sys
from glob import glob
from typing import Any, Dict, List


def collect_inputs(path: str) -> List[str]:
    if os.path.isfile(path):
        return [path]
    if os.path.isdir(path):
        return sorted(glob(os.path.join(path, "*.json")))
    raise FileNotFoundError(path)


def as_list(value: Any) -> List[dict]:
    return value if isinstance(value, list) else []


def flatten_for_csv(row: Dict[str, Any]) -> Dict[str, Any]:
    out: Dict[str, Any] = {}
    for k, v in row.items():
        if v is None:
            out[k] = ""
        elif isinstance(v, (dict, list)):
            out[k] = json.dumps(v, ensure_ascii=False)
        else:
            out[k] = v
    return out


def simplify_raw(raw: dict, source_file: str) -> Dict[str, Any]:
    outer = raw.get("data") or {}
    inner = outer.get("data") or {}
    items = as_list(inner.get("items"))
    first = item
```