tikhub-xiaohongshu-search
This skill provides lightweight Xiaohongshu image search via TikHub's API, optimized for single-request usage with curl or minimal Python. It saves raw API JSON responses and includes an optional post-processor to convert results into CSV or simplified JSON formats. Use this when you need straightforward keyword-based image searches, page-based pagination, or structured note and image metadata without heavy wrapper dependencies.
git clone --depth 1 https://github.com/inclusionAI/AWorld /tmp/tikhub-xiaohongshu-search && cp -r /tmp/tikhub-xiaohongshu-search/aworld-skills/xiaohongshu_search ~/.claude/skills/tikhub-xiaohongshu-search
# TikHub Xiaohongshu Search
## What this skill gives you
This skill is optimized for the **common case: one keyword search request**.
It provides:
1. **Minimal request patterns**
- `curl` for quickest validation
- tiny `httpx` example for people who prefer Python
2. **Raw JSON saving**
- save the full TikHub response after each request
- useful for audit, replay, and later post-processing
3. **One optional post-processor**
- `postprocess_xiaohongshu_raw.py`
- reads one raw file or a directory of raw files
- writes `xiaohongshu_search_summary.csv` and `xiaohongshu_search_summary.json`
4. **Optional pagination guidance**
- enough information for later page turning
- intentionally brief, not the main path
This skill does **not** import `TikHub-Multi-Functional-Downloader` or any other project package.
## API key requirement
This skill intentionally does **not** contain any API key.
Use one of these:
- environment variable: `TIKHUB_API_KEY`
- ask the user to provide an API key explicitly
If the key is missing, stop and ask for it instead of hardcoding one into scripts.
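The key-handling rule above could be sketched as a small helper. The function name `resolve_api_key` and its parameters are hypothetical and not part of this skill; only the `TIKHUB_API_KEY` variable name comes from this document.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Return a user-supplied key or TIKHUB_API_KEY; never a hardcoded default.

    Hypothetical helper for illustration only.
    """
    key = (explicit or env.get("TIKHUB_API_KEY", "")).strip()
    if not key:
        # Stop so the caller can ask the user for a key.
        raise SystemExit("Missing TIKHUB_API_KEY: set the variable or provide a key")
    return key
```

Passing `env` explicitly keeps the helper easy to exercise without touching the real environment.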
## Install
```bash
pip install httpx
```
Post-processor: **no extra packages**.
## API (for reference)
- Image search:
`GET https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=...&page=1&source=explore_feed`
- Header:
`Authorization: Bearer <API_KEY>`
## Notes from real requests
- In `curl`, Chinese keywords should be URL-encoded. Putting the raw `壁纸` into the query returned HTTP `400`, while the encoded form `%E5%A3%81%E7%BA%B8` succeeded.
- A working minimal first-page request was:
`keyword=%E5%A3%81%E7%BA%B8&page=1&source=explore_feed`
- The first-page response returns pagination context:
`search_id`, `search_session_id`, `word_request_id`, and `next_page`
- Search results are in:
`data.data.items`
- Useful nested sections include:
`image_info`, `note_info`, `share_info`, and `user_info`
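To produce the URL-encoded keyword for a `curl` command, a minimal sketch using only the Python standard library is shown here. It reproduces the encoded keyword and query string from the notes above; the variable names are illustrative.

```python
from urllib.parse import quote, urlencode

keyword = "壁纸"

# Percent-encode the keyword as UTF-8 bytes for pasting into a curl URL.
encoded = quote(keyword, safe="")
print(encoded)  # %E5%A3%81%E7%BA%B8

# Build the full first-page query string in one step.
query = urlencode({"keyword": keyword, "page": 1, "source": "explore_feed"})
print(query)  # keyword=%E5%A3%81%E7%BA%B8&page=1&source=explore_feed
```

When the request is made from Python with a `params` dictionary, the HTTP client typically performs this encoding itself, so manual encoding mainly matters for hand-written URLs.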
---
## Preferred path: single request
### 1. Quickest: `curl`
First page:
```bash
curl --location --request GET "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=%E5%A3%81%E7%BA%B8&page=1&source=explore_feed" \
--header "Authorization: Bearer $TIKHUB_API_KEY"
```
Another keyword example:
```bash
curl --location --request GET "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=%E6%B2%BB%E6%84%88%E7%B3%BB&page=1&source=explore_feed" \
--header "Authorization: Bearer $TIKHUB_API_KEY"
```
### 2. Preferred Python pattern: tiny `httpx`
If the user wants Python, prefer a **small request snippet**, not a framework.
Search and save raw JSON:
```python
import json
import os
import urllib.parse

import httpx

# Read the key from the environment; never hardcode it.
api_key = os.getenv("TIKHUB_API_KEY", "").strip()
if not api_key:
    raise SystemExit("Missing TIKHUB_API_KEY")

keyword = "壁纸"
url = "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images"
params = {
    "keyword": keyword,
    "page": 1,
    "source": "explore_feed",
}
headers = {"Authorization": f"Bearer {api_key}", "Accept": "*/*"}

with httpx.Client(timeout=30.0, follow_redirects=True) as client:
    raw = client.get(url, params=params, headers=headers).json()

# Save the full raw response first, using a filesystem-safe keyword.
safe_keyword = urllib.parse.quote(keyword, safe="")
with open(f"xiaohongshu_search_{safe_keyword}.json", "w", encoding="utf-8") as f:
    json.dump(raw, f, ensure_ascii=False, indent=2)

# Print a few fields from the first five results for quick inspection.
items = raw.get("data", {}).get("data", {}).get("items", [])
for item in items[:5]:
    note = item.get("note_info", {})
    share = item.get("share_info", {})
    user = item.get("user_info", {})
    print(note.get("title", ""))
    print(share.get("link", ""))
    print(user.get("nickname", ""))
```
---
## Save raw JSON by default
For this workflow, the recommended default is:
1. request the API
2. save the **full** raw JSON immediately
3. print only a few useful fields for quick inspection
4. optionally run the post-processor later
Suggested file naming:
- first page raw: `search_<keyword>_page1_<request_id>.json`
- next page raw: `search_<keyword>_page2_<request_id>.json`
If `request_id` is unavailable, hash the keyword plus page number.
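A minimal sketch of this naming scheme, including the hash fallback, might look like the following. The helper name `raw_filename`, the choice of SHA-256, and the 12-character digest prefix are assumptions made for illustration; this document does not say where `request_id` comes from, so it is passed in by the caller.

```python
import hashlib
from typing import Optional
from urllib.parse import quote


def raw_filename(keyword: str, page: int, request_id: Optional[str] = None) -> str:
    """Build search_<keyword>_page<N>_<request_id>.json (hypothetical helper)."""
    # Percent-encode the keyword so the name is safe on any filesystem.
    safe_keyword = quote(keyword, safe="")
    if not request_id:
        # Fallback: a deterministic hash of keyword plus page number.
        digest = hashlib.sha256(f"{keyword}:{page}".encode("utf-8")).hexdigest()
        request_id = digest[:12]
    return f"search_{safe_keyword}_page{page}_{request_id}.json"
```

Because the fallback is deterministic, repeating the same keyword and page overwrites the earlier file instead of accumulating duplicates.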
---
## Pagination
Only care about this if the user wants page 2 or beyond.
From the first response, keep these fields:
- `search_id`
- `search_session_id`
- `word_request_id`
- `next_page`
Then use them in the next request:
```bash
curl --location --request GET "https://api.tikhub.io/api/v1/xiaohongshu/app_v2/search_images?keyword=%E5%A3%81%E7%BA%B8&page=2&search_id=<search_id>&search_session_id=<search_session_id>&word_request_id=<word_request_id>&source=explore_feed" \
--header "Authorization: Bearer $TIKHUB_API_KEY"
```
If the endpoint behavior changes, trust the latest response fields over assumptions.
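In Python, carrying the pagination fields into the next request could be sketched as below. This document does not state where in the response these fields live, so the sketch assumes they appear under either `data.data` or `data` and checks both; the helper name `next_page_params` is hypothetical.

```python
from typing import Any, Dict

PAGINATION_KEYS = ("search_id", "search_session_id", "word_request_id")


def next_page_params(raw: Dict[str, Any], keyword: str, current_page: int = 1,
                     source: str = "explore_feed") -> Dict[str, Any]:
    """Build query params for the following page (hypothetical helper)."""
    outer = raw.get("data") or {}
    inner = outer.get("data") if isinstance(outer, dict) else None
    # Assumption: pagination fields sit in data.data or data; look in both.
    scopes = [s for s in (inner, outer) if isinstance(s, dict)]

    def find(key: str) -> Any:
        for scope in scopes:
            if scope.get(key) not in (None, ""):
                return scope[key]
        return None

    # Prefer the server-provided next_page; otherwise increment locally.
    params: Dict[str, Any] = {
        "keyword": keyword,
        "page": find("next_page") or current_page + 1,
        "source": source,
    }
    # Copy only the pagination fields the response actually contained.
    for key in PAGINATION_KEYS:
        value = find(key)
        if value is not None:
            params[key] = value
    return params
```

The returned dictionary can be passed as the `params` argument of the same `GET` request used for the first page.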
---
## Post-process raw JSON
Save as `postprocess_xiaohongshu_raw.py` (**stdlib only**).
Input:
- one raw search JSON file
- or a directory containing multiple raw JSON files
Output:
- `xiaohongshu_search_summary.csv`
- `xiaohongshu_search_summary.json`
```python
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import csv
import json
import os
import sys
from glob import glob
from typing import Any, Dict, List


def collect_inputs(path: str) -> List[str]:
    """Return one file path, or every *.json file in a directory (sorted)."""
    if os.path.isfile(path):
        return [path]
    if os.path.isdir(path):
        return sorted(glob(os.path.join(path, "*.json")))
    raise FileNotFoundError(path)


def as_list(value: Any) -> List[dict]:
    """Return the value if it is a list, otherwise an empty list."""
    return value if isinstance(value, list) else []


def flatten_for_csv(row: Dict[str, Any]) -> Dict[str, Any]:
    """Make a row CSV-safe: None becomes "", nested values become JSON strings."""
    out: Dict[str, Any] = {}
    for k, v in row.items():
        if v is None:
            out[k] = ""
        elif isinstance(v, (dict, list)):
            out[k] = json.dumps(v, ensure_ascii=False)
        else:
            out[k] = v
    return out


def simplify_raw(raw: dict, source_file: str) -> Dict[str, Any]:
    # Search results live under data.data.items in the raw response.
    outer = raw.get("data") or {}
    inner = outer.get("data") or {}
    items = as_list(inner.get("items"))
    # first = item... (the listing is cut off at this point in the source)
```