Skill · 75 repo stars · updated 1mo ago
rcl-score
Install in Claude Code
git clone --depth 1 https://github.com/ZhenyuanPAN822/relationship-candlestick-lab /tmp/rcl-score && cp -r /tmp/rcl-score/skill ~/.claude/skills/rcl-score
Then open a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# Relationship Candlestick Lab · Scoring Skill (v3.1)
This skill has TWO operating modes. Detect which one you're in by looking
at the first user message:
- **Entry Mode** — user just typed `/rcl-score` (or asked you to "score
my chat / 画 K 线 (draw the K-line)") with no input file yet. Run the **Entry Protocol**
below.
- **Batch Scoring Mode** — your user message starts with "Score each TURN
below" or contains a `=== TURNS ===` block (this is how the API pipeline
invokes you). Skip the Entry Protocol and jump directly to the
**Scoring Rules** section and output JSONL only.
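The mode check above can be sketched in Python. This is a minimal illustration rather than the skill's actual implementation: the function name `detect_mode` and the returned strings are hypothetical, and only the two Batch Scoring triggers come from the text.

```python
def detect_mode(first_user_message: str) -> str:
    """Return "batch" or "entry" based on the first user message.

    Hypothetical helper: only the two Batch Scoring triggers come from the
    skill text; everything else falls back to Entry Mode.
    """
    text = first_user_message.lstrip()
    if text.startswith("Score each TURN below") or "=== TURNS ===" in text:
        return "batch"
    return "entry"
```

Falling back to Entry Mode keeps the check conservative: anything that is not clearly a pipeline call is treated as a direct invocation.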
---
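## ⓪ Entry Protocol (only when the user invokes the skill directly)
### What to say to the user (this is the only user-facing output; do not repeat Steps 1–4 to the user)
Reply to the user (in Chinese, concise, within 5–8 lines) with the following content:
> I will turn your chat history into a K-line (candlestick) chart, where each candle represents the change in relationship strength over a period of time.
>
> **Please prepare a chat export file** (any one of the following):
> - WeChat CSV export (recommended; pywxdump or Memotrace both work)
> - or JSON / plain text (one message per line: `YYYY-MM-DD HH:MM[:SS] sender: message`)
>
> **Paste the file's absolute path to me.**
>
> ⚠️ **Recommended model / effort**:
> - **Claude**: `Sonnet 4.6` + effort `low`
> - **GPT series**: `GPT-5 / 5.4 / 5.5` + effort `low`
>
> ⏱ **Estimated time**: about **7 minutes** per 1000 messages (depending on model / effort).
>
> Everything is processed locally; chat data is not uploaded to the cloud.
Wait for the user to reply with the file path before continuing.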
### Steps 1–4 (execute internally, in order; do not paste the commands to the user)
**Step 1: CSV → messages.jsonl**
`<job_name>` is the source file name without its extension.
```bash
python scripts/wechat_to_standard.py \
--input "<path provided by the user>" \
--output "output/_jobs/<job_name>/messages_standard.csv" \
--me me --them other
```
Then use Python to convert the standard CSV into messages.jsonl (each line carries an `i` index).
If the user provided JSON/TXT, use `python -m relationship_candlestick.cli prepare ...` instead.
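The CSV-to-JSONL conversion can be sketched as follows. The column names `ts`, `sender`, and `text`, the 0-based `i`, and the function name `csv_to_jsonl` are assumptions for illustration; the actual layout of `messages_standard.csv` is not specified here.

```python
import csv
import json

def csv_to_jsonl(csv_path: str, jsonl_path: str) -> int:
    """Write one JSON object per CSV row, adding a 0-based `i` index.

    Assumes (hypothetically) a header row with `ts`, `sender`, `text` columns.
    Returns the number of messages written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for i, row in enumerate(csv.DictReader(src)):
            record = {"i": i, "ts": row["ts"],
                      "sender": row["sender"], "text": row["text"]}
            # ensure_ascii=False keeps Chinese text readable in the output file
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```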
**Step 2: Preprocess (drop single-character messages, aggregate into turns)**
```bash
python scripts/preprocess_turns.py \
--input output/_jobs/<job_name>/messages.jsonl \
--out-dir output/_jobs/<job_name>/ \
--gap-min 10
```
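To make the preprocessing idea concrete, here is a rough sketch of one way turns could be built. The grouping rule (a new turn starts when the sender changes or the gap exceeds `gap_min` minutes) and the field names `turn` and `msg_is` are assumptions; `preprocess_turns.py` may behave differently.

```python
from datetime import datetime, timedelta

def parse_ts(ts: str) -> datetime:
    """Parse `YYYY-MM-DD HH:MM[:SS]` timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S" if ts.count(":") == 2 else "%Y-%m-%d %H:%M"
    return datetime.strptime(ts, fmt)

def build_turns(messages, gap_min=10):
    """Drop single-character messages, then group the rest into turns.

    Assumed rule (not confirmed by the source): a new turn starts when the
    sender changes or when more than `gap_min` minutes pass between messages.
    """
    gap = timedelta(minutes=gap_min)
    turns = []
    last_ts = None
    for msg in messages:
        if len(msg["text"].strip()) <= 1:
            continue  # single-character messages are left out of turns
        ts = parse_ts(msg["ts"])
        same_turn = (
            turns
            and turns[-1]["sender"] == msg["sender"]
            and ts - last_ts <= gap
        )
        if same_turn:
            turns[-1]["msg_is"].append(msg["i"])
            turns[-1]["text"] += "\n" + msg["text"]
        else:
            turns.append({"turn": len(turns), "sender": msg["sender"],
                          "msg_is": [msg["i"]], "text": msg["text"]})
        last_ts = ts
    return turns
```

Keeping the original `i` values inside each turn is what would later allow turn-level scores to be mapped back to individual messages.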
**Step 3: Scoring (you, the LLM, score each turn according to the Scoring Rules below)**
Read `output/_jobs/<job_name>/turns.jsonl` and take 20–60 turns per batch
(effort low → 60 / medium → 40 / high → 20). Following the **Scoring Rules**, output
JSONL and append it to `output/_jobs/<job_name>/turns_scored.jsonl`.
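**🚫 Things you must never do in this step (important):**
- **Do not offer the user an "A / B / C" menu of scoring methods.** Once the skill is triggered there is only one path: you (the LLM in the IDE) score every turn from start to finish. There are no branches.
- **Do not recommend `scripts/score_turns_api.py` to the user as an option.** That script is for API-mode users who do not go through the skill, and it is **irrelevant** to skill users. Even though it is visible in the repo, treat it as if it does not exist.
- **Do not stop to ask "should I continue?" just because there are many turns.** Whether there are 500, 1000, or 2000 turns, keep scoring in batches sized by effort without asking for permission mid-way. If run time is a concern, the Entry Protocol already told the user "~7 minutes per 1000 messages" before starting, so the user already knows.
- **Do not recommend a different workflow because "this will eat the current session's context".** This is how the skill is designed: the context is sufficient, the JSONL output of earlier batches can be dropped, and the framework reassembles everything by `i`.
- **Do not summarize mid-way with "X batches done, Y remaining, should I continue?".** Finish the work quietly, then speak.
**Correct approach:** compute the total number of batches → score batch by batch → append to turns_scored.jsonl → move on to Step 4 only after everything is done. During all of Step 3, output no conversation to the user: only call tools and only produce JSONL.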
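The batch planning described above can be sketched as a small helper. The batch sizes come from the Step 3 text; the names `BATCH_SIZE` and `plan_batches` are hypothetical.

```python
import math

# Batch sizes per effort level, taken from the Step 3 text.
BATCH_SIZE = {"low": 60, "medium": 40, "high": 20}

def plan_batches(n_turns: int, effort: str = "low"):
    """Return (start, end) index pairs covering all turns, end exclusive."""
    size = BATCH_SIZE[effort]
    n_batches = math.ceil(n_turns / size)
    return [(b * size, min((b + 1) * size, n_turns)) for b in range(n_batches)]
```

For example, 130 turns at effort `low` yield three batches, the last one holding the remaining 10 turns.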
**Step 4: Expand turn scores back to per-message (`i`) level**
```bash
python scripts/expand_turns.py \
--turns output/_jobs/<job_name>/turns.jsonl \
--turns-scored output/_jobs/<job_name>/turns_scored.jsonl \
--auto output/_jobs/<job_name>/auto_scored.jsonl \
--messages output/_jobs/<job_name>/messages.jsonl \
--out output/_jobs/<job_name>/scored.jsonl
```
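As a purely illustrative sketch of what such an expansion might look like: every field name below (`turn`, `msg_is`) and the rule that each message inherits its turn's deltas are assumptions, not the documented behavior of `expand_turns.py`.

```python
def expand_scores(turns, turns_scored, auto_scored):
    """Map turn-level deltas back to message index `i`.

    All field names and the "each message inherits its turn's deltas" rule
    are assumptions for illustration, not the behavior of expand_turns.py.
    """
    deltas_by_turn = {s["turn"]: s for s in turns_scored}
    by_i = {}
    for rec in auto_scored:  # records assumed to be keyed by `i` already
        by_i[rec["i"]] = rec
    for turn in turns:
        scored = deltas_by_turn[turn["turn"]]
        for i in turn["msg_is"]:
            by_i[i] = {
                "i": i,
                "delta_vs_prior": scored["delta_vs_prior"],
                "delta_vs_atmosphere": scored["delta_vs_atmosphere"],
            }
    # Sort by `i` so the result follows the original message order.
    return [by_i[i] for i in sorted(by_i)]
```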
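### What to say to the user when finished (this is the only user-facing output)
> ✅ Scoring complete! Path to scored.jsonl:
>
> `<absolute path, e.g. E:/.../output/_jobs/myjob/scored.jsonl>`
>
> **Next step**: open the K-line web page (if the server is not running yet, run `python -m relationship_candlestick.cli serve`)
>
> 1. Open http://127.0.0.1:7000 in your browser
> 2. Choose the **"已有 scored.jsonl"** (existing scored.jsonl) flow
> 3. Paste the path above into the **scored.jsonl 绝对路径** (scored.jsonl absolute path) input box
> 4. Click **"▶ 加载并显示 K 线"** (load and display the K-line)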
Exit after completing the Entry Protocol. **Do not go on to score messages by the rules**;
scoring is handled by the subsequent Batch Scoring Mode.
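> **About `scripts/score_turns_api.py` in the repo (for the LLM executing this skill):**
> That script belongs to a **separate, independent flow**: it is for users who
> **do not invoke the skill** and instead ingest a CSV directly through the frontend,
> with the server automatically calling an external API concurrently. **They will
> never trigger this skill file.** Conversely, **since the user triggered the skill,
> they have chosen the path where the LLM in the IDE (you) does the scoring
> personally.** Do not present that script to the user as an alternative, and do not
> suggest running it just because there are many turns; doing so amounts to
> refusing the task you were assigned.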
---
# Scoring Rules · v3.1 (applies to Batch Scoring Mode)
You are the **semantic scorer** of a relationship-K-line system. Your job is
to read messages **in order, in context** and emit two **relative deltas** per
message — never absolute scores. The framework does all arithmetic, recurrence,
and time decay.
The whole point of using Claude here is **contextual judgment**. Sarcasm,
callbacks, awkward silences, and inside jokes are exactly what you must read.
---
## 🚨 Most important principle: every message moves the needle
**No two consecutive messages are exactly the same temperature.** Even when
the topic and mood feel "identical", real conversations have constant
micro-variation:
- A reply is slightly warmer or cooler than the message it answers
- A continuation message is slightly weaker than the original (loss of momentum)
- An emoji-only reply is slightly lighter than a text reply
- A "嗯" ("mm") after substance is a small cooling
- A "哈哈" ("haha") after the partner's joke is a small acknowledgment lift
**Default to small nonzero deltas (±0.2 to ±0.5), not 0.**
`0, 0` is a strong claim that means "this message contributes literally nothing —
identical temperature to prior AND to atmosphere". This should be **rare**,
reserved for cases like:
- A message inside an opaque sub-thread (file path, link, phone number)
- A literal repeat ("嗯" "嗯" "嗯" — even then the third is -0.2, not 0)
If you find yourself outputting `0, 0` for more than ~15% of messages,
**you are under-scoring**. Real chats have constant ebb and flow.
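The ~15% guideline can be checked mechanically on a scored batch. The sketch below assumes each record is a dict with `delta_vs_prior` and `delta_vs_atmosphere` keys; the function names are hypothetical.

```python
def zero_zero_rate(scored):
    """Fraction of records whose two deltas are both exactly 0."""
    if not scored:
        return 0.0
    zeros = sum(
        1 for r in scored
        if r["delta_vs_prior"] == 0 and r["delta_vs_atmosphere"] == 0
    )
    return zeros / len(scored)

def is_under_scoring(scored, threshold=0.15):
    """True when `0, 0` outputs exceed the ~15% guideline."""
    return zero_zero_rate(scored) > threshold
```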
---
## Core principle: relative, not absolute
You **do not** score "this message has affection 5". There's no objective
anchor for that.
You **do** score "this message is +1 warmer than the prior message" and
"this message is +0.5 vs the recent atmosphere". Both are relative
comparisons you can actually make confidently.
Two reference frames:
- `delta_vs_prior` — change vs the **immediately previous** message
- `delta_vs_atmosphere` — change vs the **mean of recent messages**
The framework blends both: `delta_blend = 0.5 * vs_prior + 0.5 * vs_atmosphere`.
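As a quick worked example of the blend formula, using the "+1 vs prior" and "+0.5 vs atmosphere" values mentioned earlier (the function name is illustrative; the framework's own code is not shown here):

```python
def delta_blend(vs_prior: float, vs_atmosphere: float) -> float:
    """Equal-weight blend of the two relative deltas."""
    return 0.5 * vs_prior + 0.5 * vs_atmosphere

print(delta_blend(1.0, 0.5))  # 0.75
```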
---
## Input
Per API call you receive:
```json
{
"previous_relationship_index": 67.4,
"atmosphere": {
"recent_avg_index": 65.0,
"recent_avg_delta": 0.3,
"window_size": 20
},
"context_already_scored": [
{"i":..., "ts":..., "sender":..., "text":...,
"delta_vs_prior":..., "delta_vs_atmosphere