
panning-for-gold

Panning for Gold extracts and evaluates ideas from unstructured content like voice transcripts, brain dumps, and stream-of-consciousness notes. Use it when processing multi-topic captures that need systematic thread identification, deep brainstorming on promising concepts, and permanent storage of findings. The skill performs three phases: extracting every idea thread without filtering, evaluating high-signal ones, then synthesizing results into actionable outcomes saved to project files.

Repository: OB1

Install in Claude Code

git clone --depth 1 https://github.com/NateBJones-Projects/OB1 /tmp/panning-for-gold && cp -r /tmp/panning-for-gold/skills/panning-for-gold ~/.claude/skills/panning-for-gold

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Panning for Gold

## Overview

Transform raw brain dumps into evaluated, actionable idea inventories. Three phases: **Extract** every thread without filtering, **Evaluate** the highest-signal ones, then **Synthesize** into a permanent gold-found file.

**Core principle:** Every line gets examined. Nothing is dismissed as noise on the first pass. Personal threads, half-formed thoughts, and tangential observations often contain the highest-signal ideas.

## When to Use

- Voice transcripts (multi-speaker, timestamped)
- Stream-of-consciousness notes
- Brain dump markdown exports from ChatGPT/Gemini/Claude
- Any document where the user says "process this" or "what's in here"
- Multi-topic conversations that need thread extraction

## Critical Rules (Learned from Production Use)

These rules exist because they've been violated and caused wasted work:

1. **SAVE EVERYTHING TO PERMANENT FILES.** Phase 1 inventory, Phase 2 evaluations, and Phase 3 synthesis ALL get saved to files in the project's docs directory. Never rely on agent memory or temp task outputs surviving compaction.

2. **SUMMARIES FIRST, TRANSCRIPT SECOND.** If a summary/notes file exists alongside a transcript, use the summary as the primary extraction source. Only read the full transcript for: (a) exact quotes to support threads, (b) verifying completeness on the second pass. This saves 10-20K tokens per scan.

3. **EVALUATORS WRITE TO FILES.** Every background evaluator agent MUST write its evaluation to a permanent file (e.g., `docs/meetings/evaluations/YYYY-MM-DD-{slug}.md`) as part of its task. Do not depend on collecting agent return values.

4. **SYNTHESIS HAPPENS INLINE.** Do not dispatch a separate agent for synthesis. Write the gold-found file yourself after evaluators finish. If evaluators disappear (compaction, task ID loss), write the synthesis from your own reading.

5. **TWO PASSES ON TRANSCRIPTS.** Always run Phase 1 twice. First pass uses summary + targeted transcript reads. Second pass is a verification scan for missed threads. Present both inventories merged.
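
Rule 3 can be sketched as a small helper that an evaluator calls as the last step of its task. This is a minimal illustration, not part of the skill itself: the function names `slugify` and `write_evaluation` are hypothetical, and the only detail taken from the rule is the `docs/meetings/evaluations/YYYY-MM-DD-{slug}.md` path pattern.

```python
import re
from datetime import date
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def write_evaluation(title: str, body: str, day: date,
                     root: Path = Path("docs/meetings/evaluations")) -> Path:
    """Persist one evaluator's output so it survives compaction or task ID loss."""
    path = root / f"{day.isoformat()}-{slugify(title)}.md"
    path.parent.mkdir(parents=True, exist_ok=True)  # create the directory on first use
    path.write_text(body, encoding="utf-8")
    return path
```

Returning the path lets the orchestrating session record where each evaluation landed, so Phase 3 can read the files directly instead of depending on agent return values.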

## Process

```dot
digraph panning {
    "Receive raw input" [shape=box];
    "Save raw input to file" [shape=box, style=bold];
    "Read summary first (if exists)" [shape=box];
    "PHASE 1a: Extract from summary" [shape=box];
    "PHASE 1b: Verify against transcript" [shape=box];
    "Save inventory to file" [shape=box, style=bold];
    "Present to user" [shape=box];
    "User confirms?" [shape=diamond];
    "Targeted re-read of transcript" [shape=box];
    "PHASE 2: Evaluate top threads" [shape=box];
    "Evaluators write to files" [shape=box, style=bold];
    "PHASE 3: Write gold-found file" [shape=box, style=bold];
    "Update skill lessons" [shape=box];

    "Receive raw input" -> "Save raw input to file";
    "Save raw input to file" -> "Read summary first (if exists)";
    "Read summary first (if exists)" -> "PHASE 1a: Extract from summary";
    "PHASE 1a: Extract from summary" -> "PHASE 1b: Verify against transcript";
    "PHASE 1b: Verify against transcript" -> "Save inventory to file";
    "Save inventory to file" -> "Present to user";
    "Present to user" -> "User confirms?";
    "User confirms?" -> "PHASE 2: Evaluate top threads" [label="yes"];
    "User confirms?" -> "Targeted re-read of transcript" [label="no"];
    "Targeted re-read of transcript" -> "Save inventory to file";
    "PHASE 2: Evaluate top threads" -> "Evaluators write to files";
    "Evaluators write to files" -> "PHASE 3: Write gold-found file";
    "PHASE 3: Write gold-found file" -> "Update skill lessons";
}
```

## Phase 0: Save Raw Input

**BEFORE ANY ANALYSIS:** Save the raw transcript/brain dump to a file if it's not already saved. Order: save first, analyze second. This rule exists because of two violations in a single session (2026-03-13).

File naming: `docs/meetings/YYYY-MM-DD-{source}-transcript.md` or `docs/brainstorming/YYYY-MM-DD-{topic}.md`
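
The save-first rule could look like the sketch below. It assumes the meeting-transcript naming pattern above; the function name `save_raw_input` and its parameters are hypothetical, and the choice to leave an existing file untouched is an interpretation of "if it's not already saved".

```python
from datetime import date
from pathlib import Path


def save_raw_input(text: str, source: str, day: date,
                   root: Path = Path("docs/meetings")) -> Path:
    """Write the raw capture to disk before any analysis; never overwrite an existing file."""
    path = root / f"{day.isoformat()}-{source}-transcript.md"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
    return path
```

Calling this before any extraction step means a second call with the same source and date is a no-op, so the original capture is never clobbered by a later, partially edited copy.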

## Phase 0.5: Speaker Consolidation & Identification (Multi-Speaker Transcripts Only)

**BEFORE EXTRACTING THREADS:** Clean the speaker data. Voice transcripts with auto-generated speaker labels are actively misleading, not just unreliable. This is a data quality problem that must be solved before any analysis.

### Why This Exists

Added 2026-03-18 after a lunch meeting transcript: 10 speaker labels were generated for a 2-person conversation. The same person got different labels across scenes (office, car, restaurant), and different people shared labels. 40+ threads were attributed to the wrong person, turning pain points into pitches and vice versa. The entire inventory had to be re-done.

### The Problem (Quantified)

Typical voice transcription software (Otter, Plaud, phone recording apps) re-assigns speaker labels when:
- **Environment changes** (office to hallway to car to restaurant)
- **Background noise shifts** (quiet room vs. loud restaurant)
- **Volume/distance changes** (close mic vs. across table)
- **Brief pauses or interruptions** (any silence can trigger a new "speaker")

Result: A 2-person lunch meeting generated 10 speaker labels. Speaker 5 was attributed to BOTH participants at different points. The labels are worse than useless: they're actively wrong.

### Process

#### Step 1: Ask the user FIRST (10 seconds, saves 30 minutes)

Before reading a single line of transcript:
- "Who was present?"
- "Any other people who spoke briefly?" (receptionist, waiter, etc.)
- "What was the setting?" (helps predict environment-change label swaps)

#### Step 2: Speaker Label Audit (automated)

Run a quick frequency analysis on the raw transcript:

```
Count lines per speaker label
Sample 2-3 lines from each label
Compare: expected speakers vs. actual labels
```

If `number_of_labels > (expected_speakers * 2)`, the labels are fragmented and CANNOT be trusted for attribution. Flag this immediately.
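
One way to run this audit is sketched below. It assumes each spoken line looks like `Speaker 3: some words`; that format, the `LABEL_RE` pattern, and the function name `audit_speaker_labels` are illustrative assumptions, since transcription tools export labels differently. The sketch takes the first few lines per label as samples rather than a random draw.

```python
import re
from collections import Counter, defaultdict

# Assumed line format: "Speaker 3: some words". Adjust the pattern to your tool's export.
LABEL_RE = re.compile(r"^(Speaker \d+)\s*:\s*(.*)$")


def audit_speaker_labels(transcript: str, expected_speakers: int,
                         samples_per_label: int = 3):
    """Count lines per label, keep a few sample lines, and flag fragmented labels."""
    counts = Counter()
    samples = defaultdict(list)
    for line in transcript.splitlines():
        match = LABEL_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything without a recognized label
        label, text = match.groups()
        counts[label] += 1
        if len(samples[label]) < samples_per_label:
            samples[label].append(text)
    # Same threshold as the rule above: more than twice the expected speakers.
    fragmented = len(counts) > expected_speakers * 2
    return counts, dict(samples), fragmented
```

If `fragmented` comes back true, treat the labels as untrusted and show the per-label samples to the user alongside the answers from Step 1.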

#### Step 3: Build Anchor Lines

From memory, CRM, and context, identify "unmistakable" lines per person. These are l