adaptive-pair-selection
Adaptive pair selection iteratively identifies the most informationally valuable item comparisons, executes them, updates a rating model based on results, and monitors convergence of the ranking. Use this skill when building robust preference rankings where comparison budget is limited and you need confidence intervals around final positions.
git clone --depth 1 https://github.com/yogsoth-ai/de-anthropocentric-research-engine /tmp/adaptive-pair-selection && cp -r /tmp/adaptive-pair-selection/skills/adaptive-pair-selection ~/.claude/skills/adaptive-pair-selection
SKILL.md
# Adaptive Pair Selection

Select the next comparison pair by information gain, execute the comparison, update ratings, and check for convergence. Repeats until the ranking stabilizes or the comparison budget is exhausted.

## Stages

1. **Select**: pair-selector identifies the pair whose comparison would most reduce uncertainty
2. **Compare**: comparison-executor produces a judgment with confidence and reasoning
3. **Update**: rating-update incorporates the new judgment into the rating model
4. **Check**: convergence-check determines if ranking has stabilized

Loop stages 1-4 until convergence or budget exhaustion.

## Available SOPs

| Stage | SOP | Input | Output |
|-------|-----|-------|--------|
| Select | pair-selector | current_ratings, comparison_history | next_pairs[] |
| Compare | comparison-executor | pair, context | judgment |
| Update | rating-update | judgment, current_ratings, method | updated_ratings |
| Check | convergence-check | rating_history | converged, stability_score |

## Execution Guidance

- Start with high-uncertainty pairs (largest sigma or most uncertain boundary)
- For small N: may complete all pairs in first pass, then focus on inconsistencies
- For large N: prioritize pairs near rank boundaries (positions k and k+1)
- Track comparison count against budget; exit gracefully if budget hit
- Pass full rating_history to convergence-check (not just latest snapshot)

## Minimum Yield

- Global ranking + confidence intervals + convergence curve
- Global ranking with confidence intervals for each position
- Convergence curve showing stability score over iterations
- Comparison log with all judgments made
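
The select, compare, update, check loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual SOPs: the function names (`select_pair`, `update`, `stability_score`, `run`), the pair-scoring heuristic, the logistic update with a fixed sigma shrink factor, and the window-based stability score are all hypothetical stand-ins for pair-selector, rating-update, and convergence-check. The `compare` callable stands in for comparison-executor and is assumed to return the winning item.

```python
import itertools
import math

def select_pair(ratings):
    """Hypothetical pair-selector: favor uncertain items whose ratings are close."""
    def score(pair):
        (mu_a, sig_a), (mu_b, sig_b) = ratings[pair[0]], ratings[pair[1]]
        return (sig_a + sig_b) / (1.0 + abs(mu_a - mu_b))
    return max(itertools.combinations(sorted(ratings), 2), key=score)

def update(ratings, winner, loser, k=1.0, shrink=0.9):
    """Hypothetical rating-update: logistic (Elo-like) step scaled by each item's sigma."""
    mu_w, sig_w = ratings[winner]
    mu_l, sig_l = ratings[loser]
    expected = 1.0 / (1.0 + math.exp(mu_l - mu_w))  # modeled P(winner beats loser)
    surprise = 1.0 - expected
    new = dict(ratings)
    new[winner] = (mu_w + k * sig_w * surprise, sig_w * shrink)
    new[loser] = (mu_l - k * sig_l * surprise, sig_l * shrink)
    return new

def ranking(ratings):
    """Items ordered from highest to lowest mean rating."""
    return sorted(ratings, key=lambda item: -ratings[item][0])

def stability_score(rating_history, window=3):
    """Fraction of the previous `window` snapshots whose ranking matches the latest."""
    if len(rating_history) < window + 1:
        return 0.0
    latest = ranking(rating_history[-1])
    recent = rating_history[-window - 1:-1]
    return sum(ranking(snapshot) == latest for snapshot in recent) / window

def run(items, compare, budget=50, window=3):
    """Loop stages 1-4 until the ranking is stable or the budget is spent."""
    ratings = {item: (0.0, 1.0) for item in items}  # (mu, sigma) per item
    rating_history, curve, log = [ratings], [], []
    for _ in range(budget):
        a, b = select_pair(ratings)                   # 1. Select
        winner = compare(a, b)                        # 2. Compare
        loser = b if winner == a else a
        log.append((a, b, winner))
        ratings = update(ratings, winner, loser)      # 3. Update
        rating_history.append(ratings)
        score = stability_score(rating_history, window)  # 4. Check (full history)
        curve.append(score)
        if score == 1.0:
            break
    return ranking(ratings), ratings, curve, log
```

A call such as `run(["a", "b", "c"], compare)` returns the ranking, the final `(mu, sigma)` ratings, the convergence curve, and the comparison log, which map onto the Minimum Yield items above. The sigma values here are only a rough uncertainty proxy; turning them into calibrated confidence intervals would need a proper rating model, which this sketch does not attempt.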
Experiment-specific - summarize the DARE executor's research design into a clean research_result report, forced to write back into the spec file produced by formated-specs.
Experiment-specific - replaces writing-specs, emits DARE's 4-layer call plan as a clean research_graph schema. The last step forces loading of formated-result.
loss-1 judge - read a sample's full dialogue and decide whether the user simulator semantically enacted its Policy Card. check-blind.
loss-2 judge - pairwise quality comparison across the n rungs within one topic; decide monotonicity and endpoint separation. check-blind, D1-D5 only.
Strategy: inference to the best explanation when facing anomalies
Remove components one by one, observe system changes to reveal hidden dependencies and generate ideas from structural gaps.
Map system architecture to ablatable units for ablation studies
Design ablation studies to isolate component contributions in ML systems