baseline-selection
Baseline Selection identifies and establishes appropriate comparison points for experimental validation by selecting from three categories: state-of-the-art methods representing current best performance, simple baselines providing fundamental reference points, and internal baselines derived from existing project implementations. This skill is essential when designing experiments that require meaningful performance benchmarking across multiple difficulty levels and methodological approaches.
git clone --depth 1 https://github.com/yogsoth-ai/de-anthropocentric-research-engine /tmp/baseline-selection && cp -r /tmp/baseline-selection/skills/baseline-selection ~/.claude/skills/baseline-selection
SKILL.md
# SOP: Baseline Selection

Select appropriate baselines that provide meaningful comparison points, covering SOTA, simple, and internal baselines.

Subagent: spawned via the subagent-spawning/spawn-agent skill.
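The coverage requirement above (SOTA, simple, and internal baselines) can be sketched as a small check. This is a minimal illustration, not part of the skill itself: the `Baseline` record, the `BaselineKind` enum, the `missing_kinds` helper, and the example baseline names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class BaselineKind(Enum):
    SOTA = "sota"          # current best-performing method
    SIMPLE = "simple"      # fundamental reference point (e.g. a trivial predictor)
    INTERNAL = "internal"  # existing implementation from the project itself


@dataclass(frozen=True)
class Baseline:
    name: str
    kind: BaselineKind
    rationale: str  # why this baseline is a meaningful comparison point


def missing_kinds(baselines):
    """Return the baseline categories not yet covered, in declaration order."""
    covered = {b.kind for b in baselines}
    return [kind for kind in BaselineKind if kind not in covered]


# Hypothetical selection that still lacks an internal baseline.
selection = [
    Baseline("published-best-model", BaselineKind.SOTA, "strongest reported result"),
    Baseline("majority-class", BaselineKind.SIMPLE, "floor for any learned model"),
]
gaps = missing_kinds(selection)  # [BaselineKind.INTERNAL]
```

A non-empty result signals that the selection does not yet span all three categories and should be extended before experiments are run.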
Experiment-specific - summarizes the DARE executor's research design into a clean research_result report, which must be written back into the spec file produced by formated-specs.
Experiment-specific - replaces writing-specs and emits DARE's 4-layer call plan as a clean research_graph schema. The last step forces loading of formated-result.
loss-1 judge - read a sample's full dialogue and decide whether the user simulator semantically enacted its Policy Card. check-blind.
loss-2 judge - pairwise quality comparison across the n rungs within one topic; decide monotonicity and endpoint separation. check-blind, D1-D5 only.
Strategy: Inference to the best explanation when facing anomalies
Remove components one by one, observe system changes to reveal hidden dependencies and generate ideas from structural gaps.
Map system architecture to ablatable units for ablation studies
Design ablation studies to isolate component contributions in ML systems
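The "remove components one by one" idea behind the ablation entries above can be sketched as a leave-one-out loop. This is an illustrative sketch under stated assumptions: `ablate_one_by_one`, the `toy_evaluate` scorer, and the component names are hypothetical, and a real study would replace the scorer with an actual training or evaluation run.

```python
def ablate_one_by_one(components, evaluate):
    """Score the full system, then re-score with each component removed.

    Returns a dict mapping each component name to the score drop
    (full score minus ablated score) caused by removing it.
    """
    full_score = evaluate(frozenset(components))
    drops = {}
    for name in components:
        reduced = frozenset(c for c in components if c != name)
        drops[name] = full_score - evaluate(reduced)
    return drops


def toy_evaluate(active):
    """Hypothetical scorer in integer points, with one hidden dependency."""
    score = 50
    if "retriever" in active:
        score += 20
    # The reranker only helps when the retriever is also present.
    if "reranker" in active and "retriever" in active:
        score += 10
    return score


drops = ablate_one_by_one(["retriever", "reranker", "cache"], toy_evaluate)
# drops == {"retriever": 30, "reranker": 10, "cache": 0}
```

In this toy setup, removing the retriever costs 30 points even though it contributes only 20 directly, which exposes the reranker's hidden dependency on it; the zero drop for `cache` marks a component that adds nothing to this metric.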