ablation-design
The ablation-design skill provides a structured methodology for isolating individual component contributions in machine learning systems through systematic removal, replacement, or conditional testing. Use it when you need to understand which system parts drive performance, detect interactions between components, or verify that architectural decisions produce measurable improvements beyond their individual effects.
git clone --depth 1 https://github.com/yogsoth-ai/de-anthropocentric-research-engine /tmp/ablation-design && cp -r /tmp/ablation-design/skills/ablation-design ~/.claude/skills/ablation-design
SKILL.md
# Strategy: Ablation Design

**Question**: What does each component contribute?

## Methodology

- **Systematic Ablation** (Newell 1974): Remove one component at a time, measure degradation.
- **Replacement Ablation**: Replace component with simpler alternative to isolate contribution.
- **Combinatorial Ablation** (ABLATOR): Test component subsets to detect interaction effects.
- **Conditional Ablation**: Ablate components under specific data conditions to find context-dependent contributions.

## Execution Flow

1. **ablation-component-mapping** → Map system architecture to ablatable units
2. **baseline-selection** → Select full-system and minimal-system anchors
3. **metric-specification** → Define metrics that capture component contribution
4. **sample-size-estimation** → Determine runs needed for reliable delta estimation
5. **statistical-method-selection** (tactic) → Choose appropriate significance tests for deltas

## Budget Gate

| Ablation Type | Components (N) | Min Runs | When to Use |
|---------------|---------------|----------|-------------|
| Systematic (leave-one-out) | 3-8 | N + 2 | Standard component analysis |
| Replacement | 3-8 | 2N + 2 | Need to distinguish "removal" vs "simplification" |
| Combinatorial (selected) | 4-6 | ~2N | Suspected interactions between components |
| Combinatorial (full) | 3-4 | 2^N | Small systems, need complete picture |
| Conditional | 3-6 | N * conditions | Context-dependent contributions |
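
The Budget Gate formulas can be checked with a few lines of Python before committing compute. This is a minimal sketch, not part of the skill itself: the function name `min_runs`, the type keys, and the `n_conditions` parameter are hypothetical, and the table's approximate "~2N" entry is treated as exactly 2N for illustration.

```python
def min_runs(ablation_type: str, n_components: int, n_conditions: int = 1) -> int:
    """Return the minimum run count given by the Budget Gate table.

    The keys below are hypothetical labels for the table rows.
    """
    n = n_components
    formulas = {
        "systematic": n + 2,                # leave-one-out: N + 2
        "replacement": 2 * n + 2,           # 2N + 2
        "combinatorial_selected": 2 * n,    # table says "~2N"; assumed exact here
        "combinatorial_full": 2 ** n,       # every subset of N components
        "conditional": n * n_conditions,    # N * conditions
    }
    if ablation_type not in formulas:
        raise ValueError(f"unknown ablation type: {ablation_type!r}")
    return formulas[ablation_type]


if __name__ == "__main__":
    # Compare budgets for a hypothetical 4-component system with 3 data conditions.
    for kind in ("systematic", "replacement", "combinatorial_selected",
                 "combinatorial_full", "conditional"):
        print(kind, min_runs(kind, n_components=4, n_conditions=3))
```

Comparing the rows side by side makes the trade-off visible: the full combinatorial design grows exponentially with N, which is why the table restricts it to systems of 3-4 components.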
Experiment-specific - summarizes the DARE executor's research design into a clean research_result report and must write it back into the spec file produced by formated-specs.
Experiment-specific - replaces writing-specs and emits DARE's 4-layer call plan as a clean research_graph schema. The last step forces loading formated-result.
loss-1 judge - read a sample's full dialogue and decide whether the user simulator semantically enacted its Policy Card. check-blind.
loss-2 judge - pairwise quality comparison across the n rungs within one topic; decide monotonicity and endpoint separation. check-blind, D1-D5 only.
Strategy: Inference to the Best Explanation When Facing Anomalies
Remove components one by one, observe system changes to reveal hidden dependencies and generate ideas from structural gaps.
Map system architecture to ablatable units for ablation studies
Remove components one by one from a system, record the response/impact of each removal.
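
One-by-one removal can be sketched as a small loop that scores the full system, then rescores it with each component left out. This is a hedged illustration only: `leave_one_out`, `toy_score`, and the component names are hypothetical, and the toy scorer is purely additive, so it cannot show interaction effects.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple


def leave_one_out(
    components: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], int],
) -> Tuple[int, Dict[str, int]]:
    """Score the full system, then record the drop caused by removing each component."""
    full = frozenset(components)
    baseline = evaluate(full)
    deltas: Dict[str, int] = {}
    for name in sorted(full):
        ablated = full - {name}                      # system with one component removed
        deltas[name] = baseline - evaluate(ablated)  # positive delta = component helped
    return baseline, deltas


def toy_score(active: FrozenSet[str]) -> int:
    """Hypothetical stand-in for a real training-and-evaluation run."""
    weights = {"attention": 5, "augmentation": 2, "dropout": 1}
    return sum(weights[name] for name in active)


if __name__ == "__main__":
    baseline, deltas = leave_one_out(["attention", "augmentation", "dropout"], toy_score)
    print("full-system score:", baseline)
    for name, delta in deltas.items():
        print(f"removing {name} costs {delta}")
```

In a real study `evaluate` would train and score a model, typically over several seeds, so each delta would be a mean with an uncertainty estimate rather than a single number.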