assumption-perturbation
assumption-perturbation systematically identifies load-bearing assumptions in research arguments by extracting implicit premises, replacing each with its strongest plausible alternative, re-deriving conclusions under each negation, and ranking by sensitivity. Use this skill when evaluating the robustness of a theoretical framework, experimental design, or empirical claim to determine which foundational assumptions most critically drive the stated conclusions.
git clone --depth 1 https://github.com/yogsoth-ai/de-anthropocentric-research-engine /tmp/assumption-perturbation && cp -r /tmp/assumption-perturbation/skills/assumption-perturbation ~/.claude/skills/assumption-perturbation
SKILL.md
# Assumption Perturbation

Systematically perturb assumptions to identify which are load-bearing.

## Operations

assumption-extraction → negation-definition → re-derivation → conclusion-sensitivity-measurement

## Available SOPs

**Subagent:** assumption-extraction, negation-definition, re-derivation, conclusion-sensitivity-measurement

**Shared:** assumption-surfacing

**Import:** paper-research

## Execution Guidance

Extract all assumptions (use the shared SOP for initial surfacing), define the strongest plausible alternative for each, re-derive the conclusion under each alternative, and measure the magnitude and direction of the change. Rank by sensitivity.

Key principle: negation is not logical NOT; it is the strongest plausible alternative. "Data is normally distributed" negates to "data follows a heavy-tailed distribution", not "data is not normally distributed."

## Minimum Yield

```
<HARD-GATE>
- Assumptions extracted: >= 5
- Negations defined: >= 5
- Re-derivations completed: >= 4
- Sensitivity rankings produced: >= 1 complete ranking
</HARD-GATE>
```
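The perturb, re-derive, and rank loop described in the execution guidance can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the `Assumption` class, `rank_by_sensitivity`, the toy `derive_effect` model, and all numeric values are hypothetical and chosen only to show how one-at-a-time perturbation yields a sensitivity ranking.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Assumption:
    """Hypothetical record: one extracted assumption and its negation."""
    name: str
    baseline: float     # model input when the assumption holds
    alternative: float  # model input under the strongest plausible alternative


def rank_by_sensitivity(
    assumptions: List[Assumption],
    derive: Callable[[Dict[str, float]], float],
) -> List[Tuple[str, float]]:
    """Perturb one assumption at a time and rank by |change in conclusion|."""
    base_inputs = {a.name: a.baseline for a in assumptions}
    base_conclusion = derive(base_inputs)
    rows = []
    for a in assumptions:
        perturbed = dict(base_inputs)
        perturbed[a.name] = a.alternative                # apply the negation
        delta = derive(perturbed) - base_conclusion      # magnitude and direction
        rows.append((a.name, delta))
    # Largest absolute change first: these are the load-bearing assumptions.
    rows.sort(key=lambda row: abs(row[1]), reverse=True)
    return rows


def derive_effect(inputs: Dict[str, float]) -> float:
    """Toy conclusion (illustrative only): an estimated effect size."""
    raw_effect = 0.5
    return (
        raw_effect
        * inputs["measurement_reliability"]
        * inputs["sample_representativeness"]
        - inputs["confounding_bias"]
    )


assumptions = [
    Assumption("sample_representativeness", baseline=1.0, alternative=0.95),
    Assumption("confounding_bias", baseline=0.0, alternative=0.3),
    Assumption("measurement_reliability", baseline=1.0, alternative=0.8),
]

for name, delta in rank_by_sensitivity(assumptions, derive_effect):
    print(f"{name}: change in conclusion = {delta:+.3f}")
```

In this toy model the confounding assumption moves the conclusion most, so it ranks first. A real analysis would replace `derive_effect` with the actual re-derivation of the argument, which is often qualitative rather than numeric.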
Experiment-specific: summarizes the DARE executor's research design into a clean research_result report, which must be written back into the spec file produced by formated-specs.
Experiment-specific: replaces writing-specs and emits DARE's 4-layer call plan as a clean research_graph schema. The last step forces loading formated-result.
loss-1 judge - read a sample's full dialogue and decide whether the user simulator semantically enacted its Policy Card. check-blind.
loss-2 judge - pairwise quality comparison across the n rungs within one topic; decide monotonicity and endpoint separation. check-blind, D1-D5 only.
Strategy: inference to the best explanation in the face of anomalies
Remove components one by one, observe system changes to reveal hidden dependencies and generate ideas from structural gaps.
Map system architecture to ablatable units for ablation studies
Design ablation studies to isolate component contributions in ML systems
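The ablation-oriented skills listed above share one core move: remove components one at a time and compare against the full system. A minimal leave-one-out sketch follows, assuming a scoring function is available; `leave_one_out`, `toy_score`, and the component names are hypothetical and stand in for whatever system is being studied.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def leave_one_out(
    components: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Return the score drop caused by removing each component in turn."""
    full = frozenset(components)
    baseline = evaluate(full)
    return {c: baseline - evaluate(full - {c}) for c in full}


def toy_score(active: FrozenSet[str]) -> float:
    """Illustrative scorer with a hidden dependency between two components."""
    score = 50
    if "attention" in active:
        score += 20
    if "normalizer" in active:
        score += 5
    # Hidden dependency: augmentation only helps when the normalizer is present.
    if "augmentation" in active and "normalizer" in active:
        score += 10
    return score


drops = leave_one_out(["attention", "normalizer", "augmentation"], toy_score)
for component, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"removing {component} lowers the score by {drop}")
```

Here removing the normalizer costs more than its own direct contribution, because the augmentation gain disappears with it. That gap between direct and observed contribution is the kind of hidden dependency an ablation study is meant to surface.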