Skill, 146 repo stars, updated yesterday

design-study

Install in Claude Code
git clone --depth 1 https://github.com/Aperivue/medsci-skills /tmp/design-study && cp -r /tmp/design-study/skills/design-study ~/.claude/skills/design-study
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Design-Study Skill

## Purpose

This skill pressure-tests whether a study is answerable, interpretable, and defensible before large amounts of drafting or analysis work accumulate.

Use it when:
- a study question is known but the analysis plan is still fluid
- the user wants a methods sanity check
- a manuscript feels vulnerable to reviewer criticism
- a peer review requires explicit methodological diagnosis

---

## Communication Rules

- Communicate with the user in their preferred language.
- Use English for statistical, radiologic, and reporting-guideline terminology.
- Be direct about validity risks, but always propose the smallest feasible fix first.

---

## Core Review Questions

Always inspect these dimensions:

1. What is the exact research question?
2. What is the analysis unit: patient, lesion, exam, study, phase, report?
3. What is the index date or decision point?
4. How are inclusion and exclusion criteria applied?
5. Is there any information leakage?
6. What is the reference standard or endpoint definition?
7. What comparator is clinically meaningful?
8. What validation strategy is used?
9. What uncertainty reporting is required?
10. Which reporting guideline best fits?
11. Are exposure/outcome/covariate **definitions literature-grounded**, or invented ad-hoc from the data dictionary? If ad-hoc, defer to `/define-variables` before drafting Methods.

---

## Standard Output

```text
## Study Design Review
Question: ...
Study type: ...
Analysis unit: ...
Index date / prediction timepoint: ...

### Strengths
- ...

### Major validity risks
1. ...
2. ...

### Minimal fixes
- ...

### Reporting fit
- Recommended guideline: ...

### Decision
- Ready for analysis / Needs redesign / Drafting can proceed with limitations
```

---

## Workflow

### Phase 1: Reconstruct the study

Extract from protocol, draft, slides, tables, or notes:
- clinical problem
- intended use case
- population
- inputs
- outputs
- outcome definition
- timing of variable availability

**Gate:** Present the reconstructed study summary (question, analysis unit, intended use)
to the user. Confirm before proceeding — if the reconstruction is wrong, the entire
validity review will be misdirected.

### Phase 2: Check structural validity

#### A. Analysis unit

Look for mismatches such as:
- patient-level claim from lesion-level analysis
- exam-level split with patient overlap
- phase-level samples treated as independent
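
The exam-level split with patient overlap can be checked mechanically. The following is a minimal sketch, assuming exam records are dicts with a hypothetical `patient_id` key; the function names are illustrative and not part of this skill.

```python
import random
from collections import defaultdict


def split_by_patient(exams, test_fraction=0.2, seed=0):
    """Split exam records into train/test so that no patient appears in both.

    `exams` is a list of dicts with a hypothetical "patient_id" key.
    """
    by_patient = defaultdict(list)
    for exam in exams:
        by_patient[exam["patient_id"]].append(exam)

    # Shuffle patients (not exams) so the split unit matches the claim unit.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_test = max(1, round(len(patient_ids) * test_fraction))
    train = [e for pid in patient_ids[n_test:] for e in by_patient[pid]]
    test = [e for pid in patient_ids[:n_test] for e in by_patient[pid]]
    return train, test


def overlapping_patients(train, test):
    """Return patient IDs present in both partitions (should be empty)."""
    return {e["patient_id"] for e in train} & {e["patient_id"] for e in test}
```

Running `overlapping_patients` on an existing split is a quick audit: a non-empty result means exam-level performance estimates are likely optimistic, because the same patient contributes to both training and testing.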

#### B. Leakage

Look for:
- postoperative features used for preoperative prediction
- normalization or thresholding performed before data split
- repeated exams across train/test
- reader annotations derived from outcome information
- **input-text contamination for NLP/LLM extraction tasks**: if the model input includes report
  sections such as clinical history, indication, impression, prior diagnosis, or referral text, confirm
  that those fields do not literally name or strongly imply the target label. If the target is already
  present in the supplied text, the task is information retrieval under label leakage, not phenotype
  inference; redesign the input mask, report a sensitivity analysis excluding leaky fields, or reframe the
  claim.
- **construct dependence** (a predictor that is a definitional component of the outcome). Two cases:
  (i) *mathematical definition* — an input that computes the outcome (when the outcome is HOMA-IR =
  f(fasting insulin, fasting glucose), those two inputs are not independent predictors); (ii)
  *near-tautological composite* — a ratio or score built from the outcome's defining components, which
  shows an inflated, near-circular association. Test: "could this predictor be derived, in whole or
  part, from the outcome's definition or the same measurement?" If yes, exclude it, or retain it only
  as a labeled calibration probe rather than a reported discovery.
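
The input-text contamination check above can be approximated with a simple term scan. This is a rough sketch, assuming each report is a dict of hypothetical section names mapped to free text and that the reviewer supplies the list of label terms. Literal matching only catches fields that name the label outright; it will miss paraphrases and does not handle negation, so treat it as a first-pass screen.

```python
import re


def find_leaky_fields(report, label_terms):
    """Return {field_name: [matched terms]} for fields that name the target label.

    `report` maps hypothetical section names (e.g. "indication") to free text;
    `label_terms` lists strings that name or strongly imply the target label.
    """
    leaks = {}
    for field, text in report.items():
        hits = [
            term
            for term in label_terms
            # Whole-word, case-insensitive match on the literal term.
            if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE)
        ]
        if hits:
            leaks[field] = hits
    return leaks


def mask_fields(report, leaky_fields):
    """Drop the leaky sections to build the input for a sensitivity analysis."""
    return {k: v for k, v in report.items() if k not in leaky_fields}
```

Comparing model performance on the full report against the output of `mask_fields` gives the sensitivity analysis excluding leaky fields that the bullet asks for.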

#### F. Time origin & survivorship (incident / transition models)

For any time-to-event or incident/transition design, check before drafting:
- **Time origin per model.** Confirm that each incident model starts its at-risk clock at the correct origin. Watch for **immortal-time bias** (a span in which the event cannot occur, misattributed to one group) and **left-truncation / delayed entry** (subjects entering the risk set after the origin).
- **Mediator-ascertainment-window survivorship.** A "progressor" / transition label that is conditional on *surviving to* a later ascertainment (a second scan, a follow-up visit) is survivorship-biased; plan a landmark time or an explicit intermediate-state (multistate / illness-death) model.
- **Primary-analysis-set selection.** If the primary analysis set will not be the full cohort (e.g., complete-case while a large fraction is missing), pre-specify the selection justification and a missing-at-random (MAR) rationale; do not let the complete-case model become primary because it is the significant one (an outcome-dependent choice).
- A design that cannot yet answer these questions should say so explicitly. Note, however, that at review time a Methods/Limitations admission that the issue was *"not formally assessed"* is escalated to a MAJOR by the survival probe (S1); it is not waved through as a limitation.
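
The landmark approach mentioned above can be sketched as follows. This is an illustrative sketch only, assuming subjects are dicts with hypothetical keys (`id`, `followup_days`, `event`, `exposure_day`) measured from a common original time origin. It builds the analysis set but does not fit a survival model.

```python
def landmark_cohort(subjects, landmark_days):
    """Build a landmark analysis set.

    Each subject is a dict with hypothetical keys:
      id             - subject identifier
      followup_days  - time from the original origin to event or censoring
      event          - True if the event occurred at followup_days
      exposure_day   - day the exposure/transition was ascertained, or None
    """
    cohort = []
    for s in subjects:
        # Subjects with an event or censoring by the landmark are not at risk.
        if s["followup_days"] <= landmark_days:
            continue
        exposed = s["exposure_day"] is not None and s["exposure_day"] <= landmark_days
        cohort.append(
            {
                "id": s["id"],
                # Exposure is fixed at the landmark; later transitions do not count.
                "exposed": exposed,
                # The at-risk clock restarts at the landmark.
                "time_from_landmark": s["followup_days"] - landmark_days,
                "event": s["event"],
            }
        )
    return cohort
```

Fixing exposure status at the landmark means no subject is credited with exposed person-time they had to survive in order to accrue, which is the mechanism behind both immortal-time bias and ascertainment-window survivorship. The cost is that transitions after the landmark are classified as unexposed, so the landmark time itself should be pre-specified and justified.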

#### C. Reference standard

Check:
- who established ground truth
- when it was established
- whether blinding was possible
- whether only a subset had gold standard verification
- **Construct ↔ nominal-definition match.** Does the exposure/finding *construct* stay inside its stated definition, or does it quietly exceed it? An "incidentaloma" defined as an *indeterminate* finding must not include frank malignancy reads; a label that overshoots its definition inflates the apparent cohort and distorts the agreement statistic (κ). For each construct, restate the nominal definition and confirm every included case satisfies it.
- **Per-flag reference-standard concordance.** When the index finding is flagged against a reference standard, report the concordance *per flag category* (not just overall). A construct where a large fraction of flags do not mat