statistical-analysis
git clone --depth 1 https://github.com/jaechang-hits/SciAgent-Skills /tmp/statistical-analysis && cp -r /tmp/statistical-analysis/skills/biostatistics/statistical-analysis ~/.claude/skills/statistical-analysis

SKILL.md
# Statistical Analysis

## Overview

Statistical analysis is the systematic process of selecting appropriate tests, verifying assumptions, quantifying effect magnitudes, and reporting results. This knowhow guides test selection, assumption diagnostics, and APA-style reporting for frequentist and Bayesian analyses in academic research.

## Key Concepts

### Frequentist vs Bayesian Framework

| Aspect | Frequentist | Bayesian |
|--------|-------------|----------|
| Core output | p-value, confidence interval | Posterior distribution, credible interval |
| Interpretation | "How likely is this data if H0 is true?" | "How likely is H1 given the data?" |
| Null support | Cannot support H0 (only fail to reject) | Can quantify evidence for H0 via Bayes Factor |
| Prior info | Not used | Incorporated via prior distributions |
| Sample size | Requires adequate power | Works with any sample size |
| Best for | Standard analyses, large samples | Small samples, prior info, complex models |

### Statistical vs Practical Significance

A statistically significant result (p < .05) may be trivially small in practice. Always report:

- **Effect size**: Magnitude of the effect (Cohen's d, eta-squared, r, R-squared)
- **Confidence interval**: Precision of the estimate
- **Context**: Clinical/practical relevance in the domain

### Common Effect Sizes

| Test | Effect Size | Small | Medium | Large |
|------|-------------|-------|--------|-------|
| t-test | Cohen's d | 0.20 | 0.50 | 0.80 |
| t-test (small n) | Hedges' g | 0.20 | 0.50 | 0.80 |
| ANOVA | eta-squared partial | 0.01 | 0.06 | 0.14 |
| ANOVA | omega-squared | 0.01 | 0.06 | 0.14 |
| Correlation | r | 0.10 | 0.30 | 0.50 |
| Regression | R-squared | 0.02 | 0.13 | 0.26 |
| Regression | f-squared | 0.02 | 0.15 | 0.35 |
| Chi-square | Cramer's V | 0.07 | 0.21 | 0.35 |
| Chi-square 2x2 | phi coefficient | 0.10 | 0.30 | 0.50 |

Cohen's benchmarks are guidelines, not rigid thresholds -- domain context always matters.

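The t-test rows of the effect-size table can be illustrated with a short calculation. The following is a minimal sketch, assuming two independent samples and the pooled-standard-deviation form of Cohen's d, with Hedges' g obtained through the common approximate small-sample correction. The function names (`cohens_d`, `hedges_g`, `label_d`) and the "negligible" label for values below 0.20 are illustrative assumptions, not part of this skill.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_diff / math.sqrt(pooled_var)


def hedges_g(group_a, group_b):
    """Hedges' g: Cohen's d times an approximate small-sample correction."""
    df = len(group_a) + len(group_b) - 2
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d(group_a, group_b) * correction


def label_d(d):
    """Map |d| onto Cohen's benchmarks (0.20 / 0.50 / 0.80).

    The 'negligible' label below 0.20 is an assumption added for illustration.
    """
    size = abs(d)
    if size < 0.20:
        return "negligible"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


# Toy data, invented purely for illustration.
treatment = [2, 4, 6]
control = [1, 3, 5]
d = cohens_d(treatment, control)
g = hedges_g(treatment, control)
print(f"d = {d:.2f} ({label_d(d)}), g = {g:.2f}")
```

With samples this small the correction is noticeable, which is why the table lists Hedges' g separately for small n. The labels are only a reading aid; as noted above, domain context decides whether an effect matters.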
### Assumptions Overview

Most parametric tests require:

1. **Independence**: Observations are independent of each other
2. **Normality**: Data (or residuals) are approximately normally distributed
3. **Homogeneity of variance**: Groups have similar variances (for group comparisons)
4. **Linearity**: Relationship between variables is linear (for regression)

When assumptions are violated:

- **Normality violated, n > 30**: Proceed -- parametric tests are robust with large samples
- **Normality violated, n < 30**: Use non-parametric alternative
- **Variance heterogeneity**: Use Welch's correction (t-test) or Welch's ANOVA
- **Linearity violated**: Add polynomial terms, transform variables, or use GAMs

### Test-Specific Assumption Workflows

**T-test assumptions**: (1) Check normality per group with Shapiro-Wilk + Q-Q plots. (2) Check homogeneity with Levene's test. (3) If normality violated: Mann-Whitney U (independent) or Wilcoxon signed-rank (paired). If variance heterogeneity: use Welch's t-test.

**ANOVA assumptions**: (1) Normality per group. (2) Homogeneity via Levene's test. (3) For repeated measures: check sphericity (Mauchly's test); if violated, apply Greenhouse-Geisser (epsilon < 0.75) or Huynh-Feldt (epsilon > 0.75) correction. (4) If normality violated: Kruskal-Wallis (independent) or Friedman (repeated).

**Linear regression assumptions**: (1) Linearity via residuals-vs-fitted plot. (2) Independence via Durbin-Watson test (1.5-2.5 acceptable). (3) Homoscedasticity via Breusch-Pagan test + scale-location plot. (4) Normality of residuals via Q-Q plot + Shapiro-Wilk. (5) Multicollinearity via VIF (>10 = severe, >5 = moderate).

**Logistic regression assumptions**: (1) Independence. (2) Linearity of log-odds with continuous predictors (Box-Tidwell test). (3) No perfect multicollinearity (VIF). (4) Adequate sample size (10-20 events per predictor minimum).

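Two of the linear regression checks above reduce to simple numeric rules. The following is a minimal sketch, assuming the residuals come from an already fitted model and are listed in observation order, and that VIF values have been computed elsewhere. The function names and the "acceptable" label for VIF values of 5 or below are illustrative assumptions; the 1.5-2.5 band and the 5/10 cut-offs are the ones quoted in the workflow.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in observation order.

    Sum of squared successive differences divided by the sum of squared
    residuals. Values near 2 suggest no first-order autocorrelation.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


def independence_ok(dw, low=1.5, high=2.5):
    """Apply the 1.5-2.5 rule of thumb from the workflow above."""
    return low <= dw <= high


def classify_vif(vif):
    """Classify a variance inflation factor: >10 severe, >5 moderate.

    The 'acceptable' label is an assumption added for illustration.
    """
    if vif > 10:
        return "severe"
    if vif > 5:
        return "moderate"
    return "acceptable"


# Toy residuals, invented purely for illustration.
residuals = [1.0, -1.0, -1.0, 1.0]
dw = durbin_watson(residuals)
print(f"Durbin-Watson = {dw:.2f}, independence ok: {independence_ok(dw)}")
print("VIF 7.2 ->", classify_vif(7.2))
```

These helpers only encode the thresholds. The plots named in the workflow (residuals-vs-fitted, scale-location, Q-Q) still need to be inspected alongside them.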
### Specialized Test Categories

Beyond the main decision flowchart, several specialized test families address specific data types:

**Survival / time-to-event analysis**:

- **Log-rank test**: Compares survival curves between groups (non-parametric)
- **Cox proportional hazards**: Models time-to-event with covariates; assumes proportional hazards
- **Parametric survival models**: Weibull, exponential, log-normal for known distributional forms
- Use when outcome is time until an event (death, relapse, failure) with possible censoring

**Count outcome models**:

- **Poisson regression**: For count data where mean approximately equals variance
- **Negative binomial regression**: For overdispersed counts (variance > mean)
- **Zero-inflated models**: For excess zeros beyond what Poisson/NB predicts
- Use when outcome is a count (number of events, incidents, occurrences)

**Agreement and reliability**:

- **Cohen's kappa**: Inter-rater agreement for categorical ratings (2 raters)
- **Fleiss' kappa / Krippendorff's alpha**: Agreement for >2 raters
- **Intraclass correlation coefficient (ICC)**: Continuous ratings reliability
- **Cronbach's alpha**: Internal consistency of multi-item scales
- **Bland-Altman analysis**: Agreement between two measurement methods (continuous)
- Use when assessing measurement reliability or inter-rater consistency

**Categorical data extensions**:

- **McNemar's test**: Paired binary outcomes (2x2)
- **Cochran's Q test**: Paired binary outcomes (3+ conditions)
- **Cochran-Armitage trend test**: Ordered categories in contingency tables

## Decision Framework

### Test Selection Flowchart

```
What is your research question?
|
+-- Comparing GROUPS on a continuous outcome?
|   |
|   +-- How many groups?
|   |   +-- 2 groups
|   |   |   +-- Independent -> Independent t-test (or Mann-Whitney U)
|   |   |   +-- Paired/repeated -> Paired t-test (or Wilcoxon signed-rank)
|   |   +-- 3+ groups
|   |       +-- Independent -> One-way ANOVA (or Kruskal-Wallis)
|   |       +-- Repeated -> Repeated-measures ANOVA (or Friedman)
|   |
|   +-- Multiple factors? -> Factorial ANOVA / Mi
|
```
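
The single-factor group-comparison branches of the flowchart can be written as a small lookup. The following is a minimal sketch that covers only the four leaves shown above (2 vs 3+ groups, independent vs paired/repeated); the names `GROUP_TESTS` and `select_group_test` and the `parametric` flag are hypothetical and not part of this skill.

```python
# (group bucket, design) -> (parametric test, non-parametric alternative)
GROUP_TESTS = {
    ("2", "independent"): ("Independent t-test", "Mann-Whitney U"),
    ("2", "repeated"): ("Paired t-test", "Wilcoxon signed-rank"),
    ("3+", "independent"): ("One-way ANOVA", "Kruskal-Wallis"),
    ("3+", "repeated"): ("Repeated-measures ANOVA", "Friedman"),
}


def select_group_test(n_groups, paired, parametric=True):
    """Pick a test for comparing groups on one continuous outcome.

    n_groups:   number of groups or conditions being compared (>= 2)
    paired:     True for paired / repeated measurements
    parametric: False selects the non-parametric alternative, for example
                after a failed normality check with a small sample
    """
    if n_groups < 2:
        raise ValueError("group comparison needs at least two groups")
    bucket = "2" if n_groups == 2 else "3+"
    design = "repeated" if paired else "independent"
    parametric_test, alternative = GROUP_TESTS[(bucket, design)]
    return parametric_test if parametric else alternative


print(select_group_test(2, paired=False))
print(select_group_test(4, paired=True, parametric=False))
```

Keeping the mapping in a dictionary makes the pairing between each parametric test and its non-parametric alternative explicit, so extending the sketch to further branches is a matter of adding entries.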