Skill869 estrellas del repoactualizado 1mo ago

data-stats-analysis

The data-stats-analysis skill enables statistical hypothesis testing including t-tests, ANOVA, correlation analysis, and multiple testing corrections using scipy and statsmodels libraries that execute locally. Use this skill when comparing group means, testing variable correlations, performing hypothesis testing with p-value calculations, applying correction methods like FDR or Bonferroni, or conducting non-parametric tests across any LLM provider.

Ver fuente Repositorio: ScienceClaw

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/beita6969/ScienceClaw /tmp/data-stats-analysis && cp -r /tmp/data-stats-analysis/skills/data-stats-analysis ~/.claude/skills/data-stats-analysis

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# Statistical Analysis (Universal)

## Overview
This skill enables you to perform rigorous statistical analyses including t-tests, ANOVA, correlation analysis, hypothesis testing, and multiple testing corrections. Unlike cloud-hosted solutions, this skill uses standard Python statistical libraries (**scipy**, **statsmodels**, **numpy**) and executes **locally** in your environment, making it compatible with **ALL LLM providers** including GPT, Gemini, Claude, DeepSeek, and Qwen.

## When to Use This Skill
- Compare means between groups (t-tests, ANOVA)
- Test for correlations between variables
- Perform hypothesis testing with p-value calculation
- Apply multiple testing corrections (FDR, Bonferroni)
- Calculate statistical summaries and confidence intervals
- Test for normality and distribution fitting
- Perform non-parametric tests (Mann-Whitney, Kruskal-Wallis)

## How to Use

### Step 1: Import Required Libraries
```python
import numpy as np
import pandas as pd
from scipy import stats
from scipy.stats import ttest_ind, mannwhitneyu, pearsonr, spearmanr
from scipy.stats import f_oneway, kruskal, chi2_contingency
from statsmodels.stats.multitest import multipletests
from statsmodels.stats.proportion import proportions_ztest
import warnings
warnings.filterwarnings('ignore')
```

### Step 2: Two-Sample t-Test
```python
# Compare means between two groups
# group1, group2: arrays of numeric values

# Perform independent t-test
t_statistic, p_value = ttest_ind(group1, group2)

print(f"t-statistic: {t_statistic:.4f}")
print(f"p-value: {p_value:.4e}")

if p_value < 0.05:
    print("✅ Significant difference between groups (p < 0.05)")
else:
    print("❌ No significant difference (p >= 0.05)")

# With equal variance assumption check
# Levene's test for equal variances
_, levene_p = stats.levene(group1, group2)
if levene_p < 0.05:
    # Use Welch's t-test (unequal variances)
    t_stat, p_val = ttest_ind(group1, group2, equal_var=False)
    print(f"Welch's t-test p-value: {p_val:.4e}")
else:
    print("Equal variances assumed")
```

### Step 3: One-Way ANOVA
```python
# Compare means across multiple groups
# groups: list of arrays, e.g., [group1, group2, group3]

# Perform one-way ANOVA
f_statistic, p_value = f_oneway(*groups)

print(f"F-statistic: {f_statistic:.4f}")
print(f"p-value: {p_value:.4e}")

if p_value < 0.05:
    print("✅ Significant difference between groups (p < 0.05)")
    print("Note: Use post-hoc tests to identify which groups differ")
else:
    print("❌ No significant difference between groups")

# Post-hoc pairwise t-tests with Bonferroni correction
from itertools import combinations

group_names = ['Group A', 'Group B', 'Group C']
pairwise_results = []

for (name1, data1), (name2, data2) in combinations(zip(group_names, groups), 2):
    _, p = ttest_ind(data1, data2)
    pairwise_results.append({
        'comparison': f'{name1} vs {name2}',
        'p_value': p
    })

# Apply Bonferroni correction
pairwise_df = pd.DataFrame(pairwise_results)
n_tests = len(pairwise_df)
pairwise_df['p_adjusted'] = pairwise_df['p_value'] * n_tests
pairwise_df['p_adjusted'] = pairwise_df['p_adjusted'].clip(upper=1.0)

print("\nPairwise Comparisons (Bonferroni-corrected):")
print(pairwise_df)
```

### Step 4: Correlation Analysis
```python
# Pearson correlation (linear relationships)
r_pearson, p_pearson = pearsonr(variable1, variable2)

print(f"Pearson correlation: r = {r_pearson:.4f}, p = {p_pearson:.4e}")

# Spearman correlation (monotonic relationships, robust to outliers)
r_spearman, p_spearman = spearmanr(variable1, variable2)

print(f"Spearman correlation: ρ = {r_spearman:.4f}, p = {p_spearman:.4e}")

# Interpretation
if abs(r_pearson) < 0.3:
    strength = "weak"
elif abs(r_pearson) < 0.7:
    strength = "moderate"
else:
    strength = "strong"

direction = "positive" if r_pearson > 0 else "negative"
print(f"Interpretation: {strength} {direction} correlation")

if p_pearson < 0.05:
    print("✅ Statistically significant (p < 0.05)")
else:
    print("❌ Not statistically significant")
```

### Step 5: Multiple Testing Correction
```python
# Scenario: Testing 1000 genes for differential expression
# p_values: array of p-values from individual tests

# Method 1: Benjamini-Hochberg FDR correction (recommended)
reject_fdr, p_adjusted_fdr, _, _ = multipletests(p_values, alpha=0.05, method='fdr_bh')

# Method 2: Bonferroni correction (more conservative)
reject_bonf, p_adjusted_bonf, _, _ = multipletests(p_values, alpha=0.05, method='bonferroni')

# Create results DataFrame
results_df = pd.DataFrame({
    'gene': gene_names,
    'p_value': p_values,
    'q_value_fdr': p_adjusted_fdr,
    'p_adjusted_bonferroni': p_adjusted_bonf,
    'significant_fdr': reject_fdr,
    'significant_bonf': reject_bonf
})

# Summary
print(f"Original significant (p < 0.05): {(p_values < 0.05).sum()}")
print(f"Significant after FDR correction: {reject_fdr.sum()}")
print(f"Significant after Bonferroni correction: {reject_bonf.sum()}")

# Save results
results_df.to_csv('statistical_results.csv', index=False)
print("✅ Results saved to: statistical_results.csv")
```

### Step 6: Non-Parametric Tests
```python
# Use when data is not normally distributed

# Mann-Whitney U test (alternative to t-test)
u_statistic, p_value_mw = mannwhitneyu(group1, group2, alternative='two-sided')

print(f"Mann-Whitney U test:")
print(f"U-statistic: {u_statistic:.4f}")
print(f"p-value: {p_value_mw:.4e}")

# Kruskal-Wallis H test (alternative to ANOVA)
h_statistic, p_value_kw = kruskal(*groups)

print(f"\nKruskal-Wallis H test:")
print(f"H-statistic: {h_statistic:.4f}")
print(f"p-value: {p_value_kw:.4e}")
```

## Advanced Features

### Normality Testing
```python
from scipy.stats import shapiro, normaltest, kstest

# Test if data follows normal distribution

# Shapiro-Wilk test (best for n < 5000)
stat_sw, p_sw = shapiro(data)
print(f"Shapiro-Wilk test: W={stat_sw:.4f}, p={p_sw:.4e}")

# D'Agostino-Pearson test
stat_dp, p_dp = normaltest(data)
p

Del mismo repositorio

acp-routerSkill

Route plain-language requests for Pi, Claude Code, Codex, OpenCode, Gemini CLI, or ACP harness work into either OpenClaw ACP runtime sessions or direct acpx-driven sessions ("telephone game" flow). For coding-agent thread requests, read this skill first, then use only `sessions_spawn` for thread creation.

diffsSkill

Use the diffs tool to produce real, shareable diffs (viewer URL, file artifact, or both) instead of manual edit summaries.

OpenProse VM skill pack. Activate on any `prose` command, .prose files, or OpenProse mentions; orchestrates multi-agent workflows.