Skill · 746 repository stars · updated today

bernstein-quality

Bernstein Quality Metrics analyzes and compares code generation reliability across AI models by executing quality assessment scripts and displaying success rates, pass rates for linting and testing, and completion time distributions. Use this skill when users ask about agent reliability, model performance comparisons, test failure analysis, or want to see quality metrics dashboards for decision-making on model routing and optimization.

Repository: bernstein

Install in Claude Code

git clone --depth 1 https://github.com/sipyourdrink-ltd/bernstein /tmp/bernstein-quality && cp -r /tmp/bernstein-quality/packages/cursor-plugin/skills/bernstein-quality ~/.claude/skills/bernstein-quality

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Bernstein Quality Metrics

Analyze quality and reliability of agent-generated code.

## When to Use

- User asks "how reliable are the agents?" or "which model is best?"
- User wants success rates, pass rates, or completion time stats
- User asks about test failures or lint issues across models
- User says "show me quality metrics"

## Instructions

1. Run `scripts/quality.sh metrics` for overall quality metrics.
2. Run `scripts/quality.sh pass-rates` for lint/typecheck/test pass rates by model.
3. Run `scripts/quality.sh times` for completion time distributions.

4. Present a quality dashboard:

```
## Quality Dashboard

### Success Rate by Model
| Model | Tasks | Success | Fail | Rate |
|-------|-------|---------|------|------|
| claude-sonnet-4 | 24 | 22 | 2 | 91.7% |
| gpt-4.1 | 12 | 10 | 2 | 83.3% |

### Pass Rates
| Check | Overall | claude-sonnet-4 | gpt-4.1 |
|-------|---------|-----------------|---------|
| Lint | 96% | 98% | 92% |
| Type-check | 88% | 91% | 83% |
| Tests | 85% | 89% | 75% |

### Completion Times
| Percentile | Time |
|------------|------|
| p50 | 3m 20s |
| p90 | 8m 45s |
| p99 | 15m 12s |
```

5. Highlight any models with significantly lower pass rates.
6. Recommend model routing adjustments if one model consistently underperforms.
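Steps 4 to 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the output format of `scripts/quality.sh` is not documented here, so the sketch starts from per-model success and fail counts that have already been parsed. The names `ModelStats`, `format_row`, and `flag_underperformers` are hypothetical, and so is the 5-point `margin` used to decide what "significantly lower" means.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelStats:
    """Per-model task counts (hypothetical structure, not the script's real output)."""
    model: str
    success: int
    fail: int

    @property
    def tasks(self) -> int:
        return self.success + self.fail

    @property
    def rate(self) -> float:
        # Success rate as a percentage; 0.0 when the model ran no tasks.
        return 100.0 * self.success / self.tasks if self.tasks else 0.0


def format_row(s: ModelStats) -> str:
    """Render one row of the 'Success Rate by Model' table."""
    return f"| {s.model} | {s.tasks} | {s.success} | {s.fail} | {s.rate:.1f}% |"


def flag_underperformers(stats: List[ModelStats], margin: float = 5.0) -> List[str]:
    """Return models whose success rate trails the best model by more than `margin` points.

    The margin is an assumed threshold; the skill itself does not define one.
    """
    if not stats:
        return []
    best = max(s.rate for s in stats)
    return [s.model for s in stats if best - s.rate > margin]


if __name__ == "__main__":
    # Counts taken from the sample dashboard above.
    sample = [
        ModelStats("claude-sonnet-4", success=22, fail=2),
        ModelStats("gpt-4.1", success=10, fail=2),
    ]
    print("| Model | Tasks | Success | Fail | Rate |")
    print("|-------|-------|---------|------|------|")
    for s in sample:
        print(format_row(s))
    print("Candidates for routing review:", flag_underperformers(sample))
```

Comparing against the best model rather than a fixed absolute rate keeps the check meaningful when all models do well or all do poorly; with few tasks per model, a wider margin may be appropriate before recommending a routing change.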

From the same repository

orchestrator (Subagent)

Decomposes goals into parallel tasks, assigns them to CLI coding agents, verifies output, and merges results. Use when a task is too large for a single agent.

run (Slash Command)

Start a Bernstein orchestration run with a goal

status (Slash Command)

Show current Bernstein orchestration status

stop (Slash Command)

Gracefully stop a running Bernstein orchestration

bernstein-agents (Skill)

bernstein-alerts (Skill)

bernstein-approve (Skill)

bernstein-cost (Skill)