Skill · 570 repo stars · updated today
bernstein-quality
Bernstein Quality Metrics analyzes and compares code generation reliability across AI models by executing quality assessment scripts and displaying success rates, pass rates for linting and testing, and completion time distributions. Use this skill when users ask about agent reliability, model performance comparisons, test failure analysis, or want to see quality metrics dashboards for decision-making on model routing and optimization.
Install in Claude Code
git clone --depth 1 https://github.com/sipyourdrink-ltd/bernstein /tmp/bernstein-quality && cp -r /tmp/bernstein-quality/packages/cursor-plugin/skills/bernstein-quality ~/.claude/skills/bernstein-quality

Then start a new Claude Code session; the skill loads automatically.
Definition
SKILL.md
# Bernstein Quality Metrics

Analyze quality and reliability of agent-generated code.

## When to Use

- User asks "how reliable are the agents?" or "which model is best?"
- User wants success rates, pass rates, or completion time stats
- User asks about test failures or lint issues across models
- User says "show me quality metrics"

## Instructions

1. Run `scripts/quality.sh metrics` for overall quality metrics.
2. Run `scripts/quality.sh pass-rates` for lint/typecheck/test pass rates by model.
3. Run `scripts/quality.sh times` for completion time distributions.
4. Present a quality dashboard:

```
## Quality Dashboard

### Success Rate by Model
| Model | Tasks | Success | Fail | Rate |
|-------|-------|---------|------|------|
| claude-sonnet-4 | 24 | 22 | 2 | 91.7% |
| gpt-4.1 | 12 | 10 | 2 | 83.3% |

### Pass Rates
| Check | Overall | claude-sonnet-4 | gpt-4.1 |
|-------|---------|-----------------|---------|
| Lint | 96% | 98% | 92% |
| Type-check | 88% | 91% | 83% |
| Tests | 85% | 89% | 75% |

### Completion Times
| Percentile | Time |
|------------|------|
| p50 | 3m 20s |
| p90 | 8m 45s |
| p99 | 15m 12s |
```

5. Highlight any models with significantly lower pass rates.
6. Recommend model routing adjustments if one model consistently underperforms.
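The dashboard and the "highlight underperformers" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ModelStats` record, the `render_success_table` and `underperformers` helpers, and the 5-point `margin` threshold are all hypothetical, and the output format of `scripts/quality.sh` is not documented here, so the counts are typed in by hand from the sample dashboard.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Hypothetical per-model tally; not the real output format of quality.sh."""
    model: str
    success: int
    fail: int

    @property
    def tasks(self) -> int:
        return self.success + self.fail

    @property
    def rate(self) -> float:
        # Success rate as a percentage; 0.0 when no tasks were recorded.
        return 100.0 * self.success / self.tasks if self.tasks else 0.0


def render_success_table(stats: list[ModelStats]) -> str:
    """Render the 'Success Rate by Model' table as Markdown."""
    lines = [
        "| Model | Tasks | Success | Fail | Rate |",
        "|-------|-------|---------|------|------|",
    ]
    for s in stats:
        lines.append(
            f"| {s.model} | {s.tasks} | {s.success} | {s.fail} | {s.rate:.1f}% |"
        )
    return "\n".join(lines)


def underperformers(stats: list[ModelStats], margin: float = 5.0) -> list[str]:
    """Return models trailing the best success rate by more than `margin`
    percentage points. The threshold is an assumption, not a skill setting."""
    if not stats:
        return []
    best = max(s.rate for s in stats)
    return [s.model for s in stats if best - s.rate > margin]


# Sample counts copied from the example dashboard above.
stats = [ModelStats("claude-sonnet-4", 22, 2), ModelStats("gpt-4.1", 10, 2)]
print(render_success_table(stats))
print("Flag for review:", underperformers(stats))
```

With the sample counts, the table rows match the example dashboard, and `gpt-4.1` is flagged because it trails the best model by roughly 8 points. What counts as "significantly lower" is left open by the skill, so the margin should be tuned to the task volume; with only a dozen tasks per model, a single failure moves the rate by several points.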
More from this repository
orchestrator (Subagent)
Decomposes goals into parallel tasks, assigns them to CLI coding agents, verifies output, and merges results. Use when a task is too large for a single agent.
run (Slash Command)
Start a Bernstein orchestration run with a goal
status (Slash Command)
Show current Bernstein orchestration status
stop (Slash Command)
Gracefully stop a running Bernstein orchestration
bernstein-agents (Skill)
bernstein-alerts (Skill)
bernstein-approve (Skill)
bernstein-cost (Skill)