massgen-log-analyzer

The massgen-log-analyzer skill runs MassGen multi-agent experiments with automated instrumentation and generates structured analysis reports. Use it to debug agent coordination patterns, measure performance bottlenecks, query Logfire traces hierarchically, and produce markdown ANALYSIS_REPORT.md files documenting experiment results and behavior across multiple turns and attempts.

Source repository: MassGen

Install in Claude Code

git clone --depth 1 https://github.com/massgen/MassGen /tmp/massgen-log-analyzer && cp -r /tmp/massgen-log-analyzer/massgen/skills/massgen-log-analyzer ~/.claude/skills/massgen-log-analyzer

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# MassGen Log Analyzer

This skill provides a structured workflow for running MassGen experiments and analyzing the resulting traces and logs using Logfire.

## Purpose

The log-analyzer skill helps you:
- Run MassGen experiments with proper instrumentation
- Query and analyze traces hierarchically
- Debug agent behavior and coordination patterns
- Measure performance and identify bottlenecks
- Improve the logging structure itself
- **Generate markdown analysis reports** saved to the log directory

## CLI Quick Reference

The `massgen logs` CLI provides quick access to log analysis:

### List Logs with Analysis Status
```bash
uv run massgen logs list                    # Show all recent logs with analysis status
uv run massgen logs list --analyzed         # Only logs with ANALYSIS_REPORT.md
uv run massgen logs list --unanalyzed       # Only logs needing analysis
uv run massgen logs list --limit 20         # Show more logs
```

### Generate Analysis Prompt
```bash
# Run from within your coding CLI (e.g., Claude Code) so it sees output
uv run massgen logs analyze                 # Analyze latest turn of latest log
uv run massgen logs analyze --log-dir PATH  # Analyze specific log
uv run massgen logs analyze --turn 1        # Analyze specific turn
```

The prompt output tells your coding CLI to use this skill on the specified log directory.

### Multi-Agent Self-Analysis
```bash
uv run massgen logs analyze --mode self                 # Run 3-agent analysis team (prompts if report exists)
uv run massgen logs analyze --mode self --force         # Overwrite existing report without prompting
uv run massgen logs analyze --mode self --turn 2        # Analyze specific turn
uv run massgen logs analyze --mode self --config PATH   # Use custom config
```

Self-analysis mode runs MassGen with multiple agents to analyze logs from different perspectives (correctness, efficiency, behavior) and produces a combined ANALYSIS_REPORT.md.

### Multi-Turn Sessions

MassGen log directories support multiple turns (coordination sessions). Each turn has its own `turn_N/` directory with attempts inside:

```text
log_YYYYMMDD_HHMMSS/
├── turn_1/                    # First coordination session
│   ├── ANALYSIS_REPORT.md     # Report for turn 1
│   ├── attempt_1/             # First attempt
│   └── attempt_2/             # Retry if orchestration restarted
├── turn_2/                    # Second coordination session (if multi-turn)
│   ├── ANALYSIS_REPORT.md     # Report for turn 2
│   └── attempt_1/
```

When analyzing, the `--turn` flag specifies which turn to analyze. Without it, the latest turn is analyzed.
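
The "latest turn" rule can be illustrated with a minimal sketch that works only from the directory layout shown above. The helper names `latest_turn` and `has_report` are hypothetical and are not part of the MassGen CLI; the sketch assumes turn directories are named exactly `turn_N` with an integer `N`.

```python
import re
from pathlib import Path
from typing import Optional

# Matches directory names such as "turn_1" or "turn_12" (assumed naming).
TURN_DIR = re.compile(r"^turn_(\d+)$")


def latest_turn(log_dir: Path) -> Optional[int]:
    """Return the highest N for which log_dir/turn_N/ exists, or None."""
    turns = []
    for child in log_dir.iterdir():
        match = TURN_DIR.match(child.name)
        if child.is_dir() and match:
            # Compare numerically so that turn_10 sorts after turn_2.
            turns.append(int(match.group(1)))
    return max(turns) if turns else None


def has_report(log_dir: Path, turn: int) -> bool:
    """True if the given turn already contains an ANALYSIS_REPORT.md."""
    return (log_dir / f"turn_{turn}" / "ANALYSIS_REPORT.md").is_file()
```

Calling `latest_turn(Path("log_YYYYMMDD_HHMMSS"))` and then `has_report(...)` on the result tells you whether the turn that would be analyzed by default already has a report.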

## When to Use Logfire vs Local Logs

**Use Local Log Files When:**
- Analyzing command patterns and repetition (commands are in `streaming_debug.log`)
- Checking detailed tool arguments and outputs (in `coordination_events.json`)
- Reading vote reasoning and agent decisions (in `agent_*/*/vote.json`)
- Viewing the coordination flow table (in `coordination_table.txt`)
- Getting cost/token summaries (in `metrics_summary.json`)

**Use Logfire When:**
- You need precise timing data with millisecond accuracy
- Analyzing span hierarchy and parent-child relationships
- Finding exceptions and error stack traces
- Creating shareable trace links for collaboration
- Querying across multiple sessions (e.g., "find all sessions with errors")
- Real-time monitoring of running experiments

**Rate Limiting:** If Logfire returns a rate limit error, **wait up to 60 seconds and retry** rather than falling back to local logs. The rate limit resets quickly and Logfire data is worth waiting for when timing/hierarchy analysis is needed.
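
The wait-and-retry advice can be sketched as a small wrapper. `RateLimitError` and `run_query` are hypothetical placeholders: substitute whatever exception and query call your Logfire client actually exposes. The 10-second step is an arbitrary choice; only the 60-second ceiling comes from the guidance above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error raised on a Logfire rate limit."""


def query_with_retry(
    run_query: Callable[[], T],
    max_wait_s: float = 60.0,
    step_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry run_query on rate limits, waiting at most max_wait_s in total."""
    waited = 0.0
    while True:
        try:
            return run_query()
        except RateLimitError:
            if waited >= max_wait_s:
                # Wait budget exhausted: surface the error to the caller.
                raise
            pause = min(step_s, max_wait_s - waited)
            sleep(pause)
            waited += pause
```

Only after this wrapper re-raises would it make sense to fall back to local logs. The `sleep` parameter exists so the wrapper can be exercised without real delays.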

**Key Local Log Files:**

| File | Contains |
|------|----------|
| `status.json` | Real-time status with **agent reliability metrics** (enforcement events, buffer loss) |
| `metrics_summary.json` | Cost, tokens, tool stats, round history |
| `coordination_events.json` | Full event timeline with tool calls |
| `coordination_table.txt` | Human-readable coordination flow |
| `streaming_debug.log` | Raw streaming data including command strings |
| `agent_*/*/vote.json` | Vote reasoning and context |
| `agent_*/*/execution_trace.md` | **Full tool calls, arguments, results, and reasoning** - invaluable for debugging |
| `execution_metadata.yaml` | Config and session metadata |
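
As a rough sketch, the table above can be turned into an inventory check that reports which of these files exist under a log directory. It assumes the files sit somewhere beneath the directory you pass in (the exact nesting is not specified here), so it searches recursively. The function name `inventory` is hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# File names and glob patterns taken from the table of key local log files.
KEY_PATTERNS = [
    "status.json",
    "metrics_summary.json",
    "coordination_events.json",
    "coordination_table.txt",
    "streaming_debug.log",
    "agent_*/*/vote.json",
    "agent_*/*/execution_trace.md",
    "execution_metadata.yaml",
]


def inventory(root: Path) -> Dict[str, List[Path]]:
    """Map each key pattern to the matching paths found anywhere under root."""
    return {pattern: sorted(root.rglob(pattern)) for pattern in KEY_PATTERNS}


def missing(root: Path) -> List[str]:
    """Return the patterns that matched no file under root."""
    return [pattern for pattern, paths in inventory(root).items() if not paths]
```

Running `missing(log_dir)` before an analysis is a quick way to see which local sources are unavailable for that run.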

**Execution Traces (`execution_trace.md`):**
These are the most detailed debug artifacts. Each agent snapshot includes an execution trace with:
- Complete tool calls with full arguments (not truncated)
- Full tool results (not truncated)
- Reasoning/thinking blocks from the model
- Timestamps and round markers

Use execution traces when you need to understand exactly what an agent did and why - they capture everything the agent saw and produced during that answer/vote iteration.

**Enforcement Reliability (`status.json`):**
The `status.json` file includes per-agent reliability metrics that track workflow enforcement events:

```json
{
  "agents": {
    "agent_a": {
      "reliability": {
        "enforcement_attempts": [
          {
            "round": 0,
            "attempt": 1,
            "max_attempts": 3,
            "reason": "no_workflow_tool",
            "tool_calls": ["search", "read_file"],
            "error_message": "Must use workflow tools",
            "buffer_preview": "First 500 chars of lost content...",
            "buffer_chars": 1500,
            "timestamp": 1736683468.123
          }
        ],
        "by_round": {"0": {"count": 2, "reasons": ["no_workflow_tool", "invalid_vote_id"]}},
        "unknown_tools": ["execute_command"],
        "workflow_errors": ["invalid_vote_id"],
        "total_enforcement_retries": 2,
        "total_buffer_chars_lost": 3000,
        "outcome": "ok"
      }
    }
  }
}
```
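
A minimal sketch for condensing these metrics into one line per agent, assuming `status.json` follows the shape of the example above. Only fields that appear in that example are read, and every lookup falls back to a default in case a field is absent. The function name `summarize_reliability` is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Dict


def summarize_reliability(status: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Summarize enforcement retries, buffer loss, and reasons per agent."""
    summary: Dict[str, Dict[str, Any]] = {}
    for agent_id, agent in status.get("agents", {}).items():
        reliability = agent.get("reliability") or {}
        # Count how often each enforcement reason code occurred.
        reasons = Counter(
            attempt.get("reason", "unknown")
            for attempt in reliability.get("enforcement_attempts", [])
        )
        summary[agent_id] = {
            "retries": reliability.get("total_enforcement_retries", 0),
            "chars_lost": reliability.get("total_buffer_chars_lost", 0),
            "outcome": reliability.get("outcome"),
            "reasons": dict(reasons),
        }
    return summary


def load_status(path: Path) -> Dict[str, Any]:
    """Read a status.json file from disk."""
    return json.loads(path.read_text(encoding="utf-8"))
```

For example, `summarize_reliability(load_status(Path("status.json")))` gives a compact view of which agents lost buffered content and why.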

**Enforcement Reason Codes:**
| Reason | Description |
|--------|-------------|
| `no_workflow_tool` | Agent called tools but none were `vote` or `new_answer` |
| `no_tool_calls` | Agent provided text-onl

From the same repository

audio-generation (Skill)

Guide to audio generation and understanding in MassGen. Covers text-to-speech, music, sound effects, and audio understanding across ElevenLabs and OpenAI backends.

backend-integrator (Skill)

Complete guide for integrating a new LLM backend into MassGen. Use when adding a new provider (e.g., Codex, Mistral, DeepSeek) or when auditing an existing backend for missing integration points. Covers all ~15 files that need touching.

evolving-skill-creator (Skill)

Guide for creating evolving skills - detailed workflow plans that capture what you'll do, what tools you'll create, and learnings from execution. Use this when starting a new task that could benefit from a reusable workflow.

file-search (Skill)

This skill should be used when agents need to search codebases for text patterns or structural code patterns. Provides fast search using ripgrep for text and ast-grep for syntax-aware code search.

image-generation (Skill)

Guide to image generation and editing in MassGen. Use when creating images, editing existing images, iterating on image designs, or choosing between image backends (OpenAI, Google Gemini/Imagen, Grok, OpenRouter).

massgen-config-creator (Skill)

Guide for creating properly structured YAML configuration files for MassGen. This skill should be used when agents need to create new configs for examples, case studies, testing, or demonstrating features.

massgen-develops-massgen (Skill)

Guide for using MassGen to develop and improve itself. This skill should be used when agents need to run MassGen experiments programmatically (using automation mode) OR analyze terminal UI/UX quality (using visual evaluation tools). These are mutually exclusive workflows for different improvement goals.

massgen-release-documenter (Skill)

Guide for following MassGen's release documentation workflow. This skill should be used when preparing release documentation, updating changelogs, writing case studies, or maintaining project documentation across releases.