MassGen Develops MassGen provides two workflows for testing and improving MassGen. Use Automation Mode for programmatic backend testing: it runs without the terminal UI, emits parseable output free of ANSI codes, prints the log directory path on the first line, writes a status.json file for monitoring, and returns exit codes that distinguish configuration, execution, timeout, and interruption errors. Workspace isolation makes parallel runs safe. Use Visual Evaluation for terminal UI testing.
git clone --depth 1 https://github.com/massgen/MassGen /tmp/massgen-develops-massgen && cp -r /tmp/massgen-develops-massgen/massgen/skills/massgen-develops-massgen ~/.claude/skills/massgen-develops-massgen
# MassGen Develops MassGen
This skill provides guidance for using MassGen to develop and improve itself. Choose the appropriate workflow based on what you're testing.
## Two Workflows
1. **Automation Mode** - Test backend functionality, coordination logic, agent responses
2. **Visual Evaluation** - Test terminal display, colors, layout, UX
---
## Workflow 1: Automation Mode
Use this to test functionality without visual inspection. Ideal for programmatic testing.
### Running MassGen with Automation
Run MassGen in the background (exact mechanism depends on your tooling):
```bash
uv run massgen --automation --config massgen/configs/basic/multi/two_agents_gemini.yaml "What is 2+2?"
```
**For MassGen agents**: Use `custom_tool__start_background_tool` targeting `mcp__command_line__execute_command`, then poll with `custom_tool__get_background_tool_status` / `custom_tool__get_background_tool_result`.
**For Claude Code**: Use Bash tool's `run_in_background` parameter.
### Why Automation Mode
| Feature | Benefit |
|---------|---------|
| Clean output | ~10 parseable lines vs 3,000+ ANSI codes |
| LOG_DIR printed | First line shows log directory path |
| status.json | Real-time monitoring file |
| Exit codes | 0=success, 1=config, 2=execution, 3=timeout, 4=interrupted |
| Workspace isolation | Safe parallel execution |
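The exit codes in the table can be turned into readable labels when a script wraps a MassGen run. The following is a minimal sketch; the `describe_exit_code` helper and the label wording are illustrative assumptions, and only the numeric codes come from the table above.

```python
# Exit codes as listed in the feature table; the label text is illustrative.
EXIT_CODE_MEANINGS = {
    0: "success",
    1: "configuration error",
    2: "execution error",
    3: "timeout",
    4: "interrupted",
}


def describe_exit_code(code: int) -> str:
    """Return a short label for an automation-mode exit code (hypothetical helper)."""
    return EXIT_CODE_MEANINGS.get(code, f"unknown ({code})")


if __name__ == "__main__":
    for code in (0, 3, 7):
        print(code, describe_exit_code(code))
```

A wrapper would typically pass the `returncode` of the finished process to this helper before deciding whether to read results or inspect logs.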
### Expected Output
```
LOG_DIR: .massgen/massgen_logs/log_20251120_143022_123456
STATUS: .massgen/massgen_logs/log_20251120_143022_123456/status.json
🤖 Multi-Agent Mode
Agents: gemini-2.5-pro1, gemini-2.5-pro2
Question: What is 2+2?
============================================================
QUESTION: What is 2+2?
[Coordination in progress - monitor status.json for real-time updates]
WINNER: gemini-2.5-pro1
DURATION: 33.4s
ANSWER_PREVIEW: The answer is 4.
COMPLETED: 2 agents, 35.2s total
```
Parse `LOG_DIR` from the first line to find the log directory.
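One way to extract the log directory from captured output is sketched below. It assumes the `LOG_DIR: <path>` line format shown in the expected output; `parse_log_dir` is a hypothetical helper, not part of MassGen.

```python
from pathlib import Path
from typing import Optional


def parse_log_dir(output: str) -> Optional[Path]:
    """Return the path from the first `LOG_DIR:` line, or None if no such line exists."""
    for line in output.splitlines():
        if line.startswith("LOG_DIR:"):
            # Split only on the first colon so the rest of the line is kept intact.
            return Path(line.split(":", 1)[1].strip())
    return None


# Sample text copied from the expected output above.
sample = (
    "LOG_DIR: .massgen/massgen_logs/log_20251120_143022_123456\n"
    "STATUS: .massgen/massgen_logs/log_20251120_143022_123456/status.json\n"
)
log_dir = parse_log_dir(sample)
print(log_dir)
```

Scanning every line rather than only the first keeps the helper tolerant of any leading output your tooling might add.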
### Monitoring Progress
Read the status.json file (updated every 2 seconds):
```bash
cat .massgen/massgen_logs/log_20251120_143022_123456/status.json
```
**Key fields:**
```json
{
  "coordination": {
    "completion_percentage": 65,
    "phase": "enforcement"
  },
  "results": {
    "winner": null  // null = running, "agent_id" = done
  },
  "agents": {
    "agent_a": {
      "status": "streaming",
      "error": null
    }
  }
}
```
**Agent status values:** `waiting`, `streaming`, `answered`, `voted`, `completed`, `error`
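The key fields can be condensed into a small summary for polling scripts. This sketch assumes only the structure shown above (`coordination`, `results.winner`, and per-agent `status` and `error`); `summarize_status` is a hypothetical helper, and a real status.json may contain more fields.

```python
import json


def summarize_status(status: dict) -> dict:
    """Condense a parsed status.json into the fields a poller usually needs."""
    coordination = status.get("coordination", {})
    winner = status.get("results", {}).get("winner")
    agents = status.get("agents", {})
    # An agent counts as failed if its status is "error" or its error field is set.
    failed = sorted(
        agent_id
        for agent_id, info in agents.items()
        if info.get("status") == "error" or info.get("error")
    )
    return {
        "done": winner is not None,  # winner stays null while the run is in progress
        "winner": winner,
        "percent": coordination.get("completion_percentage", 0),
        "phase": coordination.get("phase", "unknown"),
        "failed_agents": failed,
    }


# Same data as the key-fields example, without the inline comment.
sample_status = json.loads("""
{
  "coordination": {"completion_percentage": 65, "phase": "enforcement"},
  "results": {"winner": null},
  "agents": {"agent_a": {"status": "streaming", "error": null}}
}
""")
summary = summarize_status(sample_status)
print(summary)
```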
### Reading Results
After completion (exit code 0):
```bash
# Read final answer
cat [log_dir]/final/[winner]/answer.txt
```
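The same lookup can be done from Python once the log directory and winner are known. This is a sketch that assumes the `final/<winner>/answer.txt` layout from the command above; both helper names are hypothetical, and the demo writes to a temporary directory rather than a real log.

```python
import tempfile
from pathlib import Path
from typing import Optional


def final_answer_path(log_dir, winner: str) -> Path:
    """Build the path to the winner's answer file under the log directory."""
    return Path(log_dir) / "final" / winner / "answer.txt"


def read_final_answer(log_dir, winner: str) -> Optional[str]:
    """Return the answer text, or None if the file has not been written."""
    path = final_answer_path(log_dir, winner)
    return path.read_text(encoding="utf-8") if path.exists() else None


# Demo against a throwaway directory that mimics the log layout.
with tempfile.TemporaryDirectory() as tmp:
    answer_file = final_answer_path(tmp, "agent_a")
    answer_file.parent.mkdir(parents=True)
    answer_file.write_text("The answer is 4.", encoding="utf-8")
    found = read_final_answer(tmp, "agent_a")
    missing = read_final_answer(tmp, "agent_b")

print(found)
```

Returning `None` for a missing file lets a caller distinguish "not finished yet" from an empty answer.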
### Timing Expectations
- **Standard tasks**: 2-10 minutes
- **Complex/meta tasks**: 10-30 minutes
- **Check if stuck**: Read status.json - if `completion_percentage` increases, it's working
### Advanced: Multiple Background Monitors
You can create multiple background monitoring tasks that run independently alongside the main MassGen process. Each monitor can track different aspects and write to separate log files for later inspection.
#### Approach
Create small Python scripts that run in background shells. Each script:
- Monitors a specific aspect (tokens, errors, progress, coordination, etc.)
- Writes timestamped data to its own log file
- Runs in a loop with `sleep()` intervals
- Can be checked anytime without blocking the main task
#### Example Monitor Scripts
**Token Usage Monitor** (`token_monitor.py`):
```python
import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])  # Pass LOG_DIR as argument

while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        with open("token_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            log.write(f"Tokens: {data.get('total_tokens_used', 0)}\n")
            log.write(f"Cost: ${data.get('total_cost', 0):.4f}\n\n")
    time.sleep(5)
```
**Error Monitor** (`error_monitor.py`):
```python
import time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])

while True:
    if log_dir.exists():
        with open("error_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            errors = []
            for logfile in log_dir.glob("*.log"):
                with open(logfile) as f:
                    for line in f:
                        if any(x in line.lower() for x in ['error', 'warning', 'failed']):
                            errors.append(line.strip())
            # Keep only the five most recent matching lines
            log.write('\n'.join(errors[-5:]) if errors else "No errors\n")
            log.write("\n")
    time.sleep(5)
```
**Progress Monitor** (`progress_monitor.py`):
```python
import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])

while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        with open("progress_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            # completion_percentage is nested under "coordination" in status.json
            progress = data.get('coordination', {}).get('completion_percentage', 0)
            # "streaming" is the listed status for an agent that is producing output
            active = sum(1 for a in data.get('agents', {}).values()
                         if a.get('status') == 'streaming')
            log.write(f"Progress: {progress}% Active agents: {active}\n\n")
    time.sleep(5)
```
**Coordination Monitor** (`coordination_monitor.py`):
```python
import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])

while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        coord = data.get('coordination', {})
        with open("coordination_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            log.write(f"Phase: {coord.get('phase', 'unknown')}\n")
            log.write(f"Round: {coord.get('round', 0)}\n")
            log.write(f"Total answers: {coord.get('total_answers', 0)}\n\n")
    time.sleep(5)
```
#### Workflow
1. **Launch** MassGen in the background with `--automation`.