ClaudeWave
Skill · 1.4k repo stars · updated 28d ago

machine-ops

The machine-ops Claude Code skill provides lifecycle management (restart, pause, resume), real-time metrics analysis, and backup operations for server instances. Load this skill when diagnosing server health issues, monitoring system performance trends, or managing machine state changes. It includes a structured diagnostic framework that progressively investigates SSH connectivity, resource utilization, Docker health, system logs, and network status to identify root causes.

Install in Claude Code
git clone --depth 1 https://github.com/nixopus/nixopus /tmp/machine-ops && cp -r /tmp/machine-ops/api/skills/machine-ops ~/.claude/skills/machine-ops
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# Machine Operations

## Lifecycle Management
You can check and control the machine instance state:
- get_machine_lifecycle_status → current state (Running, Paused, Stopped), PID, uptime
- restart_machine → restart the instance (requires user approval)
- pause_machine → pause the instance (requires user approval)
- resume_machine → resume a paused instance (requires user approval)

Always check get_machine_lifecycle_status before performing restart/pause/resume.
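
The check-before-acting rule can be sketched as a small guard. This is a minimal illustration, not the skill's implementation: the state names come from the list above, while `ALLOWED_FROM`, `can_apply`, and the mapping of which action is valid from which state are assumptions to adjust to the real platform behavior.

```python
# Hypothetical guard: which lifecycle actions make sense from which state.
# State names follow get_machine_lifecycle_status; the mapping is an assumption.
ALLOWED_FROM = {
    "restart_machine": {"Running"},
    "pause_machine": {"Running"},
    "resume_machine": {"Paused"},
}


def can_apply(action: str, current_state: str) -> bool:
    """Return True if `action` is sensible given the reported state."""
    return current_state in ALLOWED_FROM.get(action, set())
```

A caller would fetch the state first, then ask for user approval only when `can_apply` returns True.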

## Metrics & Events
- get_machine_metrics → historical time-series metrics (CPU, memory, disk, network)
- get_machine_metrics_summary → summarized averages, peaks, and trends
- get_machine_events → lifecycle events (restarts, failures, state changes)

Use metrics for trend analysis and incident correlation. Use get_machine_stats for a point-in-time snapshot.
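
To make "averages, peaks, and trends" concrete, here is a rough sketch of how a single metric series could be summarized. How get_machine_metrics_summary actually computes its output is not documented here; the `summarize` function, its input shape (a plain list of samples, oldest first), and the half-versus-half trend heuristic are all assumptions.

```python
from statistics import mean


def summarize(samples: list[float]) -> dict:
    """Average, peak, and a crude trend for one metric series (oldest first)."""
    if not samples:
        raise ValueError("no samples")
    half = len(samples) // 2
    # Compare the mean of the newer half against the mean of the older half.
    older = mean(samples[:half]) if half else samples[0]
    newer = mean(samples[half:])
    delta = newer - older
    trend = "rising" if delta > 0 else "falling" if delta < 0 else "flat"
    return {"avg": mean(samples), "peak": max(samples), "trend": trend}
```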

## Backups
- get_backup_schedule → current backup schedule configuration
- update_backup_schedule → modify backup frequency, retention, timing
- list_machine_backups → list available backups with timestamps and status
- trigger_machine_backup → create an immediate backup (requires approval)
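
One way to decide whether an immediate backup is worth requesting is to check the age of the latest successful one. The sketch below assumes each entry from list_machine_backups is a dict with a `created_at` datetime and a `status` string, and that a successful backup is marked `"completed"`; those field names and values are hypothetical, since the response shape is not shown here.

```python
from datetime import datetime, timedelta


def needs_backup(backups: list[dict], max_age: timedelta, now: datetime) -> bool:
    """True when no successful backup is newer than `max_age`."""
    # Field names and the "completed" status value are assumptions.
    ok = [b["created_at"] for b in backups if b["status"] == "completed"]
    return not ok or now - max(ok) > max_age
```

If this returns True, trigger_machine_backup would still need user approval as noted above.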

## Diagnostic Layers (IN ORDER, stop on root cause)
1. get_servers_ssh_status → reachable?
2. get_machine_stats → CPU, RAM, disk, load, uptime
3. Anomalies (commands run via host_exec):
   - mem > 90% → "ps aux --sort=-%mem | head -20"
   - disk > 85% → "du -sh /var/lib/docker/* 2>/dev/null | sort -rh | head -10"
   - cpu > 80% → "ps aux --sort=-%cpu | head -20"
   - load > 2x cores → overloaded
4. Docker → host_exec "systemctl status docker --no-pager", "docker info 2>&1 | head -30"
5. System logs → host_exec "dmesg | tail -30", "journalctl -u docker --since '30 min ago' --no-pager | tail -50"
6. Proxy/domain: follow domain-tls-routing skill. Caddy status/logs/validate via host_exec. For domain CRUD or reachability checks, defer to Infrastructure Agent.
7. Network → host_exec "ss -tlnp"
8. Cleanup → host_exec "docker system df"
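
The anomaly thresholds in step 3 can be sketched as a lookup from a stats snapshot to follow-up commands. The thresholds and command strings are copied from the list above; the function names and the `mem_pct`, `disk_pct`, and `cpu_pct` keys are hypothetical, since the shape of the get_machine_stats response is not shown here.

```python
def follow_ups(stats: dict) -> list[str]:
    """Return the host_exec commands suggested by step 3 for this snapshot."""
    commands = []
    if stats["mem_pct"] > 90:
        commands.append("ps aux --sort=-%mem | head -20")
    if stats["disk_pct"] > 85:
        commands.append("du -sh /var/lib/docker/* 2>/dev/null | sort -rh | head -10")
    if stats["cpu_pct"] > 80:
        commands.append("ps aux --sort=-%cpu | head -20")
    return commands


def overloaded(load: float, cores: int) -> bool:
    """Load average above twice the core count counts as overloaded."""
    return load > 2 * cores
```

An empty list from `follow_ups` together with `overloaded` returning False means step 3 found nothing, and the investigation moves on to the Docker layer.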

Root cause: bold summary, evidence in code block, fix in 1-2 sentences.
No anomalies: report healthy with key metrics.
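
The root-cause report shape (bold summary, evidence in a code block, short fix) could be assembled as below. This is only an illustration of the layout; `format_report` and its parameters are hypothetical names.

```python
def format_report(summary: str, evidence: str, fix: str) -> str:
    """Bold summary, evidence in a fenced code block, then a short fix."""
    fence = "`" * 3  # built at runtime so this example stays a single code block
    return f"**{summary}**\n\n{fence}\n{evidence}\n{fence}\n\n{fix}"
```
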
api-catalog · Skill

Reference for all Nixopus API operations callable via nixopus_api(method, path, body)

caddyfile-generation · Skill

Generate Caddyfile configurations for static sites and reverse proxies — SPA fallback routing, cache headers, compression, redirects, and error pages. Use when deploying a static site that needs custom Caddy configuration, or when the user needs SPA routing, caching, or redirect rules.

compose-setup · Skill

Generate docker-compose.yml for multi-service setups including databases, caches, and service dependencies. Use when the app needs a database, cache, message broker, or has multiple independently deployable services.

container-resource-tuning · Skill

Size container memory and CPU limits, diagnose OOM kills and CPU throttling, and recommend resource adjustments by ecosystem. Use when containers are being OOM-killed, running slowly, or when setting initial resource limits for a deployment.

cpp-deploy · Skill

Build and deploy C/C++ applications — CMake, Meson, Ninja, and Dockerfile patterns. Use when deploying a C or C++ project, or when CMakeLists.txt or meson.build is detected.

database-migration · Skill

Run database migrations safely during deployment — framework-specific commands, pre-deploy vs post-deploy timing, health gates, and rollback strategies. Use when the app has a database migration system and needs migrations run during deployment.

deno-deploy · Skill

Build and deploy Deno applications — version detection, dependency caching, and Dockerfile patterns. Use when deploying a Deno project, or when deno.json or deno.jsonc is detected.

deploy-delegation · Skill

Sub-agent routing table — which agent handles diagnostics, machine health, infrastructure, GitHub, billing, and notifications. Load when the current task is not a direct deployment.