Skip to main content
ClaudeWave
Skill0 repo starsupdated yesterday

vmware-avi

>

Install in Claude Code
Copy
git clone --depth 1 https://github.com/zw008/VMware-AVI /tmp/vmware-avi && cp -r /tmp/vmware-avi/skills/vmware-avi ~/.claude/skills/vmware-avi
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# VMware AVI

> **Disclaimer**: This is a community-maintained open-source project and is **not affiliated with, endorsed by, or sponsored by VMware, Inc. or Broadcom Inc.** "VMware", "NSX", and "AVI" are trademarks of Broadcom. Source code is publicly auditable at [github.com/zw008/VMware-AVI](https://github.com/zw008/VMware-AVI) under the MIT license.

AVI (NSX Advanced Load Balancer) application delivery and AKO Kubernetes operations — 28 MCP tools.

> **Dual mode**: Traditional AVI Controller management + AKO K8s operations in one skill.
> **Family**: [vmware-aiops](https://github.com/zw008/VMware-AIops) (VM lifecycle), [vmware-monitor](https://github.com/zw008/VMware-Monitor) (inventory/health), [vmware-storage](https://github.com/zw008/VMware-Storage) (iSCSI/vSAN), [vmware-vks](https://github.com/zw008/VMware-VKS) (Tanzu Kubernetes), [vmware-nsx](https://github.com/zw008/VMware-NSX) (NSX networking), [vmware-nsx-security](https://github.com/zw008/VMware-NSX-Security) (DFW/firewall), [vmware-aria](https://github.com/zw008/VMware-Aria) (metrics/alerts/capacity), [vmware-harden](https://github.com/zw008/VMware-Harden) (compliance baselines).
> | [vmware-pilot](../vmware-pilot/SKILL.md) (workflow orchestration) | [vmware-policy](../vmware-policy/SKILL.md) (audit/policy)

## What This Skill Does

| Category | Tools | Count |
|----------|-------|:-----:|
| **Virtual Service** | list, status, enable/disable | 3 |
| **Pool Member** | pool discovery, member list, enable/disable member (drain/restore traffic) | 4 |
| **SSL Certificate** | list, expiry check | 2 |
| **Analytics** | VS metrics overview, request error logs | 2 |
| **Service Engine** | list, health check | 2 |
| **AKO Pod Ops** | status, logs, restart, version info | 4 |
| **AKO Config** | values.yaml view, Helm diff, Helm upgrade | 3 |
| **Ingress Diagnostics** | annotation validation, VS mapping, error diagnosis, fix recommendation | 4 |
| **Sync Diagnostics** | K8s-Controller comparison, inconsistency list, force resync | 3 |
| **Multi-cluster** | cluster list, cross-cluster AKO overview, AMKO status | 3 |

## Quick Install

```bash
uv tool install vmware-avi
vmware-avi doctor            # checks Controller connectivity + kubeconfig + avisdk
```

## When to Use This Skill

- List, enable, or disable virtual services on AVI Controller
- Add, remove, drain, or restore pool members (maintenance windows, rolling deployments)
- Check SSL certificate expiry across all virtual services
- View VS analytics — throughput, latency, error rates, request logs
- Check service engine status (inventory-based) and per-SE VS placement counts
- Troubleshoot AKO pods — status, logs, restarts
- Manage AKO Helm configuration — view, diff, upgrade values.yaml
- Validate Ingress annotations and diagnose why a VS wasn't created as expected
- Detect sync drift between K8s resources and AVI Controller objects
- Get a cross-cluster view of AKO deployments and AMKO status

**Use companion skills for**:
- VM lifecycle, deployment, guest ops → `vmware-aiops`
- NSX segments, gateways, NAT → `vmware-nsx`
- DFW firewall rules, security groups → `vmware-nsx-security`
- K8s cluster lifecycle (Supervisor, TKC) → `vmware-vks`
- Read-only vSphere monitoring → `vmware-monitor`

## Related Skills — Skill Routing

| User Intent | Recommended Skill |
|-------------|------------------|
| Load balancer, VS, pool, AVI, ALB, AKO | **vmware-avi** ← this skill |
| VM lifecycle, deployment, guest ops | **vmware-aiops** (`uv tool install vmware-aiops`) |
| Read-only vSphere monitoring | **vmware-monitor** (`uv tool install vmware-monitor`) |
| Storage: iSCSI, vSAN, datastores | **vmware-storage** (`uv tool install vmware-storage`) |
| NSX networking: segments, gateways, NAT | **vmware-nsx** (`uv tool install vmware-nsx-mgmt`) |
| NSX security: DFW rules, security groups | **vmware-nsx-security** (`uv tool install vmware-nsx-security`) |
| Tanzu Kubernetes (Supervisor/TKC) | **vmware-vks** (`uv tool install vmware-vks`) |
| Aria Ops: metrics, alerts, capacity | **vmware-aria** (`uv tool install vmware-aria`) |
| Multi-step workflows with approval | **vmware-pilot** |
| Compliance baselines (CIS / 等保 / PCI-DSS), drift detection, LLM remediation advisor | **vmware-harden** (`uv tool install vmware-harden`) |
| Audit log query | **vmware-policy** (`vmware-audit` CLI) |

## Common Workflows

### Maintenance Window — Drain a Pool Member

**Pre-flight (judgment — affects live traffic)**:
- Capacity check: pool must have ≥ 2 healthy members. Disabling the only-other-healthy member is a self-DoS. Verify with `pool members my-pool` first.
- Connection persistence: if VS uses session persistence (cookie/source-IP), existing sessions stay pinned to the disabled member until they expire. "Drain" is not instant — 5-30 min depending on persistence TTL.
- Long-lived connections: WebSocket/streaming sessions can hold for hours. Decide upfront: hard-disconnect (faster, user-visible) or wait (slower, transparent).
- Observability: enable analytics on the VS BEFORE disabling — you need the baseline to detect degradation.

**Steps**:
1. `pool members my-pool` → confirm ≥ 2 healthy members and identify session persistence config
2. `pool disable my-pool <server-ip>` (graceful drain — new connections stop, existing finish)
3. `analytics my-vs --duration 15m` → watch active connection count to the drained member trend toward zero
4. Perform maintenance only after active connections = 0 (or you've decided to hard-disconnect)
5. `pool enable my-pool <server-ip>` → re-enable
6. **Verify** before declaring success: health monitor passes (typically 30-90 sec) AND new connections are landing on the member (analytics drill-down)

### AKO Ingress Not Creating VS

**Judgment**: this is a layered failure — figure out which layer broke before randomly probing. AKO is a controller; like all K8s controllers, the failure modes are: (a) controller down, (b) controller running but seeing wrong inputs, (c) controll