Skill | 390 repo stars | updated 7mo ago

load-balancing-patterns

This skill guides selection and implementation of load balancing solutions across infrastructure layers, covering Layer 4 and Layer 7 routing approaches, health check configuration, session management strategies, and deployment patterns. Use it when distributing traffic across multiple servers, implementing high availability and failover mechanisms, routing based on application-level criteria like URLs or headers, managing session persistence, deploying to Kubernetes, or configuring multi-region traffic management and zero-downtime deployment strategies.

Repository: ai-design-components

Install in Claude Code

git clone --depth 1 https://github.com/ancoleman/ai-design-components /tmp/load-balancing-patterns && cp -r /tmp/load-balancing-patterns/skills/load-balancing-patterns ~/.claude/skills/load-balancing-patterns

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Load Balancing Patterns

Distribute traffic across infrastructure using the appropriate load balancing approach, from simple round-robin to global multi-region failover.

## When to Use This Skill

Use load-balancing-patterns when:
- Distributing traffic across multiple application servers
- Implementing high availability and failover
- Routing traffic based on URLs, headers, or geographic location
- Managing session persistence across stateless backends
- Deploying applications to Kubernetes clusters
- Configuring global traffic management across regions
- Implementing zero-downtime deployments (blue-green, canary)
- Selecting between cloud-managed and self-managed load balancers

## Core Load Balancing Concepts

### Layer 4 vs Layer 7

**Layer 4 (L4) - Transport Layer:**
- Routes based on IP address and port (TCP/UDP packets)
- No application data inspection, lower latency, higher throughput
- Protocol agnostic; can preserve client IP addresses (for example, in pass-through mode)
- Use for: Database connections, video streaming, gaming, financial transactions, non-HTTP protocols

**Layer 7 (L7) - Application Layer:**
- Routes based on HTTP URLs, headers, cookies, request body
- Full application data visibility, SSL/TLS termination, caching, WAF integration
- Content-based routing capabilities
- Use for: Web applications, REST APIs, microservices, GraphQL endpoints, complex routing logic
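
The difference in what each layer can see is easiest to show side by side. The following is a minimal Python sketch, not a production balancer: the backend addresses, pool names, the `X-Canary` header, and the function names `l4_pick` and `l7_pick` are all hypothetical. The L4 function decides from the connection tuple alone, while the L7 function needs the parsed HTTP path and headers.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical backend pool for the L4 example.
BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

# Hypothetical L7 routing table: path prefix -> pool name.
L7_ROUTES = [("/api/", "api-pool"), ("/static/", "static-pool")]


@dataclass(frozen=True)
class Connection:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


def l4_pick(conn: Connection, backends=BACKENDS) -> str:
    """L4: choose a backend from the connection tuple only; no payload is read."""
    key = (
        f"{conn.src_ip}:{conn.src_port}-"
        f"{conn.dst_ip}:{conn.dst_port}-{conn.protocol}"
    )
    digest = hashlib.sha256(key.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def l7_pick(path: str, headers: dict[str, str]) -> str:
    """L7: choose a pool from application data (headers first, then URL path)."""
    if headers.get("X-Canary") == "1":
        return "canary-pool"
    for prefix, pool in L7_ROUTES:
        if path.startswith(prefix):
            return pool
    return "default-pool"
```

Because `l4_pick` hashes the whole tuple, every packet of one connection maps to the same backend without the balancer tracking application state. `l7_pick` can only run after the request has been parsed, which is where the extra latency of L7 comes from.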

For detailed comparison including performance benchmarks and hybrid approaches, see `references/l4-vs-l7-comparison.md`.

### Load Balancing Algorithms

| Algorithm | Distribution Method | Use Case |
|-----------|-------------------|----------|
| **Round Robin** | Sequential | Stateless, similar servers |
| **Weighted Round Robin** | Capacity-based | Different server specs |
| **Least Connections** | Fewest active connections | Long-lived connections |
| **Least Response Time** | Fastest server | Performance-sensitive |
| **IP Hash** | Client IP-based | Session persistence |
| **Resource-Based** | CPU/memory metrics | Varying workloads |
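
To make the table concrete, here is a rough Python sketch of four of these algorithms. The class and function names are illustrative assumptions, and the weighted variant shown is the "smooth" weighted round robin, which is one common way to interleave picks rather than the only one. Real load balancers add locking, health state, and connection tracking that this sketch omits.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed repeating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Smooth weighted round robin: picks are proportional to weight and interleaved."""

    def __init__(self, weights):
        self.weights = dict(weights)           # server -> configured weight
        self.current = {s: 0 for s in weights}  # server -> running score
        self.total = sum(self.weights.values())

    def pick(self):
        # Raise every score by its weight, take the highest, then penalize it.
        for server, weight in self.weights.items():
            self.current[server] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


class LeastConnections:
    """Send each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


def ip_hash(client_ip, servers):
    """Map a client IP to a stable server choice (simple modulo hashing)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Note that plain modulo hashing, as in `ip_hash`, remaps most clients whenever the server list changes; consistent hashing is the usual remedy when that matters.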

### Health Check Types

**Shallow (Liveness):** Is the process alive?
- Endpoint: `/health/live` or `/live`
- Returns: 200 if process running
- Use for: Process monitoring, container health

**Deep (Readiness):** Can the service handle requests?
- Endpoint: `/health/ready` or `/ready`
- Validates: Database, cache, external API connectivity
- Use for: Load balancer routing decisions
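
The split between the two check types could be sketched as follows. This is a framework-free illustration: the function names, the response bodies, and the choice of `503` for "not ready" are assumptions rather than anything the skill prescribes, and the dependency checks are passed in as plain callables so the logic can be wired to any web framework.

```python
from typing import Callable


def liveness() -> tuple[int, dict]:
    """Shallow check: if this code runs at all, the process is alive."""
    return 200, {"status": "alive"}


def readiness(checks: dict[str, Callable[[], bool]]) -> tuple[int, dict]:
    """Deep check: every dependency probe must pass before traffic is accepted."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A probe that raises is treated as a failed dependency.
            results[name] = False
    ok = all(results.values())
    status = 200 if ok else 503  # 503 is an assumed convention for "not ready"
    return status, {"status": "ready" if ok else "not ready", "checks": results}
```

A handler for `/health/ready` might call `readiness({"database": ping_db, "cache": ping_cache})`, where `ping_db` and `ping_cache` are hypothetical probes supplied by the application. Keeping the liveness path free of dependency calls avoids restarting healthy processes just because a database is briefly unreachable.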

**Health Check Hysteresis:** Different thresholds for marking up vs down to prevent flapping
- Example: 3 failures to mark down, 2 successes to mark up
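
The example thresholds above can be expressed as a small state tracker. This is a sketch under assumptions: the class name `HealthTracker` is hypothetical, the parameter names `fall` and `rise` borrow HAProxy's terminology, and counting consecutive results (rather than a sliding window) is one possible design.

```python
class HealthTracker:
    """Track backend health with separate thresholds for going down and coming up."""

    def __init__(self, fall: int = 3, rise: int = 2, healthy: bool = True):
        self.fall = fall        # consecutive failures needed to mark down
        self.rise = rise        # consecutive successes needed to mark up
        self.healthy = healthy
        self._failures = 0
        self._successes = 0

    def record(self, success: bool) -> bool:
        """Record one check result and return the resulting health state."""
        if success:
            self._successes += 1
            self._failures = 0
            if not self.healthy and self._successes >= self.rise:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self.healthy and self._failures >= self.fall:
                self.healthy = False
        return self.healthy
```

Because each counter resets when the opposite result arrives, a backend that alternates between passing and failing checks never crosses either threshold, which is the flapping protection the hysteresis is meant to provide.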

For complete health check implementation patterns, see `references/health-check-strategies.md`.

## Cloud Load Balancers

### AWS Load Balancing

**Application Load Balancer (ALB) - Layer 7:**
- Use for: HTTP/HTTPS applications, microservices, WebSocket
- Features: Path/host/header routing, AWS WAF integration, Lambda targets
- Choose when: Content-based routing needed

**Network Load Balancer (NLB) - Layer 4:**
- Use for: Ultra-low latency (<1ms), TCP/UDP, static IPs, millions of requests per second
- Features: Preserves source IP, TLS termination
- Choose when: Non-HTTP protocols, performance critical

**Global Accelerator - Layer 4 Global:**
- Use for: Multi-region applications, global users, DDoS protection
- Features: Anycast IPs, automatic regional failover

### GCP Load Balancing

- **Application LB (L7):** Global HTTPS LB, Cloud CDN integration, Cloud Armor (WAF/DDoS)
- **Network LB (L4):** Regional TCP/UDP, pass-through balancing, session affinity
- **Cloud Load Balancing:** Single anycast IP, global distribution, backend buckets

### Azure Load Balancing

- **Application Gateway (L7):** WAF integration, URL-based routing, SSL termination, autoscaling
- **Load Balancer (L4):** Basic and Standard SKUs, health probes, HA ports
- **Traffic Manager (Global):** DNS-based routing (priority, weighted, performance, geographic)

For complete cloud provider configurations and Terraform examples, see `references/cloud-load-balancers.md`.

## Self-Managed Load Balancers

### NGINX

**Best for:** General-purpose HTTP/HTTPS load balancing, web application stacks

**Capabilities:**
- HTTP reverse proxy with multiple algorithms
- TCP/UDP stream load balancing
- SSL/TLS termination
- Passive health checks (open source), active health checks (NGINX Plus)
- Cookie-based sticky sessions (NGINX Plus)

**Basic configuration:**
```nginx
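upstream backend {
    least_conn;                                 # prefer the server with the fewest active connections
    server backend1.example.com:8080 weight=3;  # weights bias the least_conn choice
    server backend2.example.com:8080 weight=2;
    keepalive 32;                               # idle upstream connections cached per worker
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        # Upstream keepalive requires HTTP/1.1 and a cleared Connection header
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}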
```

For complete NGINX patterns and advanced configurations, see `references/nginx-patterns.md`.

### HAProxy

**Best for:** Maximum performance, database load balancing, resource efficiency

**Capabilities:**
- Very high raw throughput with a small memory footprint
- 10+ load balancing algorithms
- Sophisticated health checks (HTTP, TCP, Redis, MySQL, etc.)
- Cookie or IP-based persistence

**Basic configuration:**
```haproxy
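defaults
    mode http                    # HAProxy falls back to TCP mode if this is unset
    timeout connect 5s           # example timeout values; tune for your workload
    timeout client  30s
    timeout server  30s

frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health   # active HTTP health check against each server
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check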
```

For complete HAProxy patterns, see `references/haproxy-patterns.md`.

### Envoy

**Best for:** Microservices, Kubernetes, service mesh integration

**Capabilities:**
- Cloud-native design with dynamic configuration (xDS APIs)
- Circuit breakers, retries, timeouts
- Advanced health checks (TCP, HTTP, gRPC)
- Built-in observability (detailed statistics, access logging, distributed tracing)

For complete Envoy patterns, see `references/envoy-patterns.md`.

### Traefik

**Best for:** Docker/Kubernetes environments, dynamic configuration, ease of use

**Capabilities:**
- Automatic service discovery
- Native Kubernetes integration
- Built-in Let's Encrypt support
- Middleware system (auth, rate