ClaudeWave
Skill · 66 repo stars · updated 29d ago

devops-deploy

Designs and executes CI/CD pipelines, GitOps workflows, deployment automation, and cloud infrastructure deployment including Docker, AWS Lambda, SAM, Terraform, and GitHub Actions. Use when building or improving CI/CD pipelines, containerizing applications, creating deployment runbooks, or deploying to cloud infrastructure.

Install in Claude Code
git clone --depth 1 https://github.com/tranhieutt/software_development_department /tmp/devops-deploy && cp -r /tmp/devops-deploy/.claude/skills/devops-deploy ~/.claude/skills/devops-deploy
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# DevOps Deploy

## Production checklist (always verify)

- [ ] Secrets and env vars sourced from Secrets Manager, never hardcoded
- [ ] Health check endpoint responding
- [ ] Structured JSON logs with `request_id`
- [ ] Rate limiting configured
- [ ] CORS restricted to authorized domains
- [ ] Lambda timeout appropriate (10–30s)
- [ ] CloudWatch alarms for errors and latency
- [ ] Rollback plan documented
- [ ] Load test before launch
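
The "structured JSON logs with `request_id`" item can be met with the standard library alone. The following is a minimal sketch, not a prescribed implementation: `JsonFormatter` and `build_logger` are hypothetical names, and it assumes the request ID is passed per call through `extra=`.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # request_id arrives via `extra=`; None when the caller omitted it.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload)


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    # Assumption: in a real service the ID would come from an incoming header.
    request_id = str(uuid.uuid4())
    log.info("order created", extra={"request_id": request_id})
```

Emitting one JSON object per line keeps the logs queryable by field in most log aggregators.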

## Docker: multi-stage Python

```dockerfile
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
ENV PYTHONUNBUFFERED=1
EXPOSE 8000
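# python:3.11-slim ships without curl, so probe with the standard library instead
HEALTHCHECK --interval=30s --timeout=3s CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=2)" || exit 1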
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

## Docker Compose (local dev)

```yaml
services:
  app:
    build: .
    ports: ["8000:8000"]
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - .:/app
    depends_on: [db, redis]
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: app
      POSTGRES_USER: app
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
volumes:
  pgdata:
```

## SAM template (Lambda + DynamoDB)

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Timeout: 30
    Runtime: python3.11
    Environment:
      Variables:
        DYNAMODB_TABLE: !Ref AppTable

Resources:
  AppFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: lambda_function.handler
      MemorySize: 512
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref AppTable

  AppTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: userId
          AttributeType: S
      KeySchema:
        - AttributeName: userId
          KeyType: HASH
      TimeToLiveSpecification:
        AttributeName: ttl
        Enabled: true
```

```bash
# SAM commands
sam build
sam deploy --guided          # first time (creates samconfig.toml)
sam deploy                   # subsequent
sam deploy --no-confirm-changeset --no-fail-on-empty-changeset   # CI: non-interactive
sam logs -n AppFunction --stack-name <stack-name> --tail         # stack name chosen during --guided
```

## GitHub Actions: test + security + deploy

```yaml
name: Deploy
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.11" }
      - run: pip install -r requirements.txt
      - run: pytest tests/ -v --cov=src --cov-report=xml

  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install bandit safety
      - run: bandit -r src/ -ll
      - run: safety check -r requirements.txt

  deploy:
    needs: [test, security]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/setup-sam@v2
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - run: sam build
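      # --no-fail-on-empty-changeset keeps the job green when nothing changed
      - run: sam deploy --no-confirm-changeset --no-fail-on-empty-changeset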
```

## Health check endpoint (FastAPI)

```python
import os
import time
from fastapi import FastAPI

app = FastAPI()
START_TIME = time.time()

@app.get("/health")
async def health():
    return {
        "status": "healthy",
        "uptime_seconds": time.time() - START_TIME,
        "version": os.environ.get("APP_VERSION", "unknown"),
    }
```
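
A deploy script can poll this endpoint before declaring a release good. The sketch below is one possible approach using only the standard library. `fetch_health` and `wait_until_healthy` are hypothetical helpers, and the retry count and delay are illustrative defaults, not recommendations from the source.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(url: str, timeout: float = 3.0) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def wait_until_healthy(
    url: str,
    attempts: int = 5,
    delay: float = 2.0,
    fetch: Callable[[str], dict] = fetch_health,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the endpoint reports status 'healthy', else False."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # OSError covers connection and HTTP errors; ValueError covers bad JSON.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    ok = wait_until_healthy("http://localhost:8000/health")
    raise SystemExit(0 if ok else 1)
```

The `fetch` and `sleep` parameters are injectable so the retry logic can be exercised without a running server.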

## Pipeline Design

### Standard Pipeline Stages

```
[Build] -> [Test] -> [Security Scan] -> [Package] -> [Deploy Staging] -> [Integration Test] -> [Approval] -> [Deploy Prod] -> [Verify]
```

| Stage | Actions | Failure Policy |
|-------|---------|----------------|
| Build | Compile, lint, type-check | Block |
| Test | Unit + integration tests | Block |
| Security | SAST, dependency scan, container scan | Block on Critical/High |
| Package | Docker build, push to registry, sign image | Block |
| Deploy Staging | Apply manifests/Helm, run smoke tests | Block |
| Approval | Manual gate for production | Require approval |
| Deploy Prod | Progressive rollout | Auto-rollback on failure |
| Verify | Health checks, metrics validation | Auto-rollback |
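
The "Block on Critical/High" policy for the Security stage can be expressed as a small gate function. This is a sketch under assumptions: the `Finding` shape and the severity labels are hypothetical, and real scanners report findings in their own formats that would need mapping first.

```python
from dataclasses import dataclass

# Severities that stop the pipeline, per the failure-policy table.
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


@dataclass(frozen=True)
class Finding:
    """Hypothetical normalized scanner finding."""
    id: str
    severity: str


def blocking_findings(findings: list[Finding]) -> list[Finding]:
    """Return only the findings severe enough to block the pipeline."""
    return [f for f in findings if f.severity.upper() in BLOCKING_SEVERITIES]


def should_block(findings: list[Finding]) -> bool:
    """True when at least one Critical or High finding is present."""
    return bool(blocking_findings(findings))


if __name__ == "__main__":
    report = [Finding("DEP-1", "medium"), Finding("DEP-2", "high")]
    for finding in blocking_findings(report):
        print(f"blocking: {finding.id} ({finding.severity})")
    raise SystemExit(1 if should_block(report) else 0)
```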

### Deployment Strategy Selection

| Strategy | Zero-downtime | Rollback Speed | Resource Cost | Use When |
|----------|---------------|----------------|---------------|----------|
| Rolling Update | Yes | Slow (redeploy) | Low | Default for most services |
| Blue/Green | Yes | Instant (switch) | 2x | Critical services, DB-independent |
| Canary | Yes | Fast (shift) | 1.1x | High-traffic, need real-user validation |
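
To illustrate why canary rollback is "Fast (shift)", the sketch below models a rollout as a sequence of traffic weights with a health check after each step. `run_canary`, the step percentages, and both callbacks are hypothetical; a real rollout would call the load balancer or service mesh API and read metrics instead.

```python
from typing import Callable, Sequence


def run_canary(
    set_weight: Callable[[int], None],
    is_healthy: Callable[[], bool],
    steps: Sequence[int] = (10, 25, 50, 100),
) -> bool:
    """Shift traffic to the new version step by step.

    Returns True when the rollout reaches the final step, or False after
    rolling back because a health check failed.
    """
    for weight in steps:
        set_weight(weight)
        if not is_healthy():
            # Rollback is a single weight change: all traffic returns to stable.
            set_weight(0)
            return False
    return True


if __name__ == "__main__":
    history: list[int] = []
    promoted = run_canary(history.append, lambda: True)
    print(promoted, history)
```

Because rollback only resets the weight, no redeploy is needed, in contrast to the rolling-update row.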

### GitOps Repository Structure

```
app-repo/           # Application source code + Dockerfile
env-repo/           # Environment configs
  base/             # Base manifests
  overlays/
    dev/
    staging/
    prod/
```

Tools: ArgoCD or Flux v2 · Kustomize or Helm · External Secrets Operator

### Security Scanning in Pipeline

- SAST: CodeQL, Semgrep, SonarQube
- Dependency: Snyk, Dependabot, npm audit
- Container: Trivy, Grype
- Secrets: GitLeaks, TruffleHog
- SBOM: Syft · Image signing: Cosign

### DORA Metrics to Track

- Deployment frequency
- Lead time for changes
- Change failure rate
- Mean time to recovery (MTTR)
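
The four metrics can be derived from a simple log of deployments. The sketch below assumes a hypothetical `Deployment` record with commit, deploy, and restore timestamps; real data would come from the CI system and the incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed deploy is recovered


def _hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deployments if d.failed]
    recoveries = [
        _hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / window_days,
        "lead_time_hours": mean(
            [_hours(d.deployed_at - d.committed_at) for d in deployments]
        ),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": mean(recoveries) if recoveries else None,
    }


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0)
    log = [
        Deployment(t0, t0 + timedelta(hours=2)),
        Deployment(t0, t0 + timedelta(hours=4), failed=True,
                   restored_at=t0 + timedelta(hours=5)),
    ]
    print(dora_metrics(log, window_days=2))
```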

---

## Deployment Runbook Principles

### Platform Selection

```
What are you deploying?
├── Static site → Vercel, Netlify, Cloudflare Pages
├── Simple web app → Railway, Render, Fly.io / VPS + PM2
├── Microservices → Container orchestration
└── Serverless → Edge functions, Lambda
```

| Platform | Deployment Method | Rol