# ClaudeWave Editor Note
This AWS ECS skill provides container orchestration capabilities for deploying and managing Docker containers on AWS through Fargate (serverless) or EC2 instances. Use it when creating ECS clusters, registering task definitions, launching services with desired task counts, configuring container networking and IAM roles, or diagnosing container deployment and scaling issues.
git clone --depth 1 https://github.com/itsmostafa/aws-agent-skills /tmp/ecs && cp -r /tmp/ecs/skills/ecs ~/.claude/skills/ecs
# AWS ECS
Amazon Elastic Container Service (ECS) is a fully managed container orchestration service. Run containers on AWS Fargate (serverless) or EC2 instances.
## Table of Contents
- [Core Concepts](#core-concepts)
- [Common Patterns](#common-patterns)
- [CLI Reference](#cli-reference)
- [Best Practices](#best-practices)
- [Troubleshooting](#troubleshooting)
- [References](#references)
## Core Concepts
### Cluster
Logical grouping of tasks or services. Can contain Fargate tasks, EC2 instances, or both.
### Task Definition
Blueprint for your application. Defines containers, resources, networking, and IAM roles.
### Task
Running instance of a task definition. Can run standalone or as part of a service.
### Service
Maintains desired count of tasks. Handles deployments, load balancing, and auto scaling.
### Launch Types
| Type | Description | Use Case |
|------|-------------|----------|
| **Fargate** | Serverless, pay per task | Most workloads |
| **EC2** | Self-managed instances | GPU, Windows, specific requirements |
## Common Patterns
### Create a Fargate Cluster
**AWS CLI:**
```bash
# Create cluster
aws ecs create-cluster --cluster-name my-cluster
# With capacity providers
aws ecs create-cluster \
  --cluster-name my-cluster \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy \
    capacityProvider=FARGATE,weight=1 \
    capacityProvider=FARGATE_SPOT,weight=1
```
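To see roughly how the weights in a strategy like the one above divide tasks, the arithmetic can be sketched in plain shell. ECS places `base` tasks on the designated provider first and spreads the rest in proportion to `weight`. The function name `split_by_weight` is hypothetical, and its rounding (integer division, with the leftover going to the first provider) is an assumption for illustration; the rounding ECS actually applies may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate how many tasks land on each of two
# capacity providers. Arguments: desired count, weight of provider A,
# weight of provider B, optional base for provider A (default 0).
# Assumes at least one weight is non-zero.
split_by_weight() {
  local desired="$1" weight_a="$2" weight_b="$3" base_a="${4:-0}"
  # Tasks covered by the base go to provider A before weights apply.
  local from_base=$(( desired < base_a ? desired : base_a ))
  local remaining=$(( desired - from_base ))
  # Split the remainder by weight; integer division rounds B down.
  local share_b=$(( remaining * weight_b / (weight_a + weight_b) ))
  local share_a=$(( from_base + remaining - share_b ))
  echo "$share_a $share_b"
}

# Equal weights, as in the default strategy above, 10 tasks:
split_by_weight 10 1 1      # prints "5 5"
# FARGATE (weight 1, base 1) vs FARGATE_SPOT (weight 4), 11 tasks:
split_by_weight 11 1 4 1    # prints "3 8"
```

The output is `<tasks on A> <tasks on B>`, which makes it easy to sanity-check a strategy before applying it to a cluster or service.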
### Register Task Definition
```bash
cat > task-definition.json << 'EOF'
{
  "family": "web-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "NODE_ENV", "value": "production"}
      ],
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/web-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs",
          "mode": "non-blocking",
          "max-buffer-size": "25m"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
EOF
aws ecs register-task-definition --cli-input-json file://task-definition.json
```
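Fargate only accepts certain `cpu`/`memory` pairs at the task level, so a quick local check before registering can save a failed API call. The sketch below is a minimal validator for sizes up to 4 vCPU (CPU units and MiB, as in the JSON above). The function name `is_valid_fargate_size` is hypothetical, and the table is hard-coded from the published Fargate task sizes, so treat it as an assumption to re-check against current AWS documentation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the cpu/memory pair is one
# that Fargate accepts. Values are CPU units and MiB, as in the JSON.
# Covers sizes up to 4 vCPU only; larger sizes are rejected here.
is_valid_fargate_size() {
  local cpu="$1" mem="$2" min max
  case "$cpu" in
    256)  min=512;  max=2048 ;;
    512)  min=1024; max=4096 ;;
    1024) min=2048; max=8192 ;;
    2048) min=4096; max=16384 ;;
    4096) min=8192; max=30720 ;;
    *)    return 1 ;;
  esac
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] || return 1
  # 512 MiB is only valid with 256 CPU units; otherwise whole GiB steps.
  [ "$mem" -eq 512 ] || [ $(( mem % 1024 )) -eq 0 ]
}

# The pair used in the task definition above:
if is_valid_fargate_size 256 512; then
  echo "256/512 is a valid Fargate size"
fi
```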
### Create Service with Load Balancer
```bash
aws ecs create-service \
  --cluster my-cluster \
  --service-name web-service \
  --task-definition web-app:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-12345678,subnet-87654321],securityGroups=[sg-12345678],assignPublicIp=DISABLED}" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-tg/1234567890123456,containerName=web,containerPort=8080" \
  --health-check-grace-period-seconds 60 \
  --deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}"
```
### Run Standalone Task
```bash
aws ecs run-task \
  --cluster my-cluster \
  --task-definition my-batch-job:1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}"
```
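For batch jobs it is often useful to block until the standalone task finishes and then read its exit code. The sketch below wraps the `run-task` call above with `aws ecs wait tasks-stopped` and `aws ecs describe-tasks`. The function name `run_task_and_get_exit_code` is hypothetical, and it assumes the task has a single container and that `run-task` succeeds in placing it (otherwise the task ARN comes back as `None`).

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a one-off Fargate task, wait for it to stop,
# and print the exit code of its first container.
# Arguments: cluster, task definition, network configuration (shorthand).
run_task_and_get_exit_code() {
  local cluster="$1" task_def="$2" network_config="$3" task_arn
  # Start the task and capture only its ARN.
  task_arn=$(aws ecs run-task \
    --cluster "$cluster" \
    --task-definition "$task_def" \
    --launch-type FARGATE \
    --network-configuration "$network_config" \
    --query 'tasks[0].taskArn' \
    --output text) || return 1
  # Poll until the task reaches STOPPED (the waiter eventually times out).
  aws ecs wait tasks-stopped --cluster "$cluster" --tasks "$task_arn" || return 1
  # Assumption: the first container's exit code is the one of interest.
  aws ecs describe-tasks \
    --cluster "$cluster" \
    --tasks "$task_arn" \
    --query 'tasks[0].containers[0].exitCode' \
    --output text
}

# Example call (not executed here):
# run_task_and_get_exit_code my-cluster my-batch-job:1 \
#   "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}"
```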
### Update Service (Deploy New Image)
```bash
# Register new task definition with updated image
aws ecs register-task-definition --cli-input-json file://task-definition.json

# Update service to use new version
aws ecs update-service \
  --cluster my-cluster \
  --service web-service \
  --task-definition web-app:2 \
  --force-new-deployment
```
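Hard-coding the revision number (`web-app:2`) is easy to get wrong in scripts. As a minimal sketch, the function below looks up the latest active revision of a family, points the service at it, and blocks until the deployment settles using `aws ecs wait services-stable`. The name `deploy_latest` is hypothetical; it assumes the new revision has already been registered.

```bash
#!/usr/bin/env bash
# Hypothetical helper: deploy the latest ACTIVE revision of a task
# definition family to a service and wait for the rollout to stabilize.
# Arguments: cluster, service, task definition family.
deploy_latest() {
  local cluster="$1" service="$2" family="$3" task_def_arn
  # Passing only the family name returns its latest ACTIVE revision.
  task_def_arn=$(aws ecs describe-task-definition \
    --task-definition "$family" \
    --query 'taskDefinition.taskDefinitionArn' \
    --output text) || return 1
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --task-definition "$task_def_arn" > /dev/null || return 1
  # Returns non-zero if the service does not stabilize before the waiter gives up.
  aws ecs wait services-stable --cluster "$cluster" --services "$service"
}

# Example call (not executed here):
# deploy_latest my-cluster web-service web-app
```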
### Fargate Spot with SQS-Based Scaling
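Use `FARGATE_SPOT` for interruptible batch and queue workloads to cut compute costs by up to roughly 70% compared with regular Fargate. Always include regular `FARGATE` in the strategy so that a baseline of tasks runs on capacity that is not subject to Spot interruption.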
```bash
# Create service with Spot + fallback
aws ecs create-service \
  --cluster batch-cluster \
  --service-name queue-processor \
  --task-definition my-processor:1 \
  --desired-count 0 \
  --capacity-provider-strategy \
    capacityProvider=FARGATE_SPOT,weight=4,base=0 \
    capacityProvider=FARGATE,weight=1,base=1 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=DISABLED}"

# Register scalable target (scale to zero when queue empty)
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/batch-cluster/queue-processor \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 0 \
  --max-capacity 20

# Scale-out alarm: messages > 100
# Replace the quoted placeholder with the ARN of your scale-out scaling policy
aws cloudwatch put-metric-alarm \
  --alarm-name queue-scale-out \
  --metric-name ApproximateNumberOfMessagesVisible \
  --namespace AWS/SQS \
  --dimensions Name=QueueName,Value=my-queue \
  --statistic Average \
  --period 60 \
  --evaluation-periods 1 \
  --threshold 100 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions "<scale-out-policy-arn>"

# Scale-in alarm: queue empty for 3 periods (conservative to avoid flapping)
# Replace the quoted placeholder with the ARN of your scale-in scaling policy
aws cloudwatch put-metric-alarm \
  --alarm-name queue-scale-in \
  --metric-name ApproximateNumberOfMessagesVisible \
  --namespace AWS/SQS \
  --dimensions Name=QueueName,Value=my-queue \
  --statistic Average \
  --period 60 \
  --evaluation-periods 3 \
  --threshold 0 \
  --comparison-operator LessThanOrEqualToThreshold \
  --alarm-actions "<scale-in-policy-arn>"
```
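The alarms above take scaling policy ARNs as placeholders. One way to obtain them is to create step scaling policies with `aws application-autoscaling put-scaling-policy`, which returns a `PolicyARN`. The sketch below is a minimal example under stated assumptions: the helper names `step_policy_config` and `create_step_policy`, the 60-second cooldown, and the adjustment sizes (add 2 tasks on scale-out, set capacity to 0 on scale-in) are illustrative choices, not values from the original skill.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a single-step StepScaling configuration.
# Arguments: adjustment type, which interval bound to set to 0
# (MetricIntervalLowerBound for scale-out, MetricIntervalUpperBound
# for scale-in), and the scaling adjustment.
step_policy_config() {
  local adjustment_type="$1" bound_key="$2" adjustment="$3"
  printf '{"AdjustmentType":"%s","Cooldown":60,"MetricAggregationType":"Average","StepAdjustments":[{"%s":0,"ScalingAdjustment":%s}]}' \
    "$adjustment_type" "$bound_key" "$adjustment"
}

# Hypothetical helper: create the policy on the queue-processor service
# and print its ARN. Arguments: policy name, configuration JSON.
create_step_policy() {
  local name="$1" config="$2"
  aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --resource-id service/batch-cluster/queue-processor \
    --scalable-dimension ecs:service:DesiredCount \
    --policy-name "$name" \
    --policy-type StepScaling \
    --step-scaling-policy-configuration "$config" \
    --query PolicyARN \
    --output text
}

# Example calls (not executed here):
# SCALE_OUT_ARN=$(create_step_policy queue-scale-out \
#   "$(step_policy_config ChangeInCapacity MetricIntervalLowerBound 2)")
# SCALE_IN_ARN=$(create_step_policy queue-scale-in \
#   "$(step_policy_config ExactCapacity MetricIntervalUpperBound 0)")
```

The resulting ARNs can then be passed to `--alarm-actions` in the two `put-metric-alarm` commands.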
**Fargate Spot interruption ha