ClaudeWave
Skill, 66 repo stars, updated 29d ago

design-system

Decomposes a product concept into architectural components, domain systems, data models, and integration boundaries. Use when starting system architecture or when the user mentions system design or component breakdown.

Install in Claude Code
git clone --depth 1 https://github.com/tranhieutt/software_development_department /tmp/design-system && cp -r /tmp/design-system/.claude/skills/design-system ~/.claude/skills/design-system
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# System Design

## Phase 1: Clarify requirements (always do this first)

Ask before designing:
1. **Scale**: How many users/requests/day? Read-heavy or write-heavy?
2. **Consistency**: Strong (banking) or eventual (social feed)?
3. **Availability target**: 99.9% (~8.8h/yr downtime) or 99.99% (~53min/yr)?
4. **Latency budget**: p99 < 100ms? < 1s?
5. **Geography**: Single region or multi-region?
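
The downtime figures attached to an availability target follow from simple arithmetic. A minimal sketch, assuming a 365-day year and a hypothetical helper name `downtime_hours_per_year`:

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumption: leap years ignored


def downtime_hours_per_year(availability_pct: float) -> float:
    """Return the yearly downtime budget (in hours) implied by an availability target."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


for target in (99.9, 99.99):
    hours = downtime_hours_per_year(target)
    print(f"{target}% -> {hours:.2f} h/yr ({hours * 60:.1f} min/yr)")
```

Each extra "nine" cuts the downtime budget by a factor of ten, which is why it is worth pinning the target down before choosing a redundancy strategy.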

## Capacity estimation shortcuts

```
1M requests/day (e.g. 1M daily active users × 1 req each) → ~12 req/s avg, ~120 req/s peak (10x)
1KB per request → 1M req/day = ~1GB/day = ~365GB/year
Read:write ratio 10:1 (typical social) → optimize read path first
1 server handles ~1000 req/s (rule of thumb for I/O-bound services)
```
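
The shortcuts above can be reproduced with a few lines of arithmetic. This is a rough sketch, not a sizing tool: the function name `estimate`, the decimal units (1KB = 1000 bytes, 1GB = 1e9 bytes), and the default 10x peak factor and 1000 req/s per server are assumptions taken from the rules of thumb.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate(requests_per_day: int, bytes_per_request: int,
             peak_factor: int = 10, per_server_rps: int = 1000) -> dict:
    """Back-of-envelope traffic, storage, and server-count estimate."""
    avg_rps = requests_per_day / SECONDS_PER_DAY
    peak_rps = avg_rps * peak_factor
    gb_per_day = requests_per_day * bytes_per_request / 1e9
    return {
        "avg_rps": avg_rps,
        "peak_rps": peak_rps,
        "storage_gb_per_day": gb_per_day,
        "storage_gb_per_year": gb_per_day * 365,
        # At least one server, even when peak load is far below capacity.
        "servers_at_peak": max(1, math.ceil(peak_rps / per_server_rps)),
    }


result = estimate(requests_per_day=1_000_000, bytes_per_request=1000)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Multiply `requests_per_day` by the expected requests per user when starting from an active-user count rather than a request count.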

## Component breakdown template

```
Client layer  → Web / Mobile / API consumers
CDN           → Static assets, edge caching
API Gateway   → Rate limiting, auth, routing, SSL termination
Services      → Domain-specific services (User, Order, Payment, Notification)
Cache         → Redis for hot data (sessions, rate limits, computed results)
Database      → Primary DB + Read replicas
Message queue → Async operations, event-driven decoupling
Storage       → Object storage for files (S3/GCS)
Monitoring    → Metrics, logs, traces, alerts
```

## Database selection guide

| Need | Choose |
|---|---|
| ACID transactions, relations | PostgreSQL |
| High-scale document store | MongoDB |
| Key-value, cache, pub/sub | Redis |
| Time-series data | TimescaleDB / InfluxDB |
| Graph relationships | Neo4j |
| Full-text search | Elasticsearch |
| Analytical/OLAP | ClickHouse / BigQuery |

## Caching strategies

```
Cache-aside (read):  App checks cache → miss → DB → write to cache
Write-through:        Write to cache and DB in the same synchronous operation (consistent, slower writes)
Write-behind:         Write to cache → async flush to DB (fast writes, risk of loss)
Read-through:         Cache handles DB reads automatically

TTL guidelines:
- Sessions: 15-30 min
- User profile: 5 min
- Product catalog: 1 hour
- Config/settings: 24 hours
```
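
To make the cache-aside flow concrete, here is a minimal in-memory sketch with per-entry TTL. The class name `CacheAside`, the `fake_db` loader, and the injectable `clock` are hypothetical; a real deployment would more likely use Redis with key expiry than a Python dict.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Cache-aside read path: check cache, on miss load from DB, then populate cache."""

    def __init__(self, load_from_db: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        value = self._load(key)  # miss or expired: fall through to the DB
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying record is written, so the next read reloads it."""
        self._entries.pop(key, None)


db_reads = []


def fake_db(key: str) -> dict:
    # Stand-in for a real query; records each call so hits and misses are visible.
    db_reads.append(key)
    return {"id": key}


cache = CacheAside(fake_db, ttl_seconds=300)  # 5 min, as in the user-profile guideline
cache.get("user:1")
cache.get("user:1")
print(f"DB reads: {len(db_reads)}")  # second call is served from the cache
```

Calling `invalidate` on writes keeps stale reads bounded by something shorter than the TTL, at the cost of one extra DB read after each update.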

## Message queue patterns

```
When to use queues:
✓ Async processing (email, PDF generation, notifications)
✓ Rate-limiting downstream services
✓ Decoupling services (order → payment → shipping)
✓ Fan-out (1 event → multiple consumers)

Queue selection:
- RabbitMQ: complex routing, request-reply, low latency
- Kafka: high throughput, event log/replay, stream processing
- SQS: managed, simple, AWS-native, at-least-once delivery (standard queues; FIFO queues add ordering and deduplication)
- Redis Streams: lightweight, same infra as cache
```
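
The fan-out pattern (1 event → multiple consumers) can be illustrated in-process with the standard library. This is only a sketch of the idea: `FanOutBroker`, the topic name, and the event payload are hypothetical, and a production system would use one of the brokers listed above rather than `queue.Queue`.

```python
import queue
from collections import defaultdict
from typing import Any, DefaultDict, List


class FanOutBroker:
    """Delivers every published event to each subscriber's own queue."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[queue.Queue]] = defaultdict(list)

    def subscribe(self, topic: str) -> queue.Queue:
        """Register a new consumer and return the queue it should read from."""
        q: queue.Queue = queue.Queue()
        self._subscribers[topic].append(q)
        return q

    def publish(self, topic: str, event: Any) -> int:
        """Copy the event to all subscriber queues; return how many received it."""
        subscribers = self._subscribers.get(topic, [])
        for q in subscribers:
            q.put(event)
        return len(subscribers)


broker = FanOutBroker()
email_q = broker.subscribe("order.created")
analytics_q = broker.subscribe("order.created")

delivered = broker.publish("order.created", {"order_id": 42})
print(f"delivered to {delivered} consumers")
print(email_q.get_nowait(), analytics_q.get_nowait())
```

Because each consumer drains its own queue, a slow consumer does not block the publisher or the other consumers, which is the decoupling property the list above refers to.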

## API design decisions

```
REST:    Standard CRUD, simple clients, team familiarity (default choice)
GraphQL: Multiple clients with different data needs, reduce over-fetching
gRPC:    Internal service-to-service, binary protocol, streaming needed
WebSocket: Real-time bidirectional (chat, live updates, collaborative tools)
```

## Scaling patterns

```
Vertical (scale up):   More CPU/RAM — quick, limited ceiling
Horizontal (scale out): More instances — requires stateless services
Database read replicas: Offload read traffic (good for 80%+ read workloads)
Database sharding:      Shard by user_id, geography — last resort, complex
CQRS:                   Separate read/write models — when read/write patterns diverge heavily
```
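
Sharding by `user_id` needs a mapping that is stable across processes and restarts. A minimal sketch, assuming simple hash-modulo placement; the function name `shard_for` and the shard count of 8 are illustrative, and Python's built-in `hash()` is avoided because it is salted per process for strings.

```python
import hashlib
from collections import Counter


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards) using a stable hash."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Show how 10,000 hypothetical user IDs spread across 8 shards.
distribution = Counter(shard_for(f"user-{i}", 8) for i in range(10_000))
for shard, count in sorted(distribution.items()):
    print(f"shard {shard}: {count} users")
```

Hash-modulo placement remaps most keys when `num_shards` changes, which is part of why sharding is listed as a last resort; consistent hashing or a lookup directory reduces that movement at the cost of extra complexity.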

## Common design mistakes

| Mistake | Better approach |
|---|---|
| Over-engineering for scale you don't have | Start monolith, extract services at clear pain points |
| Synchronous calls to all dependencies | Use async queues for non-critical paths |
| No caching strategy | Cache at API layer + DB query results |
| Storing sessions in DB | Use Redis or another shared in-memory store; per-request session lookups add avoidable load to the primary DB |
| Single point of failure | Redundancy at every critical layer |

accessibility-specialist (Subagent)

The Accessibility Specialist ensures the software is accessible to the widest possible audience. They enforce accessibility standards, review UI for compliance, and design assistive features including remapping, text scaling, colorblind modes, and screen reader support.

ai-programmer (Subagent)

The AI Programmer implements intelligent system features: recommendation engines, classification pipelines, LLM integrations, decision logic, and autonomous agent behavior. Use this agent for AI/ML feature implementation, model integration, intelligent automation, or AI system debugging.

analytics-engineer (Subagent)

The Analytics Engineer designs telemetry systems, user behavior tracking, A/B test frameworks, and data analysis pipelines. Use this agent for event tracking design, dashboard specification, A/B test design, or user behavior analysis methodology.

backend-developer (Subagent)

The Backend Developer builds and maintains server-side logic, APIs, databases, authentication, and integrations. Use this agent for REST/GraphQL API implementation, database operations, authentication systems, background jobs, microservices, server performance, and backend testing. Works from API design contracts and PRDs.

community-manager (Subagent)

The Community Manager handles user-facing communications, feedback synthesis, support escalation, and community engagement. Use this agent for drafting release announcements, synthesizing user feedback into actionable insights, writing support documentation, or coordinating community-facing communication around releases and incidents.

cto (Subagent)

The CTO (Chief Technology Officer) owns the high-level technical vision, architecture decisions, technology choices, and technical strategy. Use this agent for architecture-level decisions, technology evaluations, cross-system conflicts, and when a technical choice will constrain or enable product possibilities. This is the highest technical authority in the department.

data-engineer (Subagent)

The Data Engineer designs database schemas, builds data pipelines, manages migrations, and owns the data infrastructure. Use this agent for schema design, complex migrations, data modeling, ETL/ELT pipelines, database performance optimization, analytics infrastructure, and data integrity strategies.

devops-engineer (Subagent)

The DevOps Engineer maintains build pipelines, CI/CD configuration, version control workflow, and deployment infrastructure. Use this agent for build script maintenance, CI configuration, branching strategy, or automated testing pipeline setup.