Skip to main content
ClaudeWave
Skill374 estrellas del repoactualizado 6mo ago

using-graph-databases

This Claude Code skill provides guidance on selecting and implementing graph databases for relationship-heavy applications. It covers Neo4j, ArangoDB, Amazon Neptune, and Cypher query patterns, helping developers determine when graph databases are appropriate versus relational or columnar alternatives, and offering practical implementation patterns for social networks, recommendation engines, knowledge graphs, and fraud detection systems.

Instalar en Claude Code
Copiar
git clone --depth 1 https://github.com/ancoleman/ai-design-components /tmp/using-graph-databases && cp -r /tmp/using-graph-databases/skills/using-graph-databases ~/.claude/skills/using-graph-databases
Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

SKILL.md

# Graph Databases

## Purpose

This skill guides selection and implementation of graph databases for applications where relationships between entities are first-class citizens. Unlike relational databases that model relationships through foreign keys and joins, graph databases natively represent connections as properties, enabling efficient traversal-heavy queries.

## When to Use This Skill

Use graph databases when:
- **Deep relationship traversals** (4+ hops): "Friends of friends of friends"
- **Variable/evolving relationships**: Schema changes don't break existing queries
- **Path finding**: Shortest route, network analysis, dependency chains
- **Pattern matching**: Fraud detection, recommendation engines, access control

**Do NOT use graph databases when**:
- Fixed schema with shallow joins (2-3 tables) → Use PostgreSQL
- Primarily aggregations/analytics → Use columnar databases
- Key-value lookups only → Use Redis/DynamoDB

## Quick Decision Framework

```
DATA CHARACTERISTICS?
├── Fixed schema, shallow joins (≤3 hops)
│   └─ PostgreSQL (relational)
│
├── Already on PostgreSQL + simple graphs
│   └─ Apache AGE (PostgreSQL extension)
│
├── Deep traversals (4+ hops) + general purpose
│   └─ Neo4j (battle-tested, largest ecosystem)
│
├── Multi-model (documents + graph)
│   └─ ArangoDB
│
├── AWS-native, serverless
│   └─ Amazon Neptune
│
└── Real-time streaming, in-memory
    └─ Memgraph
```

## Core Concepts

### Property Graph Model

Graph databases store data as:
- **Nodes** (vertices): Entities with labels and properties
- **Relationships** (edges): Typed connections with properties
- **Properties**: Key-value pairs on nodes and relationships

```
(Person {name: "Alice", age: 28})-[:FRIEND {since: "2020-01-15"}]->(Person {name: "Bob"})
```

### Query Languages

| Language | Databases | Readability | Best For |
|----------|-----------|-------------|----------|
| **Cypher** | Neo4j, Memgraph, AGE | ⭐⭐⭐⭐⭐ SQL-like | General purpose |
| **Gremlin** | Neptune, JanusGraph | ⭐⭐⭐ Functional | Cross-database |
| **AQL** | ArangoDB | ⭐⭐⭐⭐ SQL-like | Multi-model |
| **SPARQL** | Neptune, RDF stores | ⭐⭐⭐ W3C standard | Semantic web |

## Common Cypher Patterns

Reference `references/cypher-patterns.md` for comprehensive examples.

### Pattern 1: Basic Matching
```cypher
// Find all users at a company
MATCH (u:User)-[:WORKS_AT]->(c:Company {name: 'Acme Corp'})
RETURN u.name, u.title
```

### Pattern 2: Variable-Length Paths
```cypher
// Find friends up to 3 degrees away
MATCH (u:User {name: 'Alice'})-[:FRIEND*1..3]->(friend)
WHERE u <> friend
RETURN DISTINCT friend.name
LIMIT 100
```

### Pattern 3: Shortest Path
```cypher
// Find shortest connection between two users
MATCH path = shortestPath(
  (a:User {name: 'Alice'})-[*]-(b:User {name: 'Bob'})
)
RETURN path, length(path) AS distance
```

### Pattern 4: Recommendations
```cypher
// Collaborative filtering: Products liked by similar users
MATCH (u:User {id: $userId})-[:PURCHASED]->(p:Product)<-[:PURCHASED]-(similar)
MATCH (similar)-[:PURCHASED]->(rec:Product)
WHERE NOT exists((u)-[:PURCHASED]->(rec))
RETURN rec.name, count(*) AS score
ORDER BY score DESC
LIMIT 10
```

### Pattern 5: Fraud Detection
```cypher
// Detect circular money flows
MATCH path = (a:Account)-[:SENT*3..6]->(a)
WHERE all(r IN relationships(path) WHERE r.amount > 1000)
RETURN path, [r IN relationships(path) | r.amount] AS amounts
```

## Database Selection Guide

### Neo4j (Primary Recommendation)

**Use for**: General-purpose graph applications

**Strengths**:
- Most mature (2007), largest community (2M+ developers)
- 65+ graph algorithms (GDS library): PageRank, Louvain, Dijkstra
- Best tooling: Neo4j Browser, Bloom visualization
- Comprehensive Cypher support

**Installation**:
```bash
# Python driver
pip install neo4j

# TypeScript driver
npm install neo4j-driver

# Rust driver
cargo add neo4rs
```

Reference: `references/neo4j.md`

### ArangoDB

**Use for**: Multi-model applications (documents + graph)

**Strengths**:
- Store documents AND graph in one database
- AQL combines document and graph queries
- Schema flexibility with relationships

Reference: `references/arangodb.md`

### Apache AGE

**Use for**: Adding graph capabilities to existing PostgreSQL

**Strengths**:
- Extend PostgreSQL with graph queries
- No new infrastructure needed
- Query both relational and graph data

Reference: Implementation details in examples/

### Amazon Neptune

**Use for**: AWS-native, serverless deployments

**Strengths**:
- Fully managed, auto-scaling
- Supports Gremlin AND SPARQL
- AWS ecosystem integration

## Graph Data Modeling Patterns

Reference `references/graph-modeling.md` for comprehensive patterns.

### Best Practice 1: Relationships as First-Class Citizens

**Anti-pattern** (storing relationships in node properties):
```cypher
// BAD
(:Person {name: 'Alice', friend_ids: ['b123', 'c456']})
```

**Pattern** (explicit relationships):
```cypher
// GOOD
(:Person {name: 'Alice'})-[:FRIEND]->(:Person {id: 'b123'})
(:Person {name: 'Alice'})-[:FRIEND]->(:Person {id: 'c456'})
```

### Best Practice 2: Relationship Properties for Metadata

```cypher
// Track interaction details on relationships
(:Person)-[:FRIEND {
  since: '2020-01-15',
  strength: 0.85,
  last_interaction: datetime()
}]->(:Person)
```

### Best Practice 3: Bounded Traversals for Performance

```cypher
// SLOW: Unbounded traversal
MATCH (a)-[:FRIEND*]->(distant)
RETURN distant

// FAST: Bounded depth with index
MATCH (a)-[:FRIEND*1..4]->(distant)
WHERE distant.active = true
RETURN distant
LIMIT 100
```

### Best Practice 4: Avoid Supernodes

**Problem**: Nodes with thousands of relationships slow traversals.

**Solution**: Intermediate aggregation nodes
```cypher
// Instead of: (:User)-[:POSTED]->(:Post) [1M relationships]

// Use time partitioning:
(:User)-[:POSTED_IN]->(:Year {year: 2025})
       -[:HAS_MONTH]->(:Month {month: 12})
       -[:HAS_POST]->(:Post)
```

## Use Case Examples

### Social Network

Sch
administering-linuxSkill

Manage Linux systems covering systemd services, process management, filesystems, networking, performance tuning, and troubleshooting. Use when deploying applications, optimizing server performance, diagnosing production issues, or managing users and security on Linux servers.

ai-data-engineeringSkill

Data pipelines, feature stores, and embedding generation for AI/ML systems. Use when building RAG pipelines, ML feature serving, or data transformations. Covers feature stores (Feast, Tecton), embedding pipelines, chunking strategies, orchestration (Dagster, Prefect, Airflow), dbt transformations, data versioning (LakeFS), and experiment tracking (MLflow, W&B).

architecting-dataSkill

Strategic guidance for designing modern data platforms, covering storage paradigms (data lake, warehouse, lakehouse), modeling approaches (dimensional, normalized, data vault, wide tables), data mesh principles, and medallion architecture patterns. Use when architecting data platforms, choosing between centralized vs decentralized patterns, selecting table formats (Iceberg, Delta Lake), or designing data governance frameworks.

architecting-networksSkill

Design cloud network architectures with VPC patterns, subnet strategies, zero trust principles, and hybrid connectivity. Use when planning VPC topology, implementing multi-cloud networking, or establishing secure network segmentation for cloud workloads.

architecting-securitySkill

Design comprehensive security architectures using defense-in-depth, zero trust principles, threat modeling (STRIDE, PASTA), and control frameworks (NIST CSF, CIS Controls, ISO 27001). Use when designing security for new systems, auditing existing architectures, or establishing security governance programs.

assembling-componentsSkill

Assembles component outputs from AI Design Components skills into unified, production-ready component systems with validated token integration, proper import chains, and framework-specific scaffolding. Use as the capstone skill after running theming, layout, dashboard, data-viz, or feedback skills to wire components into working React/Next.js, Python, or Rust projects.

building-ai-chatSkill

Builds AI chat interfaces and conversational UI with streaming responses, context management, and multi-modal support. Use when creating ChatGPT-style interfaces, AI assistants, code copilots, or conversational agents. Handles streaming text, token limits, regeneration, feedback loops, tool usage visualization, and AI-specific error patterns. Provides battle-tested components from leading AI products with accessibility and performance built in.

building-ci-pipelinesSkill

Constructs secure, efficient CI/CD pipelines with supply chain security (SLSA), monorepo optimization, caching strategies, and parallelization patterns for GitHub Actions, GitLab CI, and Argo Workflows. Use when setting up automated testing, building, or deployment workflows.