Skill · 98 repo stars · updated yesterday

neo4j-query-tuning-skill

This skill diagnoses and optimizes slow Neo4j Cypher queries by analyzing EXPLAIN and PROFILE execution plans to identify bottlenecks such as missing indexes, poor cardinality estimates, and inefficient operators. Use it when a query performs unexpectedly slowly and you have access to its execution plan output, need to choose between runtime strategies, or want to understand why estimated rows diverge significantly from actual results.

Source repository: neo4j-skills

Install in Claude Code

git clone --depth 1 https://github.com/neo4j-contrib/neo4j-skills /tmp/neo4j-query-tuning-skill && cp -r /tmp/neo4j-query-tuning-skill/neo4j-query-tuning-skill ~/.claude/skills/neo4j-query-tuning-skill

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

## When to Use
- Query takes unexpectedly long; need root-cause analysis
- EXPLAIN/PROFILE output in hand — needs interpretation
- Identifying which index is missing or unused
- Deciding between slotted / pipelined / parallel runtimes
- Monitoring live queries: SHOW QUERIES, SHOW TRANSACTIONS
- Cardinality estimates are wrong (statistics stale; replanning needed)

## When NOT to Use
- **Writing Cypher from scratch** → `neo4j-cypher-skill`
- **GDS algorithm performance** → `neo4j-gds-skill`
- **Schema design / data modelling** → `neo4j-modeling-skill`

---

## EXPLAIN vs PROFILE

| | EXPLAIN | PROFILE |
|---|---|---|
| Executes query? | No | Yes |
| Returns data? | No | Yes |
| Shows `rows` (actual) | No | Yes |
| Shows `dbHits` (actual) | No | Yes |
| Shows `estimatedRows` | Yes | Yes |
| Cost | Planning only (nothing is executed) | Full query cost |

Run `PROFILE` **twice** — first run warms page cache; second gives representative metrics.

```cypher
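// Plan only; the query is not executed:
EXPLAIN MATCH (p:Person {email: $email}) RETURN p.name;
// Executes the query and records actual rows and dbHits:
PROFILE MATCH (p:Person {email: $email}) RETURN p.name;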
```

Query API alternative (no driver):
```bash
curl -X POST https://<host>/db/<db>/query/v2 \
  -u <user>:<pass> -H "Content-Type: application/json" \
  -d '{"statement": "EXPLAIN MATCH (p:Person {email: $email}) RETURN p.name", "parameters": {"email": "a@b.com"}}'
```

---

## Key Plan Metrics

| Metric | Good | Investigate if |
|---|---|---|
| `dbHits` | Low; drops after index added | High relative to `rows` |
| `rows` | Shrinks early in plan | Large until final operator |
| `estimatedRows` | Close to `rows` | >10× divergence from actual |
| `pageCacheHitRatio` | >0.99 | <0.90 (disk I/O bottleneck) |
| `pageCacheHits` | High | — |
| `pageCacheMisses` | Near 0 | Rising (page cache too small) |

Read plans **bottom-up** — leaf operators at bottom initiate data retrieval.
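
The thresholds in the table above can be turned into a quick check. The following is a minimal Python sketch, not part of the skill itself; the function name `flag_metrics` and its parameters are hypothetical, and it assumes the hit ratio is computed as hits divided by hits plus misses.

```python
def flag_metrics(rows, estimated_rows, page_cache_hits, page_cache_misses):
    """Return warnings based on the 'Investigate if' thresholds in the table."""
    warnings = []
    # Cardinality: flag when estimate and actual differ by more than 10x either way.
    # Values are clamped to 1 so a zero count does not cause a division by zero.
    low, high = sorted((max(rows, 1), max(estimated_rows, 1)))
    if high / low > 10:
        warnings.append("estimatedRows diverges >10x from rows")
    # Page cache: assumed ratio = hits / (hits + misses); below 0.90 suggests disk I/O.
    total = page_cache_hits + page_cache_misses
    if total > 0 and page_cache_hits / total < 0.90:
        warnings.append("pageCacheHitRatio below 0.90")
    return warnings


# Hypothetical numbers copied by hand from a PROFILE run.
print(flag_metrics(rows=1000, estimated_rows=50, page_cache_hits=990, page_cache_misses=10))
```

The check is symmetric on purpose: an estimate that is 10× too low is as misleading to the planner as one that is 10× too high.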

---

## Operator Reference

| Operator | Good/Bad | Meaning | Fix |
|---|---|---|---|
| `NodeIndexSeek` | ✓ | Exact match via a RANGE index (LOOKUP indexes back label scans, not seeks) | None needed |
| `NodeUniqueIndexSeek` | ✓ | Unique constraint index hit | — |
| `NodeIndexContainsScan` | ✓ | TEXT index serving `CONTAINS` | For `ENDS WITH` expect `NodeIndexEndsWithScan`; `STARTS WITH` is planned as a prefix seek |
| `NodeIndexScan` | ~ | Full index scan (no predicate) | Add WHERE predicate or composite index |
| `NodeByLabelScan` | ✗ | Scans all nodes of label | Add RANGE index on lookup property |
| `AllNodesScan` | ✗✗ | Scans entire node store | Add label + index to MATCH |
| `Expand(All)` | ~ | Traverse relationships from node | Normal; limit with LIMIT or WHERE |
| `Expand(Into)` | ~ | Find rels between two matched nodes | Normal for known-endpoint joins |
| `Filter` | ~ | Predicate applied after scan | Move predicate into WHERE with index |
| `CartesianProduct` | ✗ | No join predicate between two MATCH | Add WHERE join or use WITH between MATCHes |
| `NodeHashJoin` | ~ | Hash join on node IDs | Normal; planner chose hash join |
| `ValueHashJoin` | ~ | Hash join on values | Normal; watch memory for large inputs |
| `EagerAggregation` | ~ | Aggregation that consumes all input before emitting (count, collect, grouping) | Normal for aggregates |
| `OrderedAggregation` | ✓ | Aggregation that exploits already-ordered input; lower memory pressure than `EagerAggregation` | None needed |
| `Eager` | ✗ | Read/write conflict; materialises all rows | See Eager fix strategies below |
| `Sort` | ~ | Full sort, O(n log n) | Pair `ORDER BY` with `LIMIT` so the planner can use `Top`; reduce rows before sorting |
| `Top` | ✓ | Sort+Limit combined — O(n log k) | Preferred over Sort+Limit |
| `Limit` | ✓ | Truncates rows early | Push as early as possible |
| `Skip` | ~ | Offset pagination | Use keyset pagination on large graphs |
| `ProduceResults` | — | Final output operator | Root of tree |
| `UndirectedRelationshipByIdSeek` | ~ | Lookup by relationship ID | Avoid `id(r)`; use `elementId(r)` |

Full operator reference → [references/plan-operators.md](references/plan-operators.md)
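
The difference between the `Sort` and `Top` rows above can be illustrated outside the database. This is a minimal Python analogy, not Neo4j's implementation: `heapq.nsmallest` keeps only `k` candidates at a time, which is the same idea behind combining sort and limit. The helper names `top_k` and `sort_then_limit` are hypothetical.

```python
import heapq
import random


def top_k(rows, k, key):
    # Keeps at most k candidates in a heap: O(n log k) time, O(k) extra memory.
    return heapq.nsmallest(k, rows, key=key)


def sort_then_limit(rows, k, key):
    # Sorts every row before truncating: O(n log n) time, O(n) extra memory.
    return sorted(rows, key=key)[:k]


random.seed(0)
# Hypothetical rows; distinct ages so the ordering is unambiguous.
ages = random.sample(range(1, 100_000), 10_000)
rows = [{"name": f"p{i}", "age": age} for i, age in enumerate(ages)]


def by_age(row):
    return row["age"]


# Both strategies return the same five rows; only the work done differs.
assert top_k(rows, 5, by_age) == sort_then_limit(rows, 5, by_age)
```

The results are identical, so the choice is purely about cost: with a small `k` the heap-based version never holds more than `k` rows at once.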

---

## Diagnostic Workflow (Agent Runbook)

### Step 1 — Baseline Plan
```cypher
EXPLAIN <query>
```
Scan output for `AllNodesScan`, `NodeByLabelScan`, `CartesianProduct`, `Eager`.
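
Scanning a plan for these operators is easy to automate. The sketch below assumes the plan is available as a nested dict with `operatorType` and `children` keys (the shape is an assumption here; adapt it to whatever your client returns). The function name `find_red_flags` and the sample plan are hypothetical.

```python
RED_FLAGS = {"AllNodesScan", "NodeByLabelScan", "CartesianProduct", "Eager"}


def find_red_flags(plan):
    """Return red-flag operator names found in a nested plan dict, depth-first."""
    # Operator names can carry a suffix such as "@neo4j"; compare the base name only.
    # Exact matching keeps "EagerAggregation" from being mistaken for "Eager".
    name = plan["operatorType"].split("@")[0]
    found = [name] if name in RED_FLAGS else []
    for child in plan.get("children", []):
        found.extend(find_red_flags(child))
    return found


# Hypothetical plan for an unindexed lookup, written out by hand.
sample_plan = {
    "operatorType": "ProduceResults@neo4j",
    "children": [
        {
            "operatorType": "Filter@neo4j",
            "children": [{"operatorType": "NodeByLabelScan@neo4j", "children": []}],
        }
    ],
}

print(find_red_flags(sample_plan))  # ['NodeByLabelScan']
```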

### Step 2 — Check Indexes
```cypher
SHOW INDEXES YIELD name, type, labelsOrTypes, properties, state
WHERE state = 'ONLINE'
```
Find whether the label/property from the bad operator has an index.

### Step 3 — Create Missing Index
```cypher
// RANGE index for equality/range predicates:
CREATE INDEX person_email IF NOT EXISTS FOR (n:Person) ON (n.email)
// TEXT index for CONTAINS/ENDS WITH:
CREATE TEXT INDEX person_bio IF NOT EXISTS FOR (n:Person) ON (n.bio)
// Composite for multi-property lookup:
CREATE INDEX order_status_date IF NOT EXISTS FOR (n:Order) ON (n.status, n.createdAt)
```
Wait for `state = 'ONLINE'` before measuring.

### Step 4 — Profile After Fix
```cypher
PROFILE <query>
```
Compare `dbHits` and elapsed ms before/after. Target: `NodeIndexSeek` replaces scan operators.
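
To compare before and after, it helps to total `dbHits` over the whole operator tree rather than eyeballing individual operators. A minimal Python sketch, assuming each profiled operator is a dict with `dbHits` and `children` keys; the helper names and the numbers are hypothetical.

```python
def total_db_hits(profile):
    """Sum dbHits over an operator and all of its descendants."""
    return profile.get("dbHits", 0) + sum(
        total_db_hits(child) for child in profile.get("children", [])
    )


def improvement(before, after):
    """Fraction of dbHits removed by the fix (0.0 when nothing was measured)."""
    hits_before, hits_after = total_db_hits(before), total_db_hits(after)
    return (hits_before - hits_after) / hits_before if hits_before else 0.0


# Hypothetical profiles, before and after adding an index.
before = {
    "operatorType": "ProduceResults",
    "dbHits": 0,
    "children": [
        {
            "operatorType": "Filter",
            "dbHits": 40,
            "children": [{"operatorType": "NodeByLabelScan", "dbHits": 60, "children": []}],
        }
    ],
}
after = {
    "operatorType": "ProduceResults",
    "dbHits": 0,
    "children": [{"operatorType": "NodeIndexSeek", "dbHits": 4, "children": []}],
}

print(f"dbHits reduced by {improvement(before, after):.0%}")
```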

### Step 5 — Stale Statistics (if estimatedRows wildly off)
```cypher
CALL db.prepareForReplanning()
// or resample a specific index:
CALL db.resampleIndex("person_email")
// or resample all outdated:
CALL db.resampleOutdatedIndexes()
```
Config: `dbms.cypher.statistics_divergence_threshold` (default `0.75` — plan expires when stat changes >75%).
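
The threshold can be read as a relative-change test. The Python sketch below illustrates that reading only: the exact formula Neo4j applies is not given here, so `divergence` (change relative to the larger of the two values) is an assumption, and both function names are hypothetical.

```python
def divergence(old_count, new_count):
    """Relative change between two statistic values, in the range 0.0 to 1.0."""
    # Assumed formula: absolute change divided by the larger value.
    largest = max(old_count, new_count)
    return abs(new_count - old_count) / largest if largest else 0.0


def needs_replan(old_count, new_count, threshold=0.75):
    # Mirrors the ">75% change" rule of thumb stated above.
    return divergence(old_count, new_count) > threshold


# Hypothetical label counts recorded when the plan was cached vs. now.
print(needs_replan(1000, 200))  # True: the count changed by 80%
print(needs_replan(1000, 900))  # False: the count changed by 10%
```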

---

## Fixing Common Plan Problems

### Missing Index → NodeByLabelScan / AllNodesScan
```cypher
// Force index hint when planner ignores it:
MATCH (p:Person {email: $email})
USING INDEX p:Person(email)
RETURN p.name
// Force label scan (can be faster when the predicate matches most nodes of the label, i.e. low selectivity):
MATCH (p:Person {email: $email})
USING SCAN p:Person
RETURN p.name
```

### Wrong Anchor — Planner Picks Wrong Starting Node
Reorder MATCH or use hints:
```cypher
// Force join at specific node:
MATCH (a:Author)-[:WROTE]->(b:Book)-[:IN_CATEGORY]->(c:Category {name: $cat})
USING JOIN ON b
RETURN a.name, b.title
```

### CartesianProduct — Two Unconnected MATCHes
```cypher
// Bad (Cartesian product):
MATCH (a:Author {id: $aid})
MATCH (b:Book  {id: $bid})
RETURN a.name, b.title

// Good (explicit join or WITH):
MATCH (a:Author {id: $aid})-[:WROTE]->(b:Book {id: $bid})
RETURN a.name, b.title
// Or: if the patterns are genuinely unrelated, shrink each side first
// (for example aggregate or LIMIT in a WITH) so the product stays small
```
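
Why this matters is easiest to see from the row counts. The Python sketch below is an analogy, not Neo4j code: two unconnected patterns behave like `itertools.product`, while a connected pattern only keeps pairs that share a relationship. The names `cartesian`, `joined`, and the sample data are hypothetical.

```python
from itertools import product


def cartesian(left, right):
    # Unconnected patterns: every left row is paired with every right row.
    return list(product(left, right))


def joined(left, right, edges):
    # Connected pattern: only pairs linked by an edge survive.
    left_set, right_set = set(left), set(right)
    return [(a, b) for (a, b) in edges if a in left_set and b in right_set]


authors = [f"author{i}" for i in range(300)]
books = [f"book{i}" for i in range(400)]
wrote = [("author0", "book0"), ("author1", "book7")]  # hypothetical WROTE edges

print(len(cartesian(authors, books)))      # 120000 rows
print(len(joined(authors, books, wrote)))  # 2 rows
```

When each side returns a single row, as in the lookup by `id` above, the product is also a single row and costs little; the operator becomes a problem as either side grows.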

### Eager — Read/Wr

From the same repository

neo4j-agent-memory-skill (Skill)

Authoritative reference for the neo4j-agent-memory Python package — a graph-native memory system for AI agents built on Neo4j — and for the hosted service (NAMS) at memory.neo4jlabs.com. Use this skill whenever the user mentions neo4j-agent-memory, agent memory with Neo4j, context graphs, the POLE+O model, MemoryClient/MemorySettings, the memory MCP server, or any of the framework integrations (LangChain, PydanticAI, CrewAI, AWS Strands, Google ADK, Microsoft Agent Framework, OpenAI Agents, LlamaIndex). Also use when the user mentions the hosted service at memory.neo4jlabs.com, NAMS, the Neo4j Agent Memory Service, the `nams_` API key prefix, or the hosted MCP endpoint. Also use when writing documentation, blog posts, tutorials, PRDs, or code samples for the project, when comparing agent memory approaches, or when positioning graph-native memory against vector-only approaches — even if the user doesn't explicitly name the package.

neo4j-aura-agent-skill (Skill)

Manages Neo4j Aura Agents via the v2beta1 REST API — create, list, get, update, delete,

neo4j-aura-graph-analytics-skill (Skill)

Serverless Aura Graph Analytics (AGA) GDS Sessions — covers GdsSessions,

neo4j-aura-provisioning-skill (Skill)

Provisions and manages Neo4j Aura instances via CLI (aura-cli v1.7+) or REST API.

neo4j-cli-tools-skill (Skill)

Use when working with Neo4j command-line tools — neo4j-cli (modern unified

neo4j-cypher-skill (Skill)

Generates, optimizes, and validates Cypher 25 queries for Neo4j 2025.x and 2026.x.

neo4j-document-import-skill (Skill)

Ingests unstructured and semi-structured documents into Neo4j as a knowledge graph.

neo4j-driver-dotnet-skill (Skill)

Neo4j .NET Driver v6 — IDriver lifecycle, DI registration (singleton), ExecutableQuery