neo4j-graphrag-skill
Build GraphRAG retrieval pipelines on Neo4j using the neo4j-graphrag Python
git clone --depth 1 https://github.com/neo4j-contrib/neo4j-skills /tmp/neo4j-graphrag-skill && cp -r /tmp/neo4j-graphrag-skill/neo4j-graphrag-skill ~/.claude/skills/neo4j-graphrag-skillSKILL.md
# Neo4j GraphRAG Skill
## When to Use
- Building GraphRAG retrieval pipelines with `neo4j-graphrag` Python package
- Choosing between VectorRetriever, HybridRetriever, VectorCypherRetriever, HybridCypherRetriever
- Writing `retrieval_query` Cypher fragments for graph-augmented context
- Wiring retriever + LLM into a `GraphRAG` pipeline
- Using LLM-routed multi-retriever with `ToolsRetriever`
- Debugging low retrieval quality
- Integrating Neo4j with LangChain, LlamaIndex, or Haystack
## When NOT to Use
- **KG construction from documents** → `neo4j-document-import-skill`
- **Plain vector/semantic search without graph traversal** → `neo4j-vector-index-skill`
- **Hybrid search that combines vector with fulltext or other ranked sources** → `neo4j-vector-index-skill`
- **GDS algorithms (PageRank, Louvain, node embeddings)** → `neo4j-gds-skill`
- **Agent long-term memory** → `neo4j-agent-memory-skill`
- **Writing raw Cypher queries** → `neo4j-cypher-skill`
---
## Retriever Selection
```
Has fulltext index?
YES → Hybrid variants (HybridRetriever / HybridCypherRetriever)
NO → Vector variants (VectorRetriever / VectorCypherRetriever)
Need graph traversal after vector lookup?
YES → Cypher variants (VectorCypherRetriever / HybridCypherRetriever)
NO → plain variants
Natural-language-to-Cypher? → Text2CypherRetriever (no embedder needed)
LLM should route between retrievers? → ToolsRetriever
Vectors stored in external DB? → WeaviateNeo4jRetriever / PineconeNeo4jRetriever / QdrantNeo4jRetriever
```
| Retriever | Vector | Fulltext | Graph | Best For |
|---|:---:|:---:|:---:|---|
| `VectorRetriever` | ✓ | — | — | Baseline semantic search |
| `HybridRetriever` | ✓ | ✓ | — | Better recall, no graph expansion |
| `VectorCypherRetriever` | ✓ | — | ✓ | GraphRAG without fulltext |
| `HybridCypherRetriever` | ✓ | ✓ | ✓ | **Production GraphRAG — default** |
| `Text2CypherRetriever` | — | — | ✓ | NL→Cypher, no embedder |
| `ToolsRetriever` | varies | varies | varies | LLM-routed multi-retriever |
| `WeaviateNeo4jRetriever` | ✓ | — | ✓ | Vectors in Weaviate |
| `PineconeNeo4jRetriever` | ✓ | — | ✓ | Vectors in Pinecone |
| `QdrantNeo4jRetriever` | ✓ | — | ✓ | Vectors in Qdrant |
---
## Install
```bash
pip install neo4j-graphrag[openai] # OpenAI LLM + embeddings
pip install neo4j-graphrag[anthropic] # Anthropic Claude
pip install neo4j-graphrag[google] # Vertex AI / Gemini
pip install neo4j-graphrag[bedrock] # Amazon Bedrock (boto3)
pip install neo4j-graphrag[cohere] # Cohere
pip install neo4j-graphrag[mistralai] # MistralAI
pip install neo4j-graphrag[ollama] # Ollama (local)
pip install neo4j-graphrag[weaviate] # Weaviate external retriever
pip install neo4j-graphrag[pinecone] # Pinecone external retriever
pip install neo4j-graphrag[qdrant] # Qdrant external retriever
```
Requires: Python >= 3.10, `neo4j >= 5.17.0` (driver 6.x supported).
---
## Step 2 — Choose Retriever
```
Has fulltext index? YES → Hybrid variants (better recall)
NO → Vector variants (baseline)
Needs graph context after vector lookup? YES → Cypher variants
NO → plain variants
For natural-language-to-Cypher? → Text2CypherRetriever (no embedder needed)
For multi-tool LLM routing? → ToolsRetriever
Using external vector DB? → WeaviateNeo4jRetriever / PineconeNeo4jRetriever / QdrantNeo4jRetriever
```
| Retriever | Vector | Fulltext | Graph | When to use |
|---|:---:|:---:|:---:|---|
| `VectorRetriever` | ✓ | — | — | Baseline; quick start |
| `HybridRetriever` | ✓ | ✓ | — | Better recall; no graph context |
| `VectorCypherRetriever` | ✓ | — | ✓ | GraphRAG without fulltext |
| `HybridCypherRetriever` | ✓ | ✓ | ✓ | **Production GraphRAG — default choice** |
| `Text2CypherRetriever` | — | — | ✓ | LLM generates Cypher; no embedder |
| `ToolsRetriever` | varies | varies | varies | Multi-retriever LLM routing |
For custom Cypher hybrid search outside the `neo4j-graphrag` retriever APIs, use `neo4j-vector-index-skill`.
**Vector backend selection [v1.16+, auto]**: on Neo4j 2026.01+ all four vector/hybrid retrievers auto-route through the Cypher 25 `SEARCH ... WHERE` clause when filters are SEARCH-compatible (simple AND comparisons) and all filter props are declared in the index `WITH [n.prop]` list. `$or`, `$in`, `$like`, or undeclared props → automatic fallback to `db.index.vector.queryNodes()` procedure path (with warning log). Declare filterable properties via `filterable_properties=[...]` on `create_vector_index()`.
---
## Step 3 — Create Indexes (run once)
```cypher
// Vector index (all retrievers need this)
CREATE VECTOR INDEX chunk_embedding IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS { indexConfig: {
`vector.dimensions`: 1536,
`vector.similarity_function`: 'cosine'
} };
// Fulltext index (Hybrid retrievers only)
CREATE FULLTEXT INDEX chunk_fulltext IF NOT EXISTS
FOR (c:Chunk) ON EACH [c.text];
// Confirm ONLINE before ingesting:
SHOW INDEXES YIELD name, state
WHERE name IN ['chunk_embedding', 'chunk_fulltext']
RETURN name, state;
// Both must show state = 'ONLINE'
```
If index not ONLINE: wait, poll every 5s. Do NOT start ingestion until ONLINE.
---
## Step 4 — Core Pattern (HybridCypherRetriever)
```python
from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import HybridCypherRetriever
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))
embedder = OpenAIEmbeddings(model="text-embedding-3-large") # OPENAI_API_KEY from env
# retrieval_query: Cypher fragment executed after the vector/fulltext lookup.
# Auto-injected variables: node (matched node) score (similarity float)
# MUST include a RETURN clause. score must appear in RETURN.
retrieval_query = """
MATCH (node)<-[:HASAuthoritative reference for the neo4j-agent-memory Python package — a graph-native memory system for AI agents built on Neo4j — and for the hosted service (NAMS) at memory.neo4jlabs.com. Use this skill whenever the user mentions neo4j-agent-memory, agent memory with Neo4j, context graphs, the POLE+O model, MemoryClient/MemorySettings, the memory MCP server, or any of the framework integrations (LangChain, PydanticAI, CrewAI, AWS Strands, Google ADK, Microsoft Agent Framework, OpenAI Agents, LlamaIndex). Also use when the user mentions the hosted service at memory.neo4jlabs.com, NAMS, the Neo4j Agent Memory Service, the `nams_` API key prefix, or the hosted MCP endpoint. Also use when writing documentation, blog posts, tutorials, PRDs, or code samples for the project, when comparing agent memory approaches, or when positioning graph-native memory against vector-only approaches — even if the user doesn't explicitly name the package.
Manages Neo4j Aura Agents via the v2beta1 REST API — create, list, get, update, delete,
Serverless Aura Graph Analytics (AGA) GDS Sessions — covers GdsSessions,
Provisions and manages Neo4j Aura instances via CLI (aura-cli v1.7+) or REST API.
Use when working with Neo4j command-line tools — neo4j-cli (modern unified
Generates, optimizes, and validates Cypher 25 queries for Neo4j 2025.x and 2026.x.
Ingests unstructured and semi-structured documents into Neo4j as a knowledge graph.
Neo4j .NET Driver v6 — IDriver lifecycle, DI registration (singleton), ExecutableQuery