neo4j-snowflake-graph-analytics-skill

This Neo4j Snowflake Native App enables users to run graph analytics algorithms such as PageRank, Louvain, Weakly Connected Components, Dijkstra, and KNN directly within Snowflake using SQL commands. Use this skill when your data already resides in Snowflake tables and you need to project it into a graph structure, execute algorithms, and write results back to Snowflake tables for on-demand analytics or pipeline workloads while keeping data isolated from your live database.

Repository: neo4j-skills

Install in Claude Code

git clone --depth 1 https://github.com/neo4j-contrib/neo4j-skills /tmp/neo4j-snowflake-graph-analytics-skill && cp -r /tmp/neo4j-snowflake-graph-analytics-skill/neo4j-snowflake-graph-analytics-skill ~/.claude/skills/neo4j-snowflake-graph-analytics-skill

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

Snowflake Native App — graph algorithm power inside Snowflake. Data stays in Snowflake; project into a graph, run algorithms via SQL `CALL`, results written back to Snowflake tables.

**Docs:** https://neo4j.com/docs/snowflake-graph-analytics/current/

---

## When to Use
- Running graph algorithms / GDS in Snowflake
- Data already lives in Snowflake tables
- On-demand / pipeline workloads — ephemeral sessions, pay per session-minute
- Full isolation from the live database during analytics

## When NOT to Use
- **Aura Pro with embedded GDS plugin** → `neo4j-gds-skill`
- **Aura Graph Analytics** → `neo4j-aura-graph-analytics-skill`
- **Self-managed Neo4j with embedded GDS plugin** → `neo4j-gds-skill`
- **Writing Cypher queries** → `neo4j-cypher-skill`

---

## The End-to-End Flow

This is the flow that works. Don't jump straight to a `CALL` — most failures come from skipping the data-preparation step.

1. **Explore** the source data — inspect table DDLs to learn columns and types.
2. **Prepare projection views** — create node/relationship views that expose the required key columns and cast every property to a supported type (see the strict rules below). This is the step that matters most.
3. **Project → Compute → Write** — run the algorithm with a single `CALL`, assembling the `project`, `compute`, and `write` config.
4. **Inspect & look up names** — join numeric results back to the source table to get human-readable labels.

---

## Step 1 — Explore the Source Data

Look at the table definitions before designing the graph:

```sql
SELECT GET_DDL('TABLE', 'MY_DATABASE.MY_SCHEMA.MY_TABLE');
-- or inspect columns/types:
SELECT COLUMN_NAME, DATA_TYPE
FROM MY_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'MY_SCHEMA' AND TABLE_NAME = 'MY_TABLE';
```

Decide which tables are **nodes** and which represent **relationships** (edges) between them.
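
When several tables are involved, it can help to survey the whole schema in one pass before deciding. The sketch below reuses the `MY_DATABASE` / `MY_SCHEMA` placeholders from above; the key listings are only a hint, because Snowflake does not enforce primary or foreign key constraints on standard tables and many schemas never declare them.

```sql
-- Survey every column in the schema at once (placeholder names).
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, NUMERIC_PRECISION, NUMERIC_SCALE
FROM MY_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'MY_SCHEMA'
ORDER BY TABLE_NAME, ORDINAL_POSITION;

-- Declared keys hint at structure: a primary key suggests a node table,
-- foreign keys suggest a relationship table. Empty output only means
-- that no constraints were declared, not that no relationships exist.
SHOW PRIMARY KEYS IN SCHEMA MY_DATABASE.MY_SCHEMA;
SHOW IMPORTED KEYS IN SCHEMA MY_DATABASE.MY_SCHEMA;
```

A table with two columns that both reference another table's key is a typical relationship candidate; a table with one identifying column plus attributes is a typical node candidate.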

---

## Step 2 — Prepare Projection Views (the important part)

The graph engine is strict about column names and types. **Snowflake views inherit the source column type by default**, so you MUST add explicit `CAST`s — never `SELECT col` without one for a property column.

Create views that reshape your tables into the node/relationship format:

```sql
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.MY_NODES_VW AS
SELECT ... FROM MY_DATABASE.MY_SCHEMA.MY_TABLE;
```
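
One way to confirm that the casts took effect is to describe the view after creating it. This is a minimal check, assuming the `MY_NODES_VW` placeholder view above exists.

```sql
-- Lists each view column with its resolved type.
DESCRIBE VIEW MY_DATABASE.MY_SCHEMA.MY_NODES_VW;
```

In the `type` column, Snowflake reports `BIGINT` as `NUMBER(38,0)` and `DOUBLE` as `FLOAT`, because those names are synonyms. Any other type on a property column (for example `VARCHAR`, `BOOLEAN`, or `NUMBER` with a non-zero scale) suggests a cast is missing.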

### Node views

- **Key column:** expose the primary key as `NODEID`. It must be `BIGINT` or `STRING`. Always alias **and** cast explicitly:
  `SOURCE_COL::BIGINT AS NODEID` or `SOURCE_COL::STRING AS NODEID`.
- **Allowed node property types (exactly):** `BIGINT`, `DOUBLE`, `ARRAY`, `VECTOR(FLOAT, n)`. Anything else must be cast to one of these or dropped.
- **Composite keys:** concatenate the parts into a single `STRING` key, using `'++'` as the separator.
- **Naming:** `<table>_NODES_VW`.
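
As an illustration of the composite-key rule, the sketch below builds a `STRING` `NODEID` from two key parts. The table `ORDER_LINES` and the columns `order_id`, `line_no`, and `quantity` are hypothetical names chosen for the example.

```sql
-- Hypothetical table with a two-part key (order_id, line_no).
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.ORDER_LINES_NODES_VW AS
SELECT (CAST(order_id AS STRING) || '++' || CAST(line_no AS STRING))::STRING AS NODEID,
       CAST(quantity AS BIGINT) AS quantity
FROM MY_DATABASE.MY_SCHEMA.ORDER_LINES;
```

If any key part is `NULL`, the concatenation yields `NULL`, so such rows are worth filtering out or handling upstream. Any relationship view that points at these nodes has to build its `SOURCENODEID` / `TARGETNODEID` with the same expression so the strings match exactly.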

### Source-type → view-type casting rules

Apply these when projecting columns from your tables (keep the original column name unless renaming):

| Source type | Action |
|---|---|
| Whole-number numerics (`INT`, `INTEGER`, `BIGINT`, `SMALLINT`, `TINYINT`, `BYTEINT`, `NUMBER(p,0)`) | `CAST(col AS BIGINT) AS col` |
| Fractional numerics (`FLOAT`, `DOUBLE`, `REAL`, `DECIMAL(p,s>0)`, `NUMBER(p,s>0)`) | `CAST(col AS DOUBLE) AS col` |
| `ARRAY` of numbers | keep as `ARRAY` (except GraphSAGE — see below). Not allowed on relationship views. |
| `VECTOR(FLOAT, n)` | keep as-is. Not allowed on relationship views. |
| `BOOLEAN` | **drop by default**. Opt-in only: `IFF(col, 1, 0)::BIGINT AS col` |
| `DATE`, `TIME`, `TIMESTAMP*` | **drop by default**. Opt-in only: `DATE_PART('EPOCH_SECOND', col)::BIGINT AS col` (tell the user the unit) |
| `VARCHAR`, `CHAR`, `TEXT`, `STRING` | **drop** — can't be a graph property. To read results by name, join output back to the source table on the key (see Step 4) |
| `VARIANT`, `OBJECT`, `GEOGRAPHY`, `GEOMETRY`, `BINARY` | **drop** — not supported as graph properties |

**Lowest-common-denominator policy:** by default include only safe columns (numeric → BIGINT/DOUBLE, ARRAY, VECTOR). Booleans and time-like columns require explicit opt-in. When you drop columns, briefly tell the user which and why, so they can ask for them back.
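
The casting table can be applied mechanically by classifying a table's columns from metadata. The query below is a sketch under one assumption: that `INFORMATION_SCHEMA.COLUMNS` reports types with the spellings used in the `CASE` branches (for instance `NUMBER` for all fixed-point types and `FLOAT` for floating-point ones). Check the `DATA_TYPE` output on your account and adjust the branches if it differs.

```sql
-- Suggest an action per column, following the casting rules above.
SELECT COLUMN_NAME,
       DATA_TYPE,
       NUMERIC_SCALE,
       CASE
           WHEN DATA_TYPE = 'NUMBER' AND NUMERIC_SCALE = 0 THEN 'CAST AS BIGINT'
           WHEN DATA_TYPE = 'NUMBER'                        THEN 'CAST AS DOUBLE'
           WHEN DATA_TYPE IN ('FLOAT', 'DOUBLE', 'REAL')    THEN 'CAST AS DOUBLE'
           WHEN DATA_TYPE = 'ARRAY'
                OR DATA_TYPE LIKE 'VECTOR%'                 THEN 'KEEP (node views only)'
           WHEN DATA_TYPE = 'BOOLEAN'                       THEN 'DROP (opt-in: IFF -> BIGINT)'
           WHEN DATA_TYPE IN ('DATE', 'TIME')
                OR DATA_TYPE LIKE 'TIMESTAMP%'              THEN 'DROP (opt-in: epoch seconds -> BIGINT)'
           ELSE 'DROP'
       END AS SUGGESTED_ACTION
FROM MY_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'MY_SCHEMA' AND TABLE_NAME = 'MY_TABLE'
ORDER BY ORDINAL_POSITION;
```

The metadata cannot tell whether an `ARRAY` column holds only numbers, so those columns still need a manual look. The rows marked `DROP` are the ones to report back to the user.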

### Relationship views

- **Key columns:** expose `SOURCENODEID` and `TARGETNODEID`, cast with the same rules as `NODEID`
  (`SOURCE_COL::BIGINT AS SOURCENODEID`, etc.). Every value must match an existing `NODEID` in a node view.
- **Allowed relationship property types (narrower):** `BIGINT`, `DOUBLE`, `INT` only. **No `ARRAY`, no `VECTOR`.** (The docs describe relationship properties as `FLOAT`; the engine accepts these whole/fractional numeric casts and treats them as weights — keep them numeric.)
- **Naming:** `<table>_RELATIONSHIPS_VW`.

Example node + relationship views:

```sql
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.USER_NODES_VW AS
SELECT user_id::BIGINT AS NODEID,
       CAST(age AS BIGINT)        AS age,
       CAST(balance AS DOUBLE)    AS balance
FROM MY_DATABASE.MY_SCHEMA.USERS;

CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.TRANSFERS_RELATIONSHIPS_VW AS
SELECT from_user::BIGINT AS SOURCENODEID,
       to_user::BIGINT   AS TARGETNODEID,
       CAST(amount AS DOUBLE) AS amount
FROM MY_DATABASE.MY_SCHEMA.TRANSFERS;
```

> The required logical column names are `nodeId` / `sourceNodeId` / `targetNodeId` — Snowflake folds unquoted identifiers to uppercase, so `NODEID` etc. match. Casting explicitly is what matters.
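
Since every `SOURCENODEID` and `TARGETNODEID` must match an existing `NODEID`, a quick anti-join before running an algorithm can surface dangling endpoints. This sketch uses the two example views defined above.

```sql
-- Relationship rows whose source or target has no matching node.
-- NULL endpoints are returned too, because they never satisfy the join.
SELECT r.SOURCENODEID, r.TARGETNODEID
FROM MY_DATABASE.MY_SCHEMA.TRANSFERS_RELATIONSHIPS_VW AS r
LEFT JOIN MY_DATABASE.MY_SCHEMA.USER_NODES_VW AS s
       ON s.NODEID = r.SOURCENODEID
LEFT JOIN MY_DATABASE.MY_SCHEMA.USER_NODES_VW AS t
       ON t.NODEID = r.TARGETNODEID
WHERE s.NODEID IS NULL
   OR t.NODEID IS NULL;
```

An empty result means every endpoint resolves. If rows come back, either filter them out in the relationship view or add the missing nodes to the node view.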

---

## Step 3 — Project → Compute → Write

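Every run is a single `CALL` whose first argument is the compute pool and whose second is a JSON-like config with three parts: `project`, `compute`, and `write`. The config is written as a Snowflake object literal, so keys and string values take **single quotes**, not JSON's double quotes.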

> **App name:** `Neo4j_Graph_Analytics` is only the *default* installation name. If the app was installed under a different name, replace it everywhere — in the procedure call (`<APP>.graph.<algo>`), the `USE DATABASE <APP>` statement, and the privilege grants below. Check with `SHOW APPLICATIONS;`.

```sql
USE ROLE MY_CONSUMER_ROLE;

CALL Neo4j_Graph_Analytics.grap
