phoenix-graphql

Phoenix GraphQL is a skill for querying Arize's Phoenix observability platform through its GraphQL API in two contexts: analyzing Phoenix data internally to answer questions, or helping users write GraphQL queries for their own applications. Use it to retrieve or filter datasets, projects, spans, traces, experiments, prompts, sessions, or annotations from Phoenix, with support for pagination, comparison operations, and both direct entity lookups and connection-based searches.

Repository: phoenix

Install in Claude Code

git clone --depth 1 https://github.com/Arize-ai/phoenix /tmp/phoenix-graphql && cp -r /tmp/phoenix-graphql/src/phoenix/server/agents/prompts/skills/phoenix-graphql ~/.claude/skills/phoenix-graphql

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

### Two modes

- **Internal data analysis** — you are querying Phoenix yourself to answer a question. Apply the schema facts, efficiency rules, and patterns below directly.
- **Helping the user integrate** — the user wants GraphQL queries for their own code or tools. Use the same schema facts and patterns, plus the "External API usage" section for endpoint, auth, and client examples. Queries you hand to the user should use variables and include pagination handling.

### Entrypoints

Top-level `Query` entrypoints get you to a starting entity; per-entity schema details live in the resources listed under "Schema map" below.

- `node(id: ID!)` → global lookup for **any** entity by its Relay global ID; resolve with an inline fragment, e.g. `node(id: $id) { ... on Dataset { name } }`. This is the primary way to fetch datasets, prompts, experiments, and annotations, which have **no** by-name/by-id helpers. Sessions can be fetched this way too, or by session ID via `getProjectSessionById`.
- `projects(...)`, `datasets(...)`, `prompts(...)`, `evaluators(...)` → Relay connections, each with `filter`/`sort` inputs to find an entity when you only have a name.
- By-X helpers (the only ones that exist): `getProjectByName(name: String!)`, `getProjectSessionById(sessionId: String!)`, `getDatasetExampleByExternalId(datasetId: GlobalID!, externalId: String!)`, `getSpanByOtelId(spanId: String!)`, `getTraceByOtelId(traceId: String!)`. There is **no** `getDatasetByName`, `getPromptByName`, or `getExperimentById` — use `node(id:)` or a connection `filter` instead.
- `viewer` → the authenticated `User`.
- `projectCount`, `datasetCount`, `promptCount` → cheap counts.
- `compareExperiments(baseExperimentId: GlobalID!, compareExperimentIds: [GlobalID!]!, first, after, filterCondition)` → experiment comparison.
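As a rough illustration of the two lookup paths above, the sketch below builds request bodies for an OTel-ID lookup via `getSpanByOtelId` and a global-ID lookup via `node(id:)`. The query text uses only names mentioned in this skill; the `request_body` helper and the sample IDs are hypothetical and exist only for this example.

```python
import json

# OTel lookup: takes the hex span ID, returns the span with its global ID
# and the owning trace's OTel ID (Span has no traceId field of its own).
SPAN_BY_OTEL_ID = """
query SpanByOtelId($spanId: String!) {
  getSpanByOtelId(spanId: $spanId) {
    id
    spanId
    trace { traceId }
  }
}
"""

# Global lookup: takes a Relay global ID and resolves with an inline fragment.
NODE_BY_GLOBAL_ID = """
query GetEntity($id: ID!) {
  node(id: $id) {
    ... on Span { spanId }
  }
}
"""


def request_body(query: str, variables: dict) -> str:
    """Serialize a GraphQL request as a JSON string (hypothetical helper)."""
    return json.dumps({"query": query, "variables": variables})


# Hypothetical IDs for illustration: a 16-hex-char OTel span ID and a
# base64 global ID. The two kinds are never interchangeable.
otel_body = request_body(SPAN_BY_OTEL_ID, {"spanId": "0123456789abcdef"})
node_body = request_body(NODE_BY_GLOBAL_ID, {"id": "U3Bhbjox"})
```

The `id` returned by the first query is the value to pass as `$id` in the second; the hex `spanId` is not.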

### Schema map

Per-entity field references and examples are split into resources. Read **only** the one(s) you need with `read_skill_resource`, after loading this skill:

- `project-spans-traces` — Project aggregates and `spans`; Span and Trace fields. The starting point for most trace analysis.
- `sessions` — ProjectSession: multi-turn session metrics, token/cost, session traces.
- `datasets` — Dataset and DatasetExample: examples, versions, splits, labels.
- `experiments` — Experiment and ExperimentRun: runs, aggregate metrics, comparison.
- `prompts` — Prompt and PromptVersion: versions, templates, tags.
- `annotations` — Span/Trace/ExperimentRun annotation fields and how to read them.

### Conventions

These apply to every entity:

- **Pagination** is Relay-style: `first`/`after` args; responses have `edges { node { ... } }` and `pageInfo { hasNextPage endCursor }`. Cursors are opaque strings. Some connections (e.g. `Project.spans`, `Experiment.runs`, `ProjectSession.traces`) are forward-only.
- **IDs**: the `id` field on any node is a Relay global ID (base64 of `TypeName:rowId`); use it with `node(id:)`. OpenTelemetry hex IDs come from `Span.spanId` and `Trace.traceId`; use those for OTel lookups and `/redirects/spans/<spanId>` / `/redirects/traces/<traceId>` links. Note that a `Span` has **no** `traceId` field; read it via the nested `trace { traceId }`. Never mix global IDs with OTel IDs.
- **`TimeRange`** input: `{ start: DateTime, end: DateTime }` — ISO 8601 strings; `end` is exclusive; both optional.
- **`SpanSort`** input: `{ col: SpanColumn, dir: SortDir }`, e.g. `{ col: startTime, dir: desc }`. Useful `SpanColumn` values: `startTime`, `latencyMs`, `tokenCountTotal`, `cumulativeTokenCountTotal`, `tokenCostTotal`.
- **`filterCondition`** is a Python-like DSL string over span fields, e.g. `span_kind == 'LLM'`, `status_code == 'ERROR'`, `latency_ms > 1000`, `'timeout' in output.value`, `evals['Hallucination'].label == 'hallucinated'`, `annotations['note'].score < 0.5`. Combine with `and`/`or`.
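The pagination and ID conventions above can be sketched in a few lines of client code. This is a minimal illustration, not part of Phoenix: `decode_global_id` and `iter_nodes` are hypothetical helpers, and `fetch_page` stands in for whatever function actually runs the query and returns the connection object (`edges` plus `pageInfo`).

```python
import base64
from typing import Callable, Iterator, Optional, Tuple


def decode_global_id(global_id: str) -> Tuple[str, str]:
    """Split a Relay global ID (base64 of `TypeName:rowId`) into its parts."""
    padded = global_id + "=" * (-len(global_id) % 4)  # tolerate stripped padding
    decoded = base64.b64decode(padded).decode("utf-8")
    type_name, _, row_id = decoded.partition(":")
    return type_name, row_id


def iter_nodes(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 10,
) -> Iterator[dict]:
    """Yield nodes from a forward-only connection, following endCursor.

    `fetch_page(after)` must return a dict shaped like
    {"edges": [{"node": {...}}], "pageInfo": {"hasNextPage": bool, "endCursor": str}}.
    `max_pages` caps the number of round trips.
    """
    after: Optional[str] = None
    for _ in range(max_pages):
        connection = fetch_page(after)
        for edge in connection["edges"]:
            yield edge["node"]
        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            return
        # Cursors are opaque: pass endCursor back unchanged as `after`.
        after = page_info["endCursor"]
```

Decoding a global ID is handy for checking which kind of ID you are holding before calling `node(id:)`; cursors, by contrast, should never be decoded or constructed by hand.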

### Efficiency rules

- **Do not run full schema introspection.** Read the relevant `Schema map` resource instead; it covers the fields and arguments for that entity. Only when a resource does not cover a field you need, introspect a single type: `{ __type(name: "Project") { fields { name args { name type { name kind } } } } }`.
- **Batch independent lookups with aliases** in one query instead of multiple round trips, e.g. `p50: latencyMsQuantile(probability: 0.5) p99: latencyMsQuantile(probability: 0.99)`.
- Select only the fields you need; keep page sizes small (10–50) and paginate only when necessary.
- Pass values via query variables, never string interpolation.
- Span `input`/`output` payloads can be huge — request `input { truncatedValue }` (first 100 chars) when surveying; fetch `input { value }` (full payload) only for spans you intend to read closely.
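To illustrate the variables rule, the sketch below keeps the query text fixed and passes user-supplied values separately, so quotes inside a project name or a `filterCondition` cannot alter the query. The `$condition: String` variable type is an assumption (the skill only shows `filterCondition` as a string literal), and `count_request` is a hypothetical helper.

```python
import json

# The query text never changes; only the variables do.
# Assumption: filterCondition accepts a String-typed variable.
COUNT_QUERY = """
query Count($name: String!, $condition: String) {
  getProjectByName(name: $name) {
    matching: recordCount(filterCondition: $condition)
  }
}
"""


def count_request(project_name: str, condition: str) -> str:
    """Build the JSON request body with values passed as variables."""
    return json.dumps(
        {
            "query": COUNT_QUERY,
            "variables": {"name": project_name, "condition": condition},
        }
    )


# Values with embedded quotes are escaped by the JSON encoder,
# not spliced into the query string.
body = count_request('my "demo" project', "'timeout' in output.value")
```

With string interpolation, the double quotes in the project name would terminate the GraphQL string literal early; with variables, the JSON encoder handles the escaping.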

### Patterns

Two canonical shapes to orient you; entity-specific examples live in each resource.

Reach an entity and read fields via `node(id:)` + an inline fragment:

```graphql
query GetEntity($id: ID!) {
  node(id: $id) {
    ... on Dataset { name exampleCount }
  }
}
```

Batch independent project aggregates with aliases in one round trip:

```graphql
query Overview($name: String!, $timeRange: TimeRange) {
  getProjectByName(name: $name) {
    traceCount(timeRange: $timeRange)
    p50: latencyMsQuantile(probability: 0.5, timeRange: $timeRange)
    p99: latencyMsQuantile(probability: 0.99, timeRange: $timeRange)
    errorCount: recordCount(timeRange: $timeRange, filterCondition: "status_code == 'ERROR'")
  }
}
```

### Execution surfaces (internal mode)

- `phoenix-gql` (bash): run `phoenix-gql --help` for flags and current permissions. Use `--data-only` when piping to `jq`, `--output <file>` for large results, `--vars '<json>'` for variables. Mutations are allowed only when runtime permissions say so; the tool reports its permissions on every invocation.

### External API usage (user-facing mode)

Facts users need to call the API themselves:

- **Endpoint**: `POST <phoenix-host>/graphql` with a JSON body `{ "query": "...", "variables": { ... } }`. A GraphiQL IDE is served on GET at the same path.
- **Auth**: send a Phoe
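A minimal client for the endpoint described above might look like the following, using only the Python standard library. The helper names and the `http://localhost:6006` host are assumptions for illustration; authentication is not shown, so any credentials go in the optional `headers` argument.

```python
import json
import urllib.request
from typing import Optional


def build_graphql_request(
    host: str,
    query: str,
    variables: Optional[dict] = None,
    headers: Optional[dict] = None,
) -> urllib.request.Request:
    """Build a POST to <host>/graphql with a JSON body of query + variables."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **(headers or {})}
    return urllib.request.Request(
        host.rstrip("/") + "/graphql",
        data=body,
        headers=all_headers,
        method="POST",
    )


def execute(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return `data`, raising if the response has errors."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("errors"):
        raise RuntimeError(payload["errors"])
    return payload["data"]


# Hypothetical host and project name; nothing is sent until execute() is called.
request = build_graphql_request(
    "http://localhost:6006",
    "query Q($name: String!) { getProjectByName(name: $name) { traceCount } }",
    {"name": "default"},
)
```

Calling `execute(request)` against a running Phoenix instance would return the `data` object; building the request separately makes it easy to inspect or log before sending.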
