ClaudeWave
Skill · 10.1k repo stars · updated today

phoenix-playwright-tests

The phoenix-playwright-tests skill provides templates and guidance for writing end-to-end browser automation tests for the Phoenix AI observability platform using Playwright. Use it when creating, updating, or debugging Playwright tests, writing tests for UI features, or automating browser interactions for Phoenix's web application. It includes test credentials for different user roles, a centralized timeout configuration approach, and selector patterns prioritized by robustness, from role-based selectors to test IDs.

Install in Claude Code
git clone --depth 1 https://github.com/Arize-ai/phoenix /tmp/phoenix-playwright-tests && cp -r /tmp/phoenix-playwright-tests/.agents/skills/phoenix-playwright-tests ~/.claude/skills/phoenix-playwright-tests
Then open a new Claude Code session; the skill loads automatically.

SKILL.md

# Phoenix Playwright Test Writing

Write end-to-end tests for Phoenix using Playwright. Tests live in `app/tests/` and follow established patterns.

## Timeout Policy

- Do not pass `timeout` arguments to individual calls in test code under `app/tests`.
- Tune timing centrally in `app/playwright.config.ts` (global `timeout`, `expect.timeout`, `use.navigationTimeout`, and `webServer.timeout`).
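
A minimal sketch of what that central configuration could look like. The four option names come from the policy above; every numeric value, the `webServer.command`, and the `webServer.url` are placeholder assumptions, not Phoenix's actual settings.

```typescript
// app/playwright.config.ts (illustrative sketch; all values are assumptions)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Maximum time for a single test, in milliseconds.
  timeout: 60_000,
  expect: {
    // Default timeout for auto-retrying assertions such as toBeVisible().
    timeout: 10_000,
  },
  use: {
    // Upper bound for navigation actions such as page.goto() and page.waitForURL().
    navigationTimeout: 30_000,
  },
  webServer: {
    // Hypothetical command and URL; substitute the project's real ones.
    command: "npm run dev",
    url: "http://localhost:3000",
    // How long to wait for the server to become available.
    timeout: 120_000,
  },
});
```

With the limits defined in one place, a slow CI environment can be accommodated by editing this file rather than touching individual tests.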

## Quick Start

```typescript
import { expect, test } from "@playwright/test";
import { randomUUID } from "crypto";

test.describe("Feature Name", () => {
  test.beforeEach(async ({ page }) => {
    await page.goto("/login");
    await page.getByLabel("Email").fill("admin@localhost");
    await page.getByLabel("Password").fill("admin123");
    await page.getByRole("button", { name: "Log In", exact: true }).click();
    await page.waitForURL("**/projects");
  });

  test("can do something", async ({ page }) => {
    // Test implementation
  });
});
```

## Test Credentials

| User   | Email                | Password  | Role   |
| ------ | -------------------- | --------- | ------ |
| Admin  | admin@localhost      | admin123  | admin  |
| Member | member@localhost.com | member123 | member |
| Viewer | viewer@localhost.com | viewer123 | viewer |
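
One way to avoid repeating these literals across tests is a small typed lookup. This is only a sketch: `TEST_USERS`, `TestUser`, and `credentialsFor` are hypothetical names introduced for illustration, not helpers that exist in the Phoenix test suite.

```typescript
// Roles that appear in the credentials table above.
type Role = "admin" | "member" | "viewer";

interface TestUser {
  email: string;
  password: string;
  role: Role;
}

// Hypothetical lookup; the values are copied from the table above.
const TEST_USERS: Record<Role, TestUser> = {
  admin: { email: "admin@localhost", password: "admin123", role: "admin" },
  member: { email: "member@localhost.com", password: "member123", role: "member" },
  viewer: { email: "viewer@localhost.com", password: "viewer123", role: "viewer" },
};

// Return the test account for a given role.
function credentialsFor(role: Role): TestUser {
  return TEST_USERS[role];
}
```

In a `beforeEach` hook, `const { email, password } = credentialsFor("viewer")` could then feed the `getByLabel("Email")` and `getByLabel("Password")` fills shown in Quick Start.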

## Selector Patterns (Priority Order)

1. **Role selectors** (most robust):

   ```typescript
   page.getByRole("button", { name: "Save" });
   page.getByRole("link", { name: "Datasets" });
   page.getByRole("tab", { name: /Evaluators/i });
   page.getByRole("menuitem", { name: "Edit" });
   page.getByRole("cell", { name: "my-item" });
   page.getByRole("heading", { name: "Title" });
   page.getByRole("dialog");
   page.getByRole("textbox", { name: "Name" });
   page.getByRole("combobox", { name: /mapping/i });
   ```

2. **Label selectors**:

   ```typescript
   page.getByLabel("Email");
   page.getByLabel("Dataset Name");
   page.getByLabel("Description");
   ```

3. **Text and placeholder selectors**:

   ```typescript
   page.getByText("No evaluators added");
   page.getByPlaceholder("Search...");
   ```

4. **Test IDs** (when available):

   ```typescript
   page.getByTestId("create-dataset-button");
   // element with state — select the stable id, filter on the data attribute
   page.locator('[data-testid="llm-evaluator-form-submit-button"][data-mode="create"]');
   ```

   `data-testid`s are scoped, fully spelled out (`...-button`, never `...-btn`), and
   **constant regardless of state** — state is exposed via a sibling `data-*`
   attribute (`data-mode`, `data-state`, …), so never key a `getByTestId` off a value
   that only exists in one mode. If you need a `data-testid` that doesn't exist yet,
   add it following `rules/test-ids.md` in the `phoenix-frontend` skill (pattern:
   `<scope>-<subject>-<role>`).

5. **CSS locators** (last resort):
   ```typescript
   page.locator('button:has-text("Save")');
   ```
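
The test ID convention from item 4 (a stable `<scope>-<subject>-<role>` id, with state in a separate `data-*` attribute) can be sketched with two small string helpers. Both `buildTestId` and `testIdSelector` are hypothetical names, and splitting the example id into `llm-evaluator-form` / `submit` / `button` is an assumption about how the pattern applies.

```typescript
// Hypothetical helpers illustrating the <scope>-<subject>-<role> test ID convention.

// Join the three parts into one test ID.
function buildTestId(scope: string, subject: string, role: string): string {
  return [scope, subject, role].join("-");
}

// Build a CSS selector that matches the stable test ID and, optionally,
// filters on state exposed through data-* attributes (data-mode, data-state, ...).
function testIdSelector(
  testId: string,
  state: Record<string, string> = {},
): string {
  const stateFilters = Object.entries(state)
    .map(([name, value]) => `[data-${name}="${value}"]`)
    .join("");
  return `[data-testid="${testId}"]${stateFilters}`;
}

// The id stays constant; only the state filter changes between modes.
const submitId = buildTestId("llm-evaluator-form", "submit", "button");
const createModeSelector = testIdSelector(submitId, { mode: "create" });
```

The resulting string can be passed to `page.locator(createModeSelector)`, which keeps the id itself independent of the form's current mode.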

## Common UI Patterns

### Dropdown Menus

```typescript
// Click button to open dropdown
await page.getByRole("button", { name: "New Dataset" }).click();
// Select menu item
await page.getByRole("menuitem", { name: "New Dataset" }).click();
```

### Nested Menus (Submenus)

```typescript
// Open menu, hover over submenu trigger, click submenu item
await page.getByRole("button", { name: "Add evaluator" }).click();
await page
  .getByRole("menuitem", { name: "Use LLM evaluator template" })
  .hover();
await page.getByRole("menuitem", { name: /correctness/i }).click();

// IMPORTANT: Always use getByRole("menuitem") for submenu items, not getByText()
// Playwright's auto-waiting handles the submenu appearance timing
// ❌ BAD - flaky in CI:
// await page.getByText("ExactMatch").first().click();
// ✅ GOOD - reliable:
// await page.getByRole("menuitem", { name: /ExactMatch/i }).click();
```

### Dialogs/Modals

```typescript
// Wait for dialog
await expect(page.getByRole("dialog")).toBeVisible();
// Fill form in dialog
await page.getByLabel("Name").fill("test-name");
// Submit
await page.getByRole("button", { name: "Create" }).click();
// Wait for close
await expect(page.getByRole("dialog")).not.toBeVisible();
```

### Tables with Row Actions

```typescript
// Find row by cell content
const row = page.getByRole("row").filter({
  has: page.getByRole("cell", { name: "item-name" }),
});
// Click action button in row (usually last button)
await row.getByRole("button").last().click();
// Select action from menu
await page.getByRole("menuitem", { name: "Edit" }).click();
```

### Tabs

```typescript
await page.getByRole("tab", { name: /Evaluators/i }).click();
await page.waitForURL("**/evaluators");
await expect(page.getByRole("tab", { name: /Evaluators/i })).toHaveAttribute(
  "aria-selected",
  "true",
);
```

### Form Inputs in Sections

```typescript
// When multiple textboxes exist, scope to section
const systemSection = page.locator('button:has-text("System")');
const systemTextbox = systemSection
  .locator("..")
  .locator("..")
  .getByRole("textbox");
await systemTextbox.fill("content");
```

## Serial Tests (Shared State)

Use `test.describe.serial` when tests depend on each other:

```typescript
test.describe.serial("Workflow", () => {
  const itemName = `item-${randomUUID()}`;

  test("step 1: create item", async ({ page }) => {
    // Creates itemName
  });

  test("step 2: edit item", async ({ page }) => {
    // Uses itemName from previous test
  });

  test("step 3: verify edits", async ({ page }) => {
    // Verifies itemName was edited
  });
});
```

## Assertions

```typescript
// Visibility
await expect(element).toBeVisible();
await expect(element).not.toBeVisible();

// Text content
await expect(element).toHaveText("expected");
await expect(element).toContainText("partial");

// Attributes
await expect(element).toHaveAttribute("aria-selected", "true");

// Input values
await expect(input).toHaveValue("expected value");

// URL
await page.waitForURL("**/datasets/**/examples");
```

## Navigation Patterns

```typescript
// Direct navigation
await page.goto("/datasets");
```