Skip to main content
ClaudeWave
Skill0 repo starsupdated yesterday

octoperf-async-polling

The octoperf-async-polling skill defines reliable polling patterns for OctoPerf operations that return immediately with task or result handles (benchResultId, taskId) and require status checks until completion. It specifies poll intervals based on expected operation duration, terminal conditions indicating success or failure, and anti-patterns to avoid tight-looping or context waste when monitoring long-running tasks like scenario runs, virtual user validation, PDF exports, and correlation workflows.

Install in Claude Code
Copy
git clone --depth 1 https://github.com/OctoPerf/octoperf-claude-plugins /tmp/octoperf-async-polling && cp -r /tmp/octoperf-async-polling/plugins/octoperf/skills/octoperf-async-polling ~/.claude/skills/octoperf-async-polling
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# OctoPerf — Polling async operations reliably

Several OctoPerf MCP tools kick off work that runs for seconds to
hours and return immediately with a handle (`benchResultId`,
`taskId`, …). The actual result lives behind a second "get-status"
tool that you poll until terminal. This skill is the **single source
of truth** for how to do that polling without burning the context
window or missing an early failure.

## When this applies

You started one of:

| Starter tool                                       | Returns                               | Status tool to poll                                       |
|----------------------------------------------------|---------------------------------------|-----------------------------------------------------------|
| `validate_virtual_user`                            | `benchResultId`                       | `get_virtual_user_validation` (or `get_bench_result`)     |
| `run_scenario`                                     | `benchReportId` + `benchResultIds[]`  | `get_bench_result` (state) and/or `get_bench_status` (%)  |
| `export_bench_report_pdf`                          | `taskId`                              | `get_task_result`                                         |
| Async correlation task (auto-correlate workflow)   | `taskId`                              | `get_task_result`                                         |

If your call returned the final answer (no `taskId` / no
`benchResultId`), you're not in async territory — skip this skill.

## The pacing rule

**Pace the polls.** Each MCP roundtrip takes 100–500 ms of
context. Calling the status tool in tight succession over a 10-min
test would consume 100+ tool turns with identical responses. Put a
delay of `N` between polls — the *how* depends on your harness (see
"How to insert the delay" below).

Pick `N` from the expected duration of the work:

| Expected duration              | Sleep between polls (`N`) |
|--------------------------------|---------------------------|
| Validation run (~30s)          | 5s                        |
| 1-min smoke scenario           | 5–10s                     |
| 5-min scenario                 | 30s                       |
| 10-min scenario                | 60s                       |
| 30-min + soak / load test      | 60s (cap)                 |
| PDF export task (10–60s)       | 3s                        |
| Async correlation (5–60s)      | 3s                        |

Rule of thumb: `N ≈ expected_duration / 10`, clamped to `[3s, 60s]`.
The cap matters — there is no benefit to sleeping longer than 60s
even on a multi-hour run, because a hard cap keeps the LLM
responsive if the run aborts early (ERROR, manual stop).

### How to insert the delay (harness-dependent)

The cadence table is *what* — the *how* depends on which agent harness
is driving this server:

- **Harnesses that allow a bounded blocking sleep** (most CLI agents):
  insert a bounded `Bash sleep N` (or equivalent) between two MCP
  calls. Do **not** chain multiple short sleeps to fake a longer one.
- **Claude Code blocks *all* blocking sleeps used as a wait** — even a
  bounded `sleep 60` between two MCP calls is intercepted (it suggests
  `Monitor` or `run_in_background` instead). Use its native
  recurring-prompt mechanism: `/loop <interval> <poll-prompt>` (or
  `CronCreate` directly) re-enters the poll on a wall-clock interval,
  so the cadence table maps straight to the `/loop` interval (`60s`
  rounds to cron's 1-min minimum). Each fire re-invokes the MCP status
  tool; stop the loop with `CronDelete` once the state is terminal.

**Event-watchers don't help here.** The run state lives *behind an MCP
status tool*, not in any file, log line, or local process. So
shell-level watchers (Claude Code's `Monitor`, `tail -f`,
`inotifywait`, a `run_in_background` until-loop) cannot observe it
without calling the platform REST API directly — which needs the OAuth
token the MCP server holds. The wake mechanism must re-call the MCP
status tool, which is exactly what a recurring prompt does.

## The polling loop

Pseudocode:

```
start    = start_async_tool(...)        # returns id (taskId / benchResultId)
handle   = start.benchResultId | start.taskId
duration = estimated_run_seconds        # from the scenario / task

while True:
    status = get_status_tool(handle)
    if is_terminal(status):
        break
    wait N                              # bounded sleep, or harness scheduler — see below
```

### Terminal conditions (per status tool)

- `get_virtual_user_validation` → `finished == true`. The `state` field
  is one of `CREATED / PENDING / SCALING / PREPARING / INITIALIZING /
  RUNNING / FINISHED / ABORTED / ERROR`; treat the last three as
  terminal.
- `get_bench_result` → `state ∈ {FINISHED, ABORTED, ERROR}`. Same
  state machine as validation (validation runs reuse the bench
  pipeline). This is the **canonical** terminal check for a load run.
- `get_bench_status` → returns a numeric progress value. **Do not use
  this for terminal detection.** It returns elapsed-as-percent (0 →
  ~100) for in-flight runs and is useful for "show progress to the
  user" updates, but it does not flip to a sentinel value when the
  run reaches ABORTED / ERROR. Always cross-check with
  `get_bench_result.state`.
- `get_task_result` → `status ∈ {SUCCESS, FAILED}`. `PENDING` is the
  in-flight value. The `message` field carries the backend error on
  `FAILED` — surface it verbatim and stop.

### After the loop

The status tool already tells you the outcome (FINISHED vs ABORTED vs
ERROR, SUCCESS vs FAILED). Branch from there:

- **Terminal-success path** (`FINISHED` / `SUCCESS`) → proceed to the
  workflow-specific next step (read report, fetch URL, …).
- **Terminal-failure path** (`ABORTED` / `ERROR` / `FAILED`) → do
  not retry blindly. Pull the explanation from the matching skill:
  - `octoperf-validation-triage` (`benchResultId` failed validation)
  - `octoperf-scenario-diagnosis` (`benchResultId` failed unde
octoperf-auto-correlationSkill

Use when an OctoPerf Virtual User imported from a HAR/Postman/JMX recording fails its validation run because dynamic values (session tokens, CSRF, signed URLs, anti-forgery inputs, auth challenges) captured at recording time are stale on replay. Triggers on requests for "auto-correlation", "correlate the VU", "fix replay errors", "401/403 on replay after import", "tokens don't match", "signature mismatch in load test". Walks the LLM through framework preset selection, async polling, and regex-rule fallback. Requires the OctoPerf MCP server to be connected.

octoperf-bench-reportsSkill

Use when reading or interpreting an OctoPerf bench report — picking the right `get_report_*_values` tool for a given widget, understanding the difference between flat and trend reports, decoding semantic gotchas (Hits vs Hits CONTAINER, 304 cache hits skewing throughput, Playwright per-step row types, etc.). Triggers on "what's the right tool for this widget", "explain this metric", "how do I read this trend report", "what does parallelRunsSupported mean", "why is the Network row 24ms while page.goto is 364ms", "DELTA computeType". Complements `octoperf-scenario-diagnosis` — that skill walks the diagnosis workflow, this one is the widget-by-widget reading guide. Requires the OctoPerf MCP server.

octoperf-export-bench-report-pdfSkill

Use when the user asks to "export the report as PDF", "print the bench report", "get a PDF of report X", "share a PDF with stakeholders", or any variation that calls for a static artefact of an OctoPerf benchReport. Walks the LLM through the three-step async chain (submit print task → poll → download presigned URL). Requires the OctoPerf MCP server to be connected.

octoperf-real-browser-probeSkill

Use when the user wants to run a real-browser probe alongside a JMeter HTTP load test to capture user-perceived metrics (page load time, render time, JS execution, Core Web Vitals) while JMeter generates the bulk HTTP load. Triggers on "real browser monitoring during load test", "EUM probe", "playwright probe", "synthetic monitor during bench", "convert my JMeter VU to Playwright", "RealBrowser user", "TruClient equivalent", "hybrid load test (HTTP + browser)". Walks the LLM through JMeter→Playwright VU conversion (direct translation or codegen capture) and hybrid scenario composition (N×JMeter for load + 1×Playwright probe for UX measurement). Requires the OctoPerf MCP server.

octoperf-scenario-diagnosisSkill

Use when an OctoPerf load-test scenario has completed (or is running) and the user wants to understand why it failed, underperformed, or behaved unexpectedly. Triggers on "the load test failed", "why are response times so high", "high error rate in the scenario", "diagnose this bench", "the run looks bad". Walks the LLM through reading global metrics, narrowing scope, comparing against validation, and surfacing the right next step (re-validate, tune scenario, fix infra). Requires the OctoPerf MCP server and a `benchResultId` to investigate.

octoperf-schedulingSkill

Use when scheduling an OctoPerf scenario to run at a specific time (one-shot) or on a recurring cadence (cron), or when listing / pausing / resuming / deleting an existing schedule. Triggers on "schedule the scenario for tomorrow morning", "run this every weekday at 8am", "every night at midnight", "pause the cron job", "delete the schedule", "show scheduled jobs". Covers the unusual cron format (Unix 5-field UTC, NOT Quartz), the timezone conversion gymnastics, the pre-flight rule (a misconfigured scenario will fire failing runs forever until disabled), and the full job lifecycle. Requires the OctoPerf MCP server.

octoperf-validation-triageSkill

Use when an OctoPerf Virtual User validation run has produced many failing actions and the user needs to diagnose them efficiently without reading every single failure serially. Triggers on "the validation is red", "lots of errors after import", "VU validation failed, what's wrong", "triage these failures", "why is my virtual user failing". Groups failures by category, drills into one representative per group, and proposes the matching MCP-tool fix. Requires the OctoPerf MCP server.