
python-code-reviewer

**python-code-reviewer**: This Claude Code skill audits Python OpenInference instrumentor packages against established project patterns and conventions. It reviews wrapper implementations, test coverage, and configuration files, cross-referencing the instrumented library's actual source code to verify that monkey-patching targets the correct methods and handles signatures, parameters, and edge cases properly.

Install in Claude Code
git clone --depth 1 https://github.com/Arize-ai/openinference /tmp/python-code-reviewer && cp -r /tmp/python-code-reviewer/.agents/skills/python-code-reviewer ~/.claude/skills/python-code-reviewer
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Python Code Reviewer for OpenInference Instrumentors

Review a Python OpenInference instrumentation package against the project's established
patterns and conventions. This is a checklist-driven review — go through each section,
report findings with file paths and line numbers, and surface issues organized by severity.

## Workflow

**Step 1: Identify the package to review**
- Ask the user which instrumentor to review if not already clear from context
- The package lives under `python/instrumentation/openinference-instrumentation-<name>/`
- Read the key files: `__init__.py`, `_wrappers.py` (or equivalent), `pyproject.toml`,
  and the full `tests/` directory

**Step 2: Pull the instrumented library source and use it as ground truth**

OpenInference instrumentors work by monkey-patching functions in the library they
instrument. All correctness judgments — whether wrappers target the right methods, handle
the right signatures, process the right data structures, and cover the right edge cases —
must be verified against the actual library source code. Do NOT make assumptions about
how the instrumented library works.

> **Note:** The tox env name `<pkg>` and the library's Python import path `<library>`
> often differ. For example, `google_genai` is the tox env name but the library installs
> as `google/genai/` in site-packages. Check `test-requirements.txt` or `pyproject.toml`
> to find the actual library package name.

1. **Set up the tox environment** to install the pinned library version. Look up the
   tox envlist in `python/tox.ini` to find the correct env name (use the highest Python
   version available, e.g., `py314`, `py313`):
   ```bash
   cd python && uvx --with tox-uv tox run -e <pyVER>-ci-<pkg> -- --co -q
   ```
   (`-- --co -q` tells pytest to collect without running, which triggers the install.)
   If the `.tox` env already exists, skip this step.
   If tox setup fails (missing Python version, dependency conflicts), fall back to
   `pip install <library>` in a temporary venv to unblock the review.

2. **Locate the installed library source** at:
   ```
   python/.tox/<pyVER>-ci-<pkg>/lib/python<X.Y>/site-packages/<library>/
   ```

3. **Reference the library source throughout the review.** Before flagging any finding,
   verify it against the actual code:
   - Are the monkey-patched methods/classes correct? Check they exist and have the
     expected signatures.
   - Are parameter types handled correctly? Read the real type annotations and defaults.
   - Are edge cases real? Check whether a supposed edge case can actually occur given
     the library's actual types, validation, and control flow.
   - Are attribute extractions correct? Verify field names, nesting, and optional vs.
     required fields against the library's actual data classes.

4. **Calibrate severity based on what the library actually does:**
   - A bug affecting types/paths the library actually uses → **High** or **Critical**
   - An edge case for a type that can't actually appear at runtime → **Low**
   - A missing handler for a commonly used type in the library's Union → higher severity
   - A missing handler for a rare/internal type → lower severity
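The checks in step 3 (does the target exist, where does it live, what does it accept) can be sketched with the standard library alone. This is a minimal illustration, not part of the skill: `describe_patch_target` is a hypothetical helper, and `json.JSONEncoder.encode` stands in for a real instrumented method.

```python
import importlib
import inspect


def describe_patch_target(module_name: str, qualname: str) -> dict:
    """Resolve a patch target and report its source file and parameters."""
    obj = importlib.import_module(module_name)
    for part in qualname.split("."):
        # getattr raises AttributeError when the wrapper targets a name
        # that does not exist in the installed library version.
        obj = getattr(obj, part)
    signature = inspect.signature(obj)
    return {
        "source_file": inspect.getsourcefile(obj),
        "parameters": list(signature.parameters),
        "is_coroutine": inspect.iscoroutinefunction(obj),
    }


# Stand-in target from the standard library; a real review would pass the
# instrumented library's module and method instead.
info = describe_patch_target("json", "JSONEncoder.encode")
print(info["parameters"])  # ['self', 'o']
```

Run inside the tox environment's interpreter, the reported `source_file` should point into that environment's `site-packages`, which confirms the signature comes from the pinned version rather than some other install.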

**Step 3: Run all review sections below**

**Step 4: Present findings** organized by severity:
- **Critical**: Will cause incorrect behavior or CI failure
- **High**: Missing required convention or test coverage gap
- **Medium**: Deviates from established patterns but functional
- **Low**: Style or minor improvement suggestions
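One way to keep findings ordered by severity while retaining file paths and line numbers is a small record type plus a grouping function. The sketch below is illustrative only: the `Finding` and `Severity` names, and the sample paths, line numbers, and messages, are made up for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Declaration order doubles as report order.
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass(frozen=True)
class Finding:
    severity: Severity
    path: str
    line: int
    message: str


def format_report(findings: list[Finding]) -> str:
    """Group findings by severity, then sort each group by file and line."""
    lines = []
    for severity in Severity:
        group = sorted(
            (f for f in findings if f.severity is severity),
            key=lambda f: (f.path, f.line),
        )
        if not group:
            continue  # skip empty severity levels
        lines.append(f"{severity.name.title()}:")
        lines.extend(f"  {f.path}:{f.line} {f.message}" for f in group)
    return "\n".join(lines)


# Hypothetical findings, for illustration only.
report = format_report([
    Finding(Severity.LOW, "tests/conftest.py", 12, "fixture could be function-scoped"),
    Finding(Severity.CRITICAL, "tox.ini", 40, "broken install pattern"),
])
print(report)
```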

---

## Section 1: Test Setup and CI Config

The tox.ini install pattern matters because a broken pattern silently installs the wrong
version of the library, making the "pinned version" test target useless.

### 1.1 tox.ini install pattern

Read `python/tox.ini` and find the `commands_pre` entries for this package.

**Correct pattern** (google_adk style — 4 steps, substitute `<pkg>` with the actual
package name):
```
<pkg>: uv pip uninstall -r test-requirements.txt
<pkg>: uv pip install --reinstall-package openinference-instrumentation-<pkg> .
<pkg>: python -c 'import openinference.instrumentation.<pkg>'
<pkg>: uv pip install -r test-requirements.txt
```

**Broken pattern** (causes under-resolution — the pinned version test may silently test
the wrong version):
```
<pkg>: uv pip install --reinstall {toxinidir}/instrumentation/openinference-instrumentation-<pkg>[test]
```

Flag the broken pattern as **Critical** — it defeats the purpose of version-pinned testing.
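The two patterns above are distinct enough to check mechanically. The following is a rough heuristic sketch, assuming the `commands_pre` lines for one package have already been extracted as strings; `classify_install_pattern` and the `mypkg` name are hypothetical.

```python
def classify_install_pattern(commands_pre: list[str]) -> str:
    """Classify one package's commands_pre lines as correct, broken, or unknown."""
    # Drop the "<pkg>:" factor prefix and keep only the command text.
    commands = [line.split(":", 1)[1].strip() for line in commands_pre if ":" in line]

    # Broken: the package directory is installed together with its [test] extra.
    if any("uv pip install" in c and c.endswith("[test]") for c in commands):
        return "broken"

    def first_index(prefix: str):
        return next((i for i, c in enumerate(commands) if c.startswith(prefix)), None)

    # Correct: requirements are uninstalled first and reinstalled afterwards.
    uninstall_at = first_index("uv pip uninstall -r test-requirements.txt")
    install_at = first_index("uv pip install -r test-requirements.txt")
    if uninstall_at is not None and install_at is not None and uninstall_at < install_at:
        return "correct"
    return "unknown"


correct = [
    "mypkg: uv pip uninstall -r test-requirements.txt",
    "mypkg: uv pip install --reinstall-package openinference-instrumentation-mypkg .",
    "mypkg: python -c 'import openinference.instrumentation.mypkg'",
    "mypkg: uv pip install -r test-requirements.txt",
]
broken = [
    "mypkg: uv pip install --reinstall {toxinidir}/instrumentation/openinference-instrumentation-mypkg[test]",
]
print(classify_install_pattern(correct), classify_install_pattern(broken))  # correct broken
```

An `unknown` result is a prompt to read the entries by hand, not a verdict.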

### 1.2 test-requirements.txt

Check that `test-requirements.txt` exists in the package root and contains:
- A **pinned version** of the library being instrumented (e.g., `openai==2.8.0`)
- `opentelemetry-sdk`
- `pytest-recording` (if using VCR cassettes)
- Any other test utilities needed (pytest-asyncio, respx, responses, etc.)

If `test-requirements.txt` is missing entirely, flag as **Critical** (the correct tox
pattern depends on it).
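The first two checks can be approximated with a short script. This is a simplified sketch that handles plain `name==version` lines, comments, and environment markers, but not every requirements-file feature; `check_test_requirements` is a hypothetical helper name.

```python
import re


def _normalize(name: str) -> str:
    # Treat "-", "_" and "." as equivalent, as package indexes do.
    return re.sub(r"[-_.]+", "-", name).lower()


def check_test_requirements(text: str, library: str) -> list[str]:
    """Return a list of problems found in a test-requirements.txt body."""
    specs = {}
    for raw in text.splitlines():
        # Strip comments and environment markers before parsing.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        match = re.match(r"([A-Za-z0-9][A-Za-z0-9._-]*)(.*)", line)
        if match:
            specs[_normalize(match.group(1))] = match.group(2).strip()

    problems = []
    spec = specs.get(_normalize(library))
    if spec is None:
        problems.append(f"{library} is not listed")
    elif not re.fullmatch(r"(\[[^\]]*\])?\s*==\s*[A-Za-z0-9.!+]+", spec):
        problems.append(f"{library} is not pinned to an exact version")
    if "opentelemetry-sdk" not in specs:
        problems.append("opentelemetry-sdk is missing")
    return problems


print(check_test_requirements("openai==2.8.0\nopentelemetry-sdk\npytest-recording\n", "openai"))  # []
print(check_test_requirements("openai>=2\n", "openai"))
```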

### 1.3 Latest test target

Verify that the tox envlist has both pinned and `-latest` variants:
```
py3{10,14}-ci-{pkg,pkg-latest}
```

And that the `-latest` variant upgrades the library:
```
pkg-latest: uv pip install -U <library-name>
```
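To confirm that both variants are present, it helps to expand the brace groups into concrete env names. The sketch below handles only simple, non-nested `{a,b}` groups like the one shown above and is not a replacement for tox's own parser; `expand_envlist` is a hypothetical helper.

```python
import itertools
import re


def expand_envlist(pattern: str) -> list[str]:
    """Expand simple {a,b} groups into the full list of env names."""
    # Split into literal text and brace groups, keeping the groups.
    parts = re.split(r"(\{[^{}]*\})", pattern)
    choices = [
        part[1:-1].split(",") if part.startswith("{") else [part]
        for part in parts
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


envs = expand_envlist("py3{10,14}-ci-{pkg,pkg-latest}")
print(envs)
# ['py310-ci-pkg', 'py310-ci-pkg-latest', 'py314-ci-pkg', 'py314-ci-pkg-latest']

# Both a pinned and a -latest env should exist for the package.
has_pinned = any(e.endswith("-ci-pkg") for e in envs)
has_latest = any(e.endswith("-ci-pkg-latest") for e in envs)
```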

---

## Section 2: Testing Patterns

### 2.1 conftest.py fixtures

Read `tests/conftest.py` and verify these fixtures exist:

**Required fixtures:**
- `in_memory_span_exporter` — returns `InMemorySpanExporter()`
- `tracer_provider` — creates `TracerProvider` with `SimpleSpanProcessor` wired to the exporter
- `instrument` (autouse) — calls `Instrumentor().instrument(tracer_provider=...)`,
  clears exporter, yields, then calls `.uninstrument()` and clears again

**Scope considerations:**
- Session-scoped instrumentor fixtures are fine when the instrumentor is stateless
- Function-scoped exporter+provider is safer for test isolation but session-scoped works
  if tests clear the exporter properly

**VCR config fixture** (if using cassettes):
```python
@pytest.fixture(scope="session")
def vcr_config() -> dict:
    ...  # fixture body is cut off in the source
```