Skill38.3k repo starsupdated 5d ago

temporal-python-testing

This Claude Code skill provides comprehensive testing strategies for Temporal Python workflows, covering unit testing with time-skipping, integration testing with mocked activities, replay testing for determinism validation, and local development setup. Use this skill when implementing test suites for Temporal workflows, debugging test failures, setting up CI/CD testing pipelines, or establishing testing infrastructure that achieves adequate code coverage without requiring slow end-to-end tests against a live server.

View source Repository: agents

Install in Claude Code

Copy

git clone --depth 1 https://github.com/wshobson/agents /tmp/temporal-python-testing && cp -r /tmp/temporal-python-testing/plugins/backend-development/skills/temporal-python-testing ~/.claude/skills/temporal-python-testing

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Temporal Python Testing Strategies

Comprehensive testing approaches for Temporal workflows using pytest, progressive disclosure resources for specific testing scenarios.

## When to Use This Skill

- **Unit testing workflows** - Fast tests with time-skipping
- **Integration testing** - Workflows with mocked activities
- **Replay testing** - Validate determinism against production histories
- **Local development** - Set up Temporal server and pytest
- **CI/CD integration** - Automated testing pipelines
- **Coverage strategies** - Achieve ≥80% test coverage

## Testing Philosophy

**Recommended Approach** (Source: docs.temporal.io/develop/python/testing-suite):

- Write majority as integration tests
- Use pytest with async fixtures
- Time-skipping enables fast feedback (month-long workflows → seconds)
- Mock activities to isolate workflow logic
- Validate determinism with replay testing

**Three Test Types**:

1. **Unit**: Workflows with time-skipping, activities with ActivityEnvironment
2. **Integration**: Workers with mocked activities
3. **End-to-end**: Full Temporal server with real activities (use sparingly)

## Available Resources

This skill provides detailed guidance through progressive disclosure. Load specific resources based on your testing needs:

### Unit Testing Resources

**File**: `resources/unit-testing.md`
**When to load**: Testing individual workflows or activities in isolation
**Contains**:

- WorkflowEnvironment with time-skipping
- ActivityEnvironment for activity testing
- Fast execution of long-running workflows
- Manual time advancement patterns
- pytest fixtures and patterns

### Integration Testing Resources

**File**: `resources/integration-testing.md`
**When to load**: Testing workflows with mocked external dependencies
**Contains**:

- Activity mocking strategies
- Error injection patterns
- Multi-activity workflow testing
- Signal and query testing
- Coverage strategies

### Replay Testing Resources

**File**: `resources/replay-testing.md`
**When to load**: Validating determinism or deploying workflow changes
**Contains**:

- Determinism validation
- Production history replay
- CI/CD integration patterns
- Version compatibility testing

### Local Development Resources

**File**: `resources/local-setup.md`
**When to load**: Setting up development environment
**Contains**:

- Docker Compose configuration
- pytest setup and configuration
- Coverage tool integration
- Development workflow

## Quick Start Guide

### Basic Workflow Test

```python
import pytest
from temporalio.testing import WorkflowEnvironment
from temporalio.worker import Worker

@pytest.fixture
async def workflow_env():
    env = await WorkflowEnvironment.start_time_skipping()
    yield env
    await env.shutdown()

@pytest.mark.asyncio
async def test_workflow(workflow_env):
    async with Worker(
        workflow_env.client,
        task_queue="test-queue",
        workflows=[YourWorkflow],
        activities=[your_activity],
    ):
        result = await workflow_env.client.execute_workflow(
            YourWorkflow.run,
            args,
            id="test-wf-id",
            task_queue="test-queue",
        )
        assert result == expected
```

### Basic Activity Test

```python
from temporalio.testing import ActivityEnvironment

async def test_activity():
    env = ActivityEnvironment()
    result = await env.run(your_activity, "test-input")
    assert result == expected_output
```

## Coverage Targets

**Recommended Coverage** (Source: docs.temporal.io best practices):

- **Workflows**: ≥80% logic coverage
- **Activities**: ≥80% logic coverage
- **Integration**: Critical paths with mocked activities
- **Replay**: All workflow versions before deployment

## Key Testing Principles

1. **Time-Skipping** - Month-long workflows test in seconds
2. **Mock Activities** - Isolate workflow logic from external dependencies
3. **Replay Testing** - Validate determinism before deployment
4. **High Coverage** - ≥80% target for production workflows
5. **Fast Feedback** - Unit tests run in milliseconds

## How to Use Resources

**Load specific resource when needed**:

- "Show me unit testing patterns" → Load `resources/unit-testing.md`
- "How do I mock activities?" → Load `resources/integration-testing.md`
- "Setup local Temporal server" → Load `resources/local-setup.md`
- "Validate determinism" → Load `resources/replay-testing.md`

## Additional References

- Python SDK Testing: docs.temporal.io/develop/python/testing-suite
- Testing Patterns: github.com/temporalio/temporal/blob/main/docs/development/testing.md
- Python Samples: github.com/temporalio/samples-python

More from this repository

screen-reader-testingSkill

Test web applications with screen readers including VoiceOver, NVDA, and JAWS. Use when validating screen reader compatibility, debugging accessibility issues, or ensuring assistive technology support.

wcag-audit-patternsSkill

Conduct WCAG 2.2 accessibility audits with automated testing, manual verification, and remediation guidance. Use when auditing websites for accessibility, fixing WCAG violations, or implementing accessible design patterns.

multi-reviewer-patternsSkill

Coordinate parallel code reviews across multiple quality dimensions with finding deduplication, severity calibration, and consolidated reporting. Use this skill when organizing multi-reviewer code reviews, calibrating finding severity, or consolidating review results.

parallel-debuggingSkill

Debug complex issues using competing hypotheses with parallel investigation, evidence collection, and root cause arbitration. Use this skill when debugging bugs with multiple potential causes, performing root cause analysis, or organizing parallel investigation workflows.

parallel-feature-developmentSkill

Coordinate parallel feature development with file ownership strategies, conflict avoidance rules, and integration patterns for multi-agent implementation. Use this skill when decomposing a large feature into independent work streams, when two or more agents need to implement different layers of the same system simultaneously, when establishing file ownership to prevent merge conflicts in a shared codebase, when designing interface contracts so parallel implementers can build against each other's APIs before they are ready, or when deciding whether to use vertical slices versus horizontal layers for a full-stack feature.

task-coordination-strategiesSkill

Decompose complex tasks, design dependency graphs, and coordinate multi-agent work with proper task descriptions and workload balancing. Use this skill when breaking down work for agent teams, managing task dependencies, or monitoring team progress.

team-communication-protocolsSkill

Structured messaging protocols for agent team communication including message type selection, plan approval, shutdown procedures, and anti-patterns to avoid. Use this skill when establishing communication norms for a newly spawned team, when deciding whether to send a direct message or a broadcast, when a team-lead needs to review and approve an implementer's plan before work begins, when orchestrating a graceful team shutdown after all tasks are complete, or when debugging why teammates are not coordinating correctly at integration points.

team-composition-patternsSkill

Design optimal agent team compositions with sizing heuristics, preset configurations, and agent type selection. Use this skill when deciding how many agents to spawn for a task, when choosing between a review team versus a feature team versus a debug team, when selecting the correct subagent_type for each role to ensure agents have the tools they need, when configuring display modes (tmux, iTerm2, in-process) for a CI or local environment, or when building a custom team composition for a non-standard workflow such as a migration or security audit.