matlab-setup-worker-state
This skill teaches MATLAB developers how to set up non-serializable resources, expensive computations, and environment configurations on worker processes in parallel pools. Use it when workers need database connections, loaded libraries, file handles, or one-time setup operations that cannot be serialized or should not repeat per iteration, or when encountering serialization errors when passing objects to parfor loops.
git clone --depth 1 https://github.com/matlab/matlab-agentic-toolkit /tmp/matlab-setup-worker-state && cp -r /tmp/matlab-setup-worker-state/skills-catalog/parallel-computing/matlab-setup-worker-state ~/.claude/skills/matlab-setup-worker-state
# Set Up Worker State for Parallel Pools
By default, process workers in a parallel pool inherit MATLAB path state from
their controlling client, but they do not inherit loaded libraries, open
connections, or expensive pre-computed objects. Code that relies on any of
these needs explicit setup. This skill teaches the correct APIs for each
scenario. Most patterns target process-based pools; thread pool applicability
is noted where relevant.
## When to Use
- Code needs a **non-serializable resource** on workers (database connections,
COM objects, loaded shared libraries, file handles)
- Code needs **expensive one-time setup** per worker (large object construction,
data loading) that should not repeat every parfor iteration
- Code needs **paths or environment variables** set on workers
- User has existing code using **spmd for side-effect setup** before parfor
(anti-pattern — help them modernise)
- User sees errors about objects not being serializable when passed to parfor
## When NOT to Use
- Data already in client memory used in a single parfor loop (MATLAB broadcasts it automatically — Constant adds complexity for no benefit in this case)
- Choosing between process and thread pools (out of scope — this skill assumes a pool type is already chosen)
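
As a rough illustration of the first bullet, the sketch below contrasts plain broadcast in a single loop with `parallel.pool.Constant(data)` reused across two loops. The variable names (`lookup`, `a`, `b`, `d`) and the toy arithmetic are invented for this example, and a parallel pool is assumed to be available.

```matlab
% Toy data already in client memory (names are illustrative only).
lookup = rand(1, 1000);
n = 100;

% Single parfor: reference the variable directly and let MATLAB broadcast it.
a = zeros(1, n);
parfor i = 1:n
    a(i) = sum(lookup) + i;
end

% Several parfor loops over the same data: wrap it once in a Constant so
% it is transferred to the workers once instead of once per loop.
c = parallel.pool.Constant(lookup);
b = zeros(1, n);
parfor i = 1:n
    b(i) = sum(c.Value) + i;
end
d = zeros(1, n);
parfor i = 1:n
    d(i) = max(c.Value) * i;
end
```

For the first loop alone, the Constant would only add indirection; it pays off once the second and third loops reuse the same data.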
## Decision Framework
| Scenario | Correct API | Why |
|----------|-------------|-----|
| Non-serializable resource needing cleanup (connections, libraries) | `parallel.pool.Constant(@buildFcn, @cleanupFcn)` | Constructs on each worker; automatic cleanup on delete |
| Data from a file needed on workers | `parallel.pool.Constant(@() load(file).var)` | **Default for file-based data.** Each worker loads from disk; client never holds the dataset. Always prefer this when the source is a file. |
| Data already in client memory, used in multiple parfor loops | `parallel.pool.Constant(data)` | Transfers once for pool lifetime; without Constant, broadcast re-sends for every parfor loop. Not needed for a single parfor — let MATLAB broadcast. |
| One-shot side effect, no return value needed | `parfevalOnAll(pool, @fcn, 0)` | Runs once on all workers; use `fetchOutputs` to surface errors |
| Paths needed on workers — parpool call is in your code | `parpool(..., AdditionalPaths=paths)` | Cleanest option when you can modify the parpool call; client path entries are inherited automatically |
| Paths needed on workers — pool opened elsewhere (can't modify call) | `parfevalOnAll(pool, @addpath, 0, p)` | Never delete and recreate a pool just to add paths — use parfevalOnAll on the existing pool |
| Environment variables on workers — parpool call is in your code | `parpool(..., EnvironmentVariables=vars)` | Forwards named env vars from client to workers at startup; set values with setenv on the client before this call |
| Environment variables on workers — pool opened elsewhere (can't modify call) | `parfevalOnAll(pool, @setenv, 0, k, v)` | Never delete and recreate a pool just to set env vars — use parfevalOnAll on the existing pool |
**Thread pool notes:** `parallel.pool.Constant` and `parfevalOnAll` also work on
thread pools. However, thread workers share the client's process, so path and
environment variable changes on the client are visible to threads automatically —
`AdditionalPaths` and `EnvironmentVariables` do not apply. To modify a thread
worker's environment, alter the client environment before the parfor.
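
The path and environment-variable rows of the table can be sketched together as follows. This is a minimal example under stated assumptions: the folder `lib` and the variable `MY_API_TOKEN` are hypothetical, the pool is a process pool, and the profile name `"Processes"` assumes a recent MATLAB release.

```matlab
% Hypothetical folder (assumed to exist) and variable name, for illustration.
libFolder = fullfile(pwd, "lib");
setenv("MY_API_TOKEN", "example-value");   % set on the client first

pool = gcp("nocreate");
if isempty(pool)
    % No pool yet: pass paths and environment variables at creation time.
    pool = parpool("Processes", ...
        AdditionalPaths=libFolder, ...
        EnvironmentVariables="MY_API_TOKEN");
else
    % Pool was opened elsewhere: configure the existing workers in place.
    fPath = parfevalOnAll(pool, @addpath, 0, libFolder);
    fEnv  = parfevalOnAll(pool, @setenv, 0, ...
        "MY_API_TOKEN", getenv("MY_API_TOKEN"));
    fetchOutputs(fPath);   % rethrows any worker-side error
    fetchOutputs(fEnv);
end
```

Either branch leaves the existing pool running, which is the point of the "never delete and recreate" guidance in the table.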
## The spmd Anti-Pattern
This applies equally to process pools and thread pools — prefer
`parallel.pool.Constant` over `spmd` for worker state setup regardless of pool
type.
### What it looks like
```matlab
% ANTI-PATTERN: Do not do this
spmd
    loadlibrary("mylib", "mylib.h");
end
parfor i = 1:100
    result(i) = calllib("mylib", "compute", data(i));
end
spmd
    unloadlibrary("mylib");
end
```
### Why it's fragile
1. **No cleanup guarantee** — if the parfor errors, the second spmd never runs
2. **Makes the pool fragile to worker disconnection** — once a pool has run an
`spmd` block, a single worker losing connection tears down the entire pool.
Pools that never use `spmd` continue operating with fewer workers if one
disconnects (no replacement occurs — the pool simply shrinks).
3. **Composite variables cannot be used inside `parfor`** — values created in
`spmd` are Composite objects that require indexing on the client; they
cannot be referenced directly from within a `parfor` body
### How to modernise
Replace with `parallel.pool.Constant`:
```matlab
C = parallel.pool.Constant( ...
    @() loadAndReturn(), ...
    @(~) unloadlibrary("mylib")); %#ok<NASGU> kept alive for cleanup
result = zeros(1, 100);
parfor i = 1:100
    result(i) = calllib("mylib", "compute", data(i));
end
% Cleanup happens automatically when C goes out of scope
```
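
The snippet above calls `loadAndReturn` without defining it. One possible shape for that helper is sketched below. It is a hypothetical function, not part of the original skill, and it assumes the library and header names (`mylib`, `mylib.h`) used in the earlier examples.

```matlab
function isLoaded = loadAndReturn()
% Hypothetical build function for parallel.pool.Constant.
% Loads the library on the calling worker (if not already loaded) and
% returns a logical flag that becomes the Constant's Value.
    if ~libisloaded("mylib")
        loadlibrary("mylib", "mylib.h");
    end
    isLoaded = libisloaded("mylib");
end
```

The returned flag is only a placeholder value; the cleanup function in the snippet above ignores it (`@(~)`) and unloads the library by name.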
Or with `parfevalOnAll` for simple side effects:
```matlab
f = parfevalOnAll(pool, @setupWorker, 0);
fetchOutputs(f);
result = zeros(1, 100);
parfor i = 1:100
    result(i) = doWork(data(i));
end
% No automatic cleanup
```
Prefer `parallel.pool.Constant` when cleanup is needed. Use `parfevalOnAll`
only for operations where cleanup is not required.
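
A small, self-contained case where cleanup matters is a per-worker file handle. The sketch below follows the build-function and cleanup-function form described above. Each worker writes to its own temporary file, so the file names and the log text are illustrative only, and a parallel pool is assumed to be available.

```matlab
% Each worker opens its own temporary file; fclose is registered as the
% cleanup function and runs on the workers when the Constant is cleared.
c = parallel.pool.Constant(@() fopen(tempname, "wt"), @fclose);

parfor i = 1:20
    % c.Value is the file identifier that belongs to the current worker.
    fprintf(c.Value, "iteration %d\n", i);
end

clear c   % triggers the cleanup function (fclose) on every worker
```

With `parfevalOnAll` alone, the matching `fclose` would need a second explicit call, and it would be skipped if the loop errored first.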
## Patterns
### Pattern 1: Loading data from a file for use in parfor
**Always use the build-function form when the source is a file.** This avoids
loading the data into client memory entirely — each worker loads directly from
disk. This is critical for large files but is also the correct default for any
file-based data because it scales without code changes as file size grows.
```matlab
c = parallel.pool.Constant(@() load("costSurface.mat").costSurface);
results = zeros(1, 10000);
parfor i = 1:10000
    results(i) = processWithLookup(input(i), c.Value);
end
```
Each worker calls `load()` once. The client never holds the full dataset.
**Do NOT do this** — loading on the client then wrapping defeats the purpose:
```matlab
% WRONG: loads entire file into client memory, then copies to each worker
data = load("costSurface.mat")
```