ClaudeWave
Skill · 649 repo stars · updated today

matlab-deploy-embedded-ai

This skill deploys AI models to embedded hardware using MATLAB R2026a and later, covering model creation, import from PyTorch/ONNX/TensorFlow/LiteRT, verification, compression through quantization and pruning, system simulation in Simulink, and C or CUDA code generation for resource-constrained targets like ARM Cortex processors and DSPs. Use it when integrating trained neural networks into embedded systems that require code generation and hardware deployment.

Install in Claude Code
git clone --depth 1 https://github.com/matlab/matlab-agentic-toolkit /tmp/matlab-deploy-embedded-ai && cp -r /tmp/matlab-deploy-embedded-ai/skills-catalog/ai-and-statistics/matlab-deploy-embedded-ai ~/.claude/skills/matlab-deploy-embedded-ai
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Embedded AI for Engineered Systems

Deploy AI models to embedded hardware using MATLAB® and Simulink®. This skill is
written specifically for **MATLAB R2026a** and uses APIs, functions, and workflows
introduced in that release. It covers the complete lifecycle: model creation or
import, verification, compression, system-level simulation, and code generation
for resource-constrained targets.

Requires MATLAB R2026a or newer. Core toolboxes: Deep Learning Toolbox, Statistics
and Machine Learning Toolbox, MATLAB Coder, Embedded Coder, Simulink, and
Fixed-Point Designer. Workflow-specific support packages are checked during
Environment Discovery. The MATLAB and Simulink Agentic Toolkits must be available
so the agent can drive a live MATLAB and Simulink session through MCP tools.

## When to Use

- Deploying a trained neural network (MATLAB-native or imported) to embedded hardware
- Generating C or CUDA code from a deep learning model for ARM Cortex-M/A/R, DSP, x86, or GPU targets
- Importing PyTorch, ONNX, TensorFlow, or LiteRT models into MATLAB for embedded deployment
- Compressing AI models (quantization, pruning, projection) to fit resource-constrained hardware
- Integrating AI inference into a Simulink system model for closed-loop simulation before code generation
- Using `loadPyTorchExportedProgram`, `importNetworkFromPyTorch`, `importNetworkFromONNX`, `importNetworkFromTensorFlow`, `dlquantizer`, `exportNetworkToSimulink`, or Embedded Coder with AI models
- Choosing between MathWorks-native code generation and direct PyTorch/LiteRT code generation

## When NOT to Use

- Training a model purely for research with no deployment target — use Deep Learning Toolbox documentation directly
- Deploying to cloud/server endpoints (no embedded target) — use MATLAB Production Server or MATLAB Compiler SDK
- Working with classical ML models (decision trees, SVMs, ensembles) that aren't neural networks — use Statistics and Machine Learning Toolbox codegen workflows directly. Note: `fitcnet`/`fitrnet` neural network models ARE covered by this skill
- Generating code for non-AI Simulink models — use standard Embedded Coder workflows
- Converting between model formats without an embedded deployment goal (e.g., ONNX to MATLAB for desktop inference only)

## Workflow Pattern Selection

Determine the correct workflow pattern based on model origin and deployment target.

### Decision Tree

Primary discriminator for third-party (3P) models: **model size + hardware class**.

```
Q1: What is the deployment target?
 |
 +-- Cortex-M (M33, M4, M7) ---------------------> Q2
 +-- Cortex-A/R processor or DSP (C2000, etc.) ----> Q2
 +-- x86 processor or GPU (Jetson, CUDA) ----------> Q2
      |
      Q2: Where does the AI model come from?
       |
       +-- Train from scratch in MATLAB ------------> Pattern 1  (references/pattern1/workflow.md)
       +-- Pre-trained 3P model --------------------> Q3
            |
            Q3: Route by hardware class + model size
             |
             +-- Cortex-M: always Pattern 1 import
             |     (MathWorks compression, tight sim-codegen agreement)
             |
             +-- x86 / GPU: Pattern 2 if PyTorch or LiteRT
             |     Pattern 1 import if ONNX/TensorFlow (converting to PyTorch/LiteRT is recommended)
             |
             +-- Cortex-A/R or DSP:
                   +-- Small model (< 500 KB) ---------> Pattern 1 with import path
                   +-- Large model (> 1 MB):
                        +-- PyTorch / LiteRT -----------> Pattern 2
                        +-- ONNX / TensorFlow ----------> Pattern 1 import *
```

\* Convert to PyTorch® (.pt2) or LiteRT (.tflite) to use Pattern 2 instead.
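
The decision tree above can be sketched as a small routing function. This is an illustrative Python helper, not part of the skill or of any MathWorks API: the names `Target`, `select_pattern`, and the string labels are hypothetical, and the sketch assumes 1 MB = 1024 KB. The tree does not say how to route Cortex-A/R or DSP models between 500 KB and 1 MB, so the sketch reports that range as `"unspecified"` rather than guessing.

```python
from enum import Enum


class Target(Enum):
    """Hardware classes from Q1 of the decision tree (hypothetical names)."""
    CORTEX_M = "Cortex-M"
    CORTEX_AR_OR_DSP = "Cortex-A/R or DSP"
    X86_OR_GPU = "x86 or GPU"


# Formats Pattern 2 can generate code from directly (.pt2, .tflite).
DIRECT_FORMATS = {"pytorch", "litert"}
# Formats that go through the Pattern 1 import path.
IMPORT_ONLY_FORMATS = {"onnx", "tensorflow"}

SMALL_MAX_KB = 500    # "Small model (< 500 KB)"
LARGE_MIN_KB = 1024   # "Large model (> 1 MB)"; assumes 1 MB = 1024 KB


def select_pattern(target, origin, size_kb=None):
    """Return the workflow pattern for a target, model origin, and size.

    origin is "matlab" for a model trained in MATLAB, otherwise one of
    "pytorch", "litert", "onnx", or "tensorflow".
    """
    # Q2: models trained from scratch in MATLAB always use Pattern 1.
    if origin == "matlab":
        return "Pattern 1"
    if origin not in DIRECT_FORMATS | IMPORT_ONLY_FORMATS:
        raise ValueError(f"unknown model origin: {origin!r}")

    # Q3: Cortex-M always takes the Pattern 1 import path.
    if target is Target.CORTEX_M:
        return "Pattern 1 (import)"

    by_format = "Pattern 2" if origin in DIRECT_FORMATS else "Pattern 1 (import)"
    if target is Target.X86_OR_GPU:
        return by_format

    # Cortex-A/R or DSP: model size is the discriminator.
    if size_kb is None:
        raise ValueError("size_kb is required for Cortex-A/R and DSP targets")
    if size_kb < SMALL_MAX_KB:
        return "Pattern 1 (import)"
    if size_kb > LARGE_MIN_KB:
        return by_format
    return "unspecified"  # 500 KB to 1 MB is not covered by the tree
```

For example, `select_pattern(Target.CORTEX_AR_OR_DSP, "pytorch", 4096)` routes to Pattern 2, while the same model in ONNX format falls back to the Pattern 1 import path.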

### Pattern Summary

| Pattern | Model Origin | Target Hardware | Primary Toolchain |
|---------|-------------|-----------------|-------------------|
| **1** | MATLAB-native or third-party (3P) model imported as `dlnetwork` | ARM® Cortex®-M (M33, M4, M7), Cortex-A/R, DSP | Embedded Coder™ |
| **2** | PyTorch (.pt2) or LiteRT (.tflite) direct code generation | Cortex-A/R, DSP, x86, GPU | MATLAB Coder™ + PyTorch and LiteRT support package (SPKG) |

### Pattern 1 vs Pattern 2 Capability Comparison

| Capability | Pattern 1 (dlnetwork) | Pattern 2 (PyTorch/LiteRT direct) |
|-----------|----------------------|----------------------|
| C code generation | Yes | Yes |
| Weight inspection / modification | **Yes** | No |
| dlquantizer (INT8) | **Yes** | No |
| Projection (compressNetworkUsingProjection) | **Yes** | No |
| Pruning | **Yes** | No |
| Simulink integration | **Yes** (exportNetworkToSimulink) | **Yes** (PyTorch SPKG Simulink blocks) |
| Fixed-point codegen | **Yes** | No |
| Combined compression (77%+ flash savings) | **Yes** | No |
| Speed to first C code | Slower | **Faster** |
| Requires native rebuild for 3P models | Yes | No |
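
One way to use the comparison table is as a lookup: list the capabilities a project needs and see which patterns still qualify. The sketch below is a hypothetical Python helper, not part of the skill. The capability keys and the `patterns_supporting` name are invented for illustration, and the Yes/No values are copied from the table rows above.

```python
# Capability -> (Pattern 1 supports it, Pattern 2 supports it).
# Values are transcribed from the comparison table; keys are hypothetical.
CAPABILITIES = {
    "c_codegen": (True, True),
    "weight_inspection": (True, False),
    "dlquantizer_int8": (True, False),
    "projection": (True, False),
    "pruning": (True, False),
    "simulink_integration": (True, True),
    "fixed_point_codegen": (True, False),
}

PATTERN_NAMES = ("Pattern 1", "Pattern 2")


def patterns_supporting(required):
    """Return the patterns that support every capability in `required`."""
    unknown = sorted(set(required) - CAPABILITIES.keys())
    if unknown:
        raise KeyError(f"unknown capabilities: {unknown}")
    return [
        name
        for index, name in enumerate(PATTERN_NAMES)
        if all(CAPABILITIES[cap][index] for cap in required)
    ]
```

A request for only C code generation and Simulink integration leaves both patterns available. Adding any compression capability, such as `"pruning"`, narrows the result to Pattern 1.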

**Rule of thumb:** Choose Pattern 1 for small models (< 500 KB) on lean hardware
(Cortex-M, DSP) where you need MathWorks compression and tight simulation-codegen
agreement. Choose Pattern 2 for larger models (> 1 MB) on high-performance hardware
(x86, GPU, Cortex-A) where simulation speed is a priority and compression is done
externally in Python. For Cortex-A/R and DSP targets, model size is the primary
discriminator. Pattern 2 supports PyTorch (.pt2) and LiteRT (.tflite) formats.
Both patterns support Simulink integration.

**Stats/ML models (fitrnet/fitcnet):** These follow Pattern 1 but have their own
Simulink integration path. Use the **RegressionNeuralNetwork Predict** block (for
`fitrnet`) or **ClassificationNeuralNetwork Predict** block (for `fitcnet`) from
the Statistics and Machine Learning Toolbox library — do NOT use
`exportNetworkToSimulink` (which is for `dlnetwork` only). Configure simulation
programmatically with `Simulink.SimulationInput`.

## Common Start: Prerequisites

Regardless of pattern, **always** begin with these two prerequisite steps before
entering the pattern-specific phases (which start at Phase 1):

1. **Environment Discov