matlab-deploy-embedded-ai
This skill deploys AI models to embedded hardware using MATLAB R2026a and later, covering model creation, import from PyTorch/ONNX/TensorFlow/LiteRT, verification, compression through quantization and pruning, system simulation in Simulink, and C or CUDA code generation for resource-constrained targets like ARM Cortex processors and DSPs. Use it when integrating trained neural networks into embedded systems that require code generation and hardware deployment.
git clone --depth 1 https://github.com/matlab/matlab-agentic-toolkit /tmp/matlab-deploy-embedded-ai && cp -r /tmp/matlab-deploy-embedded-ai/skills-catalog/ai-and-statistics/matlab-deploy-embedded-ai ~/.claude/skills/matlab-deploy-embedded-ai
# Embedded AI for Engineered Systems
Deploy AI models to embedded hardware using MATLAB® and Simulink®. This skill is
written specifically for **MATLAB R2026a** and uses APIs, functions, and workflows
introduced in that release. It covers the complete lifecycle: model creation or
import, verification, compression, system-level simulation, and code generation
for resource-constrained targets.
Requires MATLAB R2026a or newer. Core toolboxes: Deep Learning Toolbox, Statistics
and Machine Learning Toolbox, MATLAB Coder, Embedded Coder, Simulink, and
Fixed-Point Designer. Workflow-specific support packages are checked during
Environment Discovery. The MATLAB and Simulink Agentic Toolkits must be available
so the agent can drive a live MATLAB and Simulink session through MCP tools.
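The requirements above can be checked by hand before starting. The following is a minimal sketch, not the skill's own Environment Discovery step: it compares the installed product list against the core toolboxes named above and tests the release. The variable names are illustrative, and `ver` reports installed products only, so a license check is left out.

```matlab
% Core toolboxes listed in the requirements paragraph.
required = ["Deep Learning Toolbox", ...
            "Statistics and Machine Learning Toolbox", ...
            "MATLAB Coder", "Embedded Coder", ...
            "Simulink", "Fixed-Point Designer"];

% ver returns one struct per installed product; collect the product names.
v = ver;
installed = string({v.Name});

% Products from the required list that are not installed.
missingProducts = required(~ismember(required, installed));

% True when the running release is R2026a or newer.
releaseOk = ~isMATLABReleaseOlderThan("R2026a");

if ~releaseOk
    fprintf("This release is older than R2026a.\n");
end
if isempty(missingProducts)
    fprintf("All core toolboxes are installed.\n");
else
    fprintf("Missing: %s\n", strjoin(missingProducts, ", "));
end
```

Workflow-specific support packages are not covered by this check.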
## When to Use
- Deploying a trained neural network (MATLAB-native or imported) to embedded hardware
- Generating C or CUDA code from a deep learning model for ARM Cortex-M/A/R, DSP, x86, or GPU targets
- Importing PyTorch, ONNX, TensorFlow, or LiteRT models into MATLAB for embedded deployment
- Compressing AI models (quantization, pruning, projection) to fit resource-constrained hardware
- Integrating AI inference into a Simulink system model for closed-loop simulation before code generation
- Using `loadPyTorchExportedProgram`, `importNetworkFromPyTorch`, `importNetworkFromONNX`, `importNetworkFromTensorFlow`, `dlquantizer`, `exportNetworkToSimulink`, or Embedded Coder with AI models
- Choosing between MathWorks-native code generation and direct PyTorch/LiteRT code generation
## When NOT to Use
- Training a model purely for research with no deployment target — use Deep Learning Toolbox documentation directly
- Deploying to cloud/server endpoints (no embedded target) — use MATLAB Production Server or MATLAB Compiler SDK
- Working with classical ML models (decision trees, SVMs, ensembles) that aren't neural networks — use Statistics and Machine Learning Toolbox codegen workflows directly. Note: `fitcnet`/`fitrnet` neural network models ARE covered by this skill
- Generating code for non-AI Simulink models — use standard Embedded Coder workflows
- Converting between model formats without an embedded deployment goal (e.g., ONNX to MATLAB for desktop inference only)
## Workflow Pattern Selection
Determine the correct workflow pattern based on model origin and deployment target.
### Decision Tree
Primary discriminator for third-party (3P) models: **model size + hardware class**.
```
Q1: What is the deployment target?
|
+-- Cortex-M (M33, M4, M7) ---------------------> Q2
+-- Cortex-A/R processor or DSP (C2000, etc.) ----> Q2
+-- x86 processor or GPU (Jetson, CUDA) ----------> Q2
|
Q2: Where does the AI model come from?
|
+-- Train from scratch in MATLAB ------------> Pattern 1 (references/pattern1/workflow.md)
+-- Pre-trained 3P model --------------------> Q3
|
Q3: Route by hardware class + model size
|
+-- Cortex-M: always Pattern 1 import
|       (MathWorks compression, tight sim-codegen agreement)
|
+-- x86 / GPU: Pattern 2 if PyTorch or LiteRT
|       Pattern 1 import if ONNX/TF (convert to PyTorch/LiteRT recommended)
|
+-- Cortex-A/R or DSP:
    +-- Small model (< 500 KB) ---------> Pattern 1 with import path
    +-- Large model (> 1 MB):
        +-- PyTorch / LiteRT -----------> Pattern 2
        +-- ONNX / TensorFlow ----------> Pattern 1 import *
```
\* Convert to PyTorch® (.pt2) or LiteRT (.tflite) to use Pattern 2 instead.
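The decision tree above can be read as a small routing function. The sketch below is illustrative only: `selectPattern` is a hypothetical helper, not part of any MathWorks product, and the string labels for targets and formats are assumptions. It also assumes 1 MB means 1024 KB. The tree gives no route for models between 500 KB and 1 MB, so the sketch reports that range as unspecified instead of guessing.

```matlab
% Example routes; each call follows one branch of the tree.
p1 = selectPattern("cortex-m", "pytorch", 2000);    % Cortex-M: always Pattern 1 import
p2 = selectPattern("gpu", "litert", 2000);          % x86/GPU with PyTorch or LiteRT
p3 = selectPattern("dsp", "onnx", 300);             % small model on DSP
p4 = selectPattern("cortex-a/r", "pytorch", 4096);  % large PyTorch model on Cortex-A/R
p5 = selectPattern("x86", "matlab", 100);           % trained from scratch in MATLAB
fprintf("%s\n", [p1 p2 p3 p4 p5]);

function pattern = selectPattern(target, origin, sizeKB)
% SELECTPATTERN Hypothetical helper mirroring questions Q1 to Q3.
%   target: "cortex-m" | "cortex-a/r" | "dsp" | "x86" | "gpu"
%   origin: "matlab" | "pytorch" | "litert" | "onnx" | "tensorflow"
%   sizeKB: model size in KB (used only for Cortex-A/R and DSP)
    target = string(target);
    origin = string(origin);
    isDirect = ismember(origin, ["pytorch", "litert"]);

    if origin == "matlab"
        pattern = "Pattern 1";                 % Q2: trained in MATLAB
    elseif target == "cortex-m"
        pattern = "Pattern 1 import";          % Q3: Cortex-M
    elseif ismember(target, ["x86", "gpu"])
        if isDirect
            pattern = "Pattern 2";
        else
            pattern = "Pattern 1 import";      % ONNX / TensorFlow
        end
    elseif ismember(target, ["cortex-a/r", "dsp"])
        if sizeKB < 500
            pattern = "Pattern 1 import";      % small model
        elseif sizeKB > 1024                   % assumption: 1 MB = 1024 KB
            if isDirect
                pattern = "Pattern 2";
            else
                pattern = "Pattern 1 import";  % convert to .pt2/.tflite for Pattern 2
            end
        else
            pattern = "Unspecified";           % 500 KB to 1 MB: not covered by the tree
        end
    else
        error("selectPattern:unknownTarget", "Unknown target: %s", target);
    end
end
```

Returning a separate "Unspecified" value keeps the gap in the size thresholds visible, so the caller has to make that call deliberately.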
### Pattern Summary
| Pattern | Model Origin | Target Hardware | Primary Toolchain |
|---------|-------------|-----------------|-------------------|
| **1** | MATLAB-native or 3P imported as dlnetwork | ARM® Cortex®-M (M33, M4, M7), Cortex-A/R, DSP | Embedded Coder™ |
| **2** | PyTorch (.pt2) or LiteRT (.tflite) direct code generation | Cortex-A/R, DSP, x86, GPU | MATLAB Coder™ + PyTorch & LiteRT SPKG |
### Pattern 1 vs Pattern 2 Capability Comparison
| Capability | Pattern 1 (dlnetwork) | Pattern 2 (PyTorch/LiteRT direct) |
|-----------|----------------------|----------------------|
| C code generation | Yes | Yes |
| Weight inspection / modification | **Yes** | No |
| dlquantizer (INT8) | **Yes** | No |
| Projection (compressNetworkUsingProjection) | **Yes** | No |
| Pruning | **Yes** | No |
| Simulink integration | **Yes** (exportNetworkToSimulink) | **Yes** (PyTorch SPKG Simulink blocks) |
| Fixed-point codegen | **Yes** | No |
| Combined compression (77%+ flash savings) | **Yes** | No |
| Speed to first C code | Slower | **Faster** |
| Requires native rebuild for 3P models | Yes | No |
**Rule of thumb:** Choose Pattern 1 for small models (< 500 KB) on lean hardware
(Cortex-M, DSP) where you need MathWorks compression and tight simulation-codegen
agreement. Choose Pattern 2 for larger models (> 1 MB) on high-performance hardware
(x86, GPU, Cortex-A) where simulation speed is a priority and compression is done
externally in Python. For Cortex-A/R and DSP targets, model size is the primary
discriminator. Pattern 2 supports PyTorch (.pt2) and LiteRT (.tflite) formats.
Both patterns support Simulink integration.
**Stats/ML models (fitrnet/fitcnet):** These follow Pattern 1 but have their own
Simulink integration path. Use the **RegressionNeuralNetwork Predict** block (for
`fitrnet`) or **ClassificationNeuralNetwork Predict** block (for `fitcnet`) from
the Statistics and Machine Learning Toolbox library — do NOT use
`exportNetworkToSimulink` (which is for `dlnetwork` only). Configure simulation
programmatically with `Simulink.SimulationInput`.
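As a rough illustration of the programmatic configuration mentioned above, the sketch below builds a `Simulink.SimulationInput` object and runs it only if the model exists. The model name `myFitrnetModel` is hypothetical and stands for a model that already contains a RegressionNeuralNetwork Predict block. The stop time is an arbitrary example value.

```matlab
% Hypothetical model name; replace with your own model.
modelName = 'myFitrnetModel';

% Collect simulation settings without modifying the model file itself.
in = Simulink.SimulationInput(modelName);
in = setModelParameter(in, 'StopTime', '10');   % arbitrary example stop time

% exist returns 4 when the name resolves to a Simulink model on the path.
if exist(modelName, 'file') == 4
    out = sim(in);   % run the simulation with the settings above
else
    fprintf('Model %s not found; simulation skipped.\n', modelName);
end
```

Keeping the settings on the `SimulationInput` object leaves the saved model unchanged, which makes it easier to repeat the same run after compression or retraining.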
## Common Start: Prerequisites
Regardless of pattern, **always** begin with these two prerequisite steps before
entering the pattern-specific phases (which start at Phase 1):
1. **Environment Discovery**