matlab-connect-databricks-jdbc
This skill establishes JDBC connections from MATLAB to Databricks clusters or SQL Warehouses using Database Toolbox and the MATLAB Interface for Databricks package. Use it when configuring Databricks authentication methods (PAT, OauthU2M, OauthM2M), selecting between Simba and OSS JDBC drivers, creating connections via databricks.JDBCConnection or StandaloneJDBCConnection classes, or optimizing write performance for data transfers to Databricks.
git clone --depth 1 https://github.com/matlab/matlab-agentic-toolkit /tmp/matlab-connect-databricks-jdbc && cp -r /tmp/matlab-connect-databricks-jdbc/skills-catalog/reporting-and-database-access/matlab-connect-databricks-jdbc ~/.claude/skills/matlab-connect-databricks-jdbc
SKILL.md
# Connect MATLAB to Databricks via JDBC

Use when establishing a JDBC connection from MATLAB to Databricks using Database Toolbox and the [MATLAB Interface for Databricks](https://www.mathworks.com/solutions/partners/databricks.html) package. This skill covers connection class selection, authentication configuration, driver setup, and connection optimization. Once connected, use standard Database Toolbox functions (`sqlread`, `fetch`, `sqlwrite`, `execute`) on the `j.Connection` object for data operations.

## Prerequisites

- [MATLAB Interface for Databricks](https://www.mathworks.com/solutions/partners/databricks.html) package installed and on the MATLAB path
- Database Toolbox installed

## When to Use

- Connecting MATLAB to a Databricks cluster via JDBC
- Connecting MATLAB to a Databricks SQL Warehouse via JDBC
- Configuring Databricks authentication (PAT, OauthU2M, OauthM2M)
- Setting up the Databricks JDBC driver (Simba or OSS)
- Creating a standalone JDBC connection without the full Databricks package
- Running MATLAB on a Databricks cluster and connecting back via JDBC
- Optimizing write performance for large data transfers to Databricks
- User mentions keywords: Databricks, JDBCConnection, SQL Warehouse, databricks.JDBCConnection, StandaloneJDBCConnection, Databricks JDBC, Databricks connect, Databricks cluster, Databricks authentication

## When NOT to Use

- Connecting via ODBC (use `databricks.ODBCConnection` directly)
- Using Databricks Connect (Python-based Spark API, not Database Toolbox)
- Using the Databricks REST APIs (Clusters, Jobs, DBFS, etc.)
- Using MLflow from MATLAB (separate module in the package)
- Executing SQL via the Statement Execution REST API (not JDBC)

## Critical Rules

### Connection

- **ALWAYS** prefer `databricks.JDBCConnection` or `StandaloneJDBCConnection` over manually constructing a JDBC URL with `database()`.
  Manual URL construction is not wrong, but the URL format is complex and error-prone, and may expose connection details in source code. The connection classes handle URL construction, driver classpath, and authentication automatically.
- **ALWAYS** use `StandaloneJDBCConnection` when the user does not have the MATLAB Interface for Databricks package installed. Never fall back to manual `database()` with JDBC URL construction.
- **ALWAYS** call `close(j)` or `close(j.Connection)` when the connection is no longer needed.
- **ALWAYS** verify a connection succeeded by checking that `j.Connection.Message` is empty. A non-empty message indicates a driver error.

### Authentication

- **ALWAYS** let the connection class handle authentication via the unified provider chain. The default method is OauthU2M. Do not hardcode tokens in source code.
- **NEVER** use the JDBC driver's built-in OauthU2M when running MATLAB on Databricks in a browser. The driver attempts to open a browser window, which fails. Use `useDriverAuth=false` instead.

### Drivers

- The **Simba driver** (non-OSS, v2.7.3 to <3.0.0) is the default and ships with the package. It works with MATLAB's default Java 8 environment.
- The **OSS driver** (v3.0.3+) requires Java 11 or greater. Set MATLAB's Java environment with `jenv` before using it. The OSS driver uses Arrow for faster large data transfers.
- `StandaloneJDBCConnection` supports the Simba driver only. Do not use `useDriverType="oss"` with standalone connections.
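
The connection rules above can be sketched as a small helper. This is a minimal illustration under stated assumptions, not the package's reference usage: `runDatabricksQuery` is a hypothetical function name, the error identifier is made up for the example, and the code assumes the MATLAB Interface for Databricks package is on the path with a workspace already configured. It also assumes that `authMethod` can be passed on its own and that `close(j)` is safe to call after a failed connection attempt. Only `databricks.JDBCConnection`, `authMethod`, `j.Connection`, `j.Connection.Message`, and `close(j)` come from this skill; `fetch` is the standard Database Toolbox function.

```matlab
function result = runDatabricksQuery(sqlQuery)
% runDatabricksQuery  Hypothetical helper: connect, verify, query, close.
% Requires a reachable Databricks workspace; not runnable offline.

    % Let the connection class build the URL and handle authentication.
    % OauthU2M is the documented default; it is spelled out here for clarity.
    j = databricks.JDBCConnection(authMethod="OauthU2M");

    % Close the connection when the function exits, including on error.
    % Assumption: close(j) is safe even if the connection attempt failed.
    closer = onCleanup(@() close(j));

    % A non-empty Message indicates a driver error.
    if ~isempty(j.Connection.Message)
        error("databricksSketch:connectionFailed", ...
            "Databricks JDBC connection failed: %s", j.Connection.Message);
    end

    % Standard Database Toolbox functions operate on j.Connection.
    result = fetch(j.Connection, sqlQuery);
end
```

A call such as `t = runDatabricksQuery("SELECT 1 AS ok");` would then return the query result as a table. When MATLAB itself runs on Databricks in a browser, the constructor call would add `useDriverAuth=false`, as the Authentication rule requires.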
## Function Reference

| Function / Class | Purpose | When to Use |
|-----------------|---------|-------------|
| `databricks.JDBCConnection` | Creates a JDBC connection with full package support | Default choice when the MATLAB Interface for Databricks package is installed |
| `StandaloneJDBCConnection` | Creates a JDBC connection with zero package dependencies | When embedding Databricks connectivity in a standalone codebase |
| `databricks.SQLWarehouse.connect()` | Connects to a SQL Warehouse by ID; returns a `database.jdbc.connection` directly | When targeting a SQL Warehouse instead of a cluster |
| `j.Connection` | The underlying `database.jdbc.connection` object from `JDBCConnection` or `StandaloneJDBCConnection` | Pass this to `sqlread`, `fetch`, `sqlwrite`, `execute`, etc. Equivalent to `conn` created using `database()` in Database Toolbox |
| `databricks.internal.isOnDatabricks()` | Returns `true` if MATLAB is running on a Databricks cluster | Use to branch connection logic for on-cluster vs off-cluster scenarios |
| `j.testConnection()` | Verifies the connection is working | After creating a connection to confirm success |
| `j.saveSource()` | Saves connection as a Database Toolbox data source | When using Database Explorer app for interactive exploration |
| `j.copyToken()` | Copies the auth token to clipboard | When Database Explorer prompts for credentials |
| `close(j)` | Closes the connection and releases resources | When done with the connection |

## Decision Framework

### Which class should I use?

| Scenario | Class | Why |
|----------|-------|-----|
| Full package installed, targeting a cluster | `databricks.JDBCConnection` | Handles auth, URL, driver classpath automatically |
| Targeting a specific compute endpoint by HTTP path | `databricks.JDBCConnection(httpPath="/sql/1.0/warehouses/abc")` | Overrides the default cluster routing |
| Full package installed, targeting a SQL Warehouse | `databricks.SQLWarehouse.connect()` | Builds connection from warehouse metadata via REST API |
| No package installed or standalone integration | `StandaloneJDBCConnection` | Zero dependencies on the Databricks package |
| Already have a `databricks.Cluster` object | `databricks.JDBCConnection(cluster=myCluster)` | Routes connection to a specific cluster object |
| MATLAB running on a Databricks cluster | `databricks.JDBCConnection(authMethod="OauthU2M", useDriverAuth=false)` | The driver's browser-based auth does not work when MATLAB itself runs in a browser |

### Which JDBC driver should I use?

| Scenario | Driver | Notes |
|----------|--------|-------|
| Java 8 (MATLAB default) | Simba (d
Import recorded driving sensor data (GPS, camera, lidar, actor tracks, lanes) into scenariobuilder.* objects (GPSData, CameraData, LidarData, ActorTrackData, Trajectory, laneData) and run preprocessing — synchronize, offset correction, crop, normalizeTimestamps, convertTimestamps. Also: compute actor tracks from lidar when no annotations exist, attach camera/lidar mounting + intrinsics, export to MAT/workspace/timetable/script. Use for raw driving dataset files (KITTI, nuScenes, Waymo, Pandaset, ROS/ROS2 bags, .mat, .csv, .mp4) or driving/vehicle/sensor logs that need wrapping. drivingLogAnalyzer (DLA) is OPT-IN ONLY — invoke only on explicit user request ('DLA', 'open in DLA', 'inspect/explore/analyze the recording') or reported sensor problem (sync drift, timestamp mismatch, overlay misalignment). NEVER auto-launch DLA after wrapping (Rule 0). For 'build scenario / export to RoadRunner / drivingScenario / OpenSCENARIO / Unreal / simulate', hand off to matlab-scenario-builder.
Generate driving scenes, scenarios, road surfaces, and 3D content from already-wrapped scenariobuilder.* sensor data (GPS, camera, lidar, actor tracks) using Scenario Builder for Automated Driving Toolbox. Use to BUILD, EXPORT, or AUGMENT a virtual scenario/scene/map: ego or actor trajectories, trajectory smoothing, OpenCRG road-surface extraction, 3D asset generation, static-object placement, point-cloud georeferencing + elevation, lane-based ego localization, sensor-fusion tracking, scenario-event extraction (cut-ins, hard brakes, near-misses, ADAS disengagements), or export to RoadRunner, drivingScenario, OpenDRIVE, OpenCRG, OpenSCENARIO, or Unreal Engine. Also: log-to-scenario, scenario harvesting, accident/near-miss reconstruction, SOTIF (ISO 21448) and ISO 26262 scenario coverage, USGS-aerial-lidar scene augmentation, traffic-sign placement from camera+lidar logs. NOT for raw-data import or multi-sensor sync/crop/offset/timestamp normalization — route those to matlab-driving-data-importer.
Build, modify, and diagram SimBiology models — API reference, helper functions, and layout patterns. Use when constructing or editing models programmatically or visually.