Skill267 estrellas del repoactualizado 18d ago
robot-perception
The robot-perception skill provides configuration, calibration, and processing pipelines for robot sensors including cameras, LiDAR, and depth sensors. It covers camera intrinsic and extrinsic calibration, RGB and point cloud processing, object detection and tracking, multi-sensor fusion and synchronization, model deployment on robot hardware, and debugging of perception failures like latency and frame drops.
Instalar en Claude Code
Copiargit clone --depth 1 https://github.com/arpitg1304/robotics-agent-skills /tmp/robot-perception && cp -r /tmp/robot-perception/skills/robot-perception ~/.claude/skills/robot-perceptionDespués abre una sesión nueva de Claude Code; el skill carga automáticamente.
Definición
SKILL.md
# Robot Perception Skill
## When to Use This Skill
- Setting up and configuring camera, LiDAR, or depth sensors
- Building RGB, depth, or point cloud processing pipelines
- Calibrating cameras (intrinsic, extrinsic, hand-eye)
- Implementing object detection, segmentation, or tracking for robots
- Fusing data from multiple sensor modalities
- Streaming sensor data with proper threading and buffering
- Synchronizing multi-sensor rigs
- Deploying perception models on robot hardware (GPU, edge)
- Debugging perception failures (latency, dropped frames, misalignment)
## Sensor Landscape
### Sensor Types and Characteristics
```
Sensor Type Output Range Rate Best For
─────────────────────────────────────────────────────────────────────────
RGB Camera (H,W,3) uint8 ∞ 30-120Hz Object detection, tracking, visual servoing
Stereo Camera (H,W,3)+(H,W,3) 0.3-20m 30-90Hz Dense depth from passive stereo
Structured Light (H,W) float + RGB 0.2-10m 30Hz Indoor manipulation, short range
ToF Depth (H,W) float + RGB 0.1-10m 30Hz Indoor, medium range
LiDAR (spinning) (N,3) or (N,4) 0.5-200m 10-20Hz Outdoor navigation, mapping
LiDAR (solid-st.) (N,3) 0.5-200m 10-30Hz Automotive, outdoor
IMU (6,) or (9,) N/A 200-1kHz Orientation, motion estimation
Force/Torque (6,) float N/A 1kHz+ Contact detection, force control
Tactile (H,W) or (N,3) Contact 30-100Hz Grasp quality, texture
Event Camera Events (x,y,t,p) ∞ μs High-speed tracking, HDR scenes
```
### Common Sensor Hardware
```
Device Type SDK/Driver ROS2 Package
──────────────────────────────────────────────────────────────────────────
Intel RealSense Structured Light pyrealsense2 realsense2_camera
Stereolabs ZED Stereo + IMU pyzed zed_wrapper
Luxonis OAK-D Stereo + Neural depthai depthai_ros
FLIR/Basler Industrial RGB PySpin/pypylon spinnaker_camera_driver
Velodyne Spinning LiDAR velodyne_driver velodyne
Ouster Spinning LiDAR ouster-sdk ros2_ouster
Livox Solid-state LiDAR livox_sdk livox_ros2_driver
USB Webcam RGB OpenCV VideoCapture usb_cam / v4l2_camera
```
## Camera Models and Calibration
### Pinhole Camera Model
```
3D World Point (X, Y, Z)
|
[R | t] — Extrinsic (world → camera)
|
Camera Point (Xc, Yc, Zc)
|
K — Intrinsic (camera → pixel)
|
Pixel (u, v)
K = [ fx 0 cx ] fx, fy = focal lengths (pixels)
[ 0 fy cy ] cx, cy = principal point
[ 0 0 1 ]
Projection: [u, v, 1]^T = K @ [R | t] @ [X, Y, Z, 1]^T
```
### Intrinsic Calibration
```python
import cv2
import numpy as np
from pathlib import Path
class IntrinsicCalibrator:
"""Camera intrinsic calibration using checkerboard pattern"""
def __init__(self, board_size=(9, 6), square_size_m=0.025):
self.board_size = board_size
self.square_size = square_size_m
# Prepare object points (3D coordinates of checkerboard corners)
self.objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
self.objp[:, :2] = np.mgrid[
0:board_size[0], 0:board_size[1]
].T.reshape(-1, 2) * square_size_m
def collect_calibration_images(self, camera, num_images=30,
min_coverage=0.6):
"""Collect calibration images with good spatial coverage.
IMPORTANT: Move the board to cover all regions of the image,
including corners and edges. Tilt the board at various angles.
Bad coverage = bad calibration, especially at image edges.
"""
obj_points = []
img_points = []
coverage_map = np.zeros((4, 4), dtype=int) # Track board positions
while len(obj_points) < num_images:
frame = camera.capture()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
found, corners = cv2.findChessboardCorners(
gray, self.board_size,
cv2.CALIB_CB_ADAPTIVE_THRESH |
cv2.CALIB_CB_NORMALIZE_IMAGE |
cv2.CALIB_CB_FAST_CHECK
)
if found:
# Sub-pixel refinement — critical for accuracy
criteria = (cv2.TERM_CRITERIA_EPS +
cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
corners = cv2.cornerSubPix(
gray, corners, (11, 11), (-1, -1), criteria)
# Track coverage
center = corners.mean(axis=0).flatten()
grid_x = int(center[0] / gray.shape[1] * 4)
grid_y = int(center[1] / gray.shape[0] * 4)
grid_x = min(grid_x, 3)
grid_y = min(grid_y, 3)
coverage_map[grid_y, grid_x] += 1
obj_points.append(self.objp)
img_points.append(corners)
coverage = (coverage_map > 0).sum() / coverage_map.size
if coverage < min_coverage:
print(f"WARNING: Only {coverage:.0%} coverage. "
f"Move board to uncovered regions.")
return obj_points, img_points, gray.shape[::-1]
def calibrate(self, obj_points, img_points, image_size):
"""Run calibration and return camera matrix + distortion coeffs"""
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
obj_points, img_points, image_size, None, None)
if ret > 1.0:
print(f"WARNING: High reprojection error ({ret:.3f} px). "
f"Check image quality and board dete