Merged
9 changes: 7 additions & 2 deletions docs/conversions/waymo/waymo.rst
@@ -34,7 +34,11 @@ The Waymo camera frame convention is:
**Note:** The converter transforms this to NCore's camera convention (principal
axis +z, x-axis right, y-axis down).
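The note above can be made concrete with a small sketch. Assuming Waymo's documented camera frame (+x forward through the lens, +y left, +z up), the change of basis to NCore's convention (+x right, +y down, +z along the principal axis) is a fixed rotation. The matrix below illustrates that mapping; it is not the converter's actual code.

```python
import numpy as np

# Rows express NCore camera axes (x right, y down, z forward) in terms of
# Waymo camera axes (assumed: x forward through the lens, y left, z up).
WAYMO_CAM_TO_NCORE_CAM = np.array([
    [0.0, -1.0, 0.0],   # NCore x (right)   = -Waymo y (left)
    [0.0, 0.0, -1.0],   # NCore y (down)    = -Waymo z (up)
    [1.0, 0.0, 0.0],    # NCore z (forward) =  Waymo x (forward)
])

def waymo_cam_to_ncore_cam(points: np.ndarray) -> np.ndarray:
    """Re-express camera-frame points in NCore's camera convention."""
    return points @ WAYMO_CAM_TO_NCORE_CAM.T
```

The matrix is a proper rotation (determinant +1), so it preserves handedness.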

Each camera provides panoptic segmentation data with 29 semantic classes.
Each camera provides panoptic segmentation data with 29 semantic classes,
stored via :class:`~ncore.data.v4.CameraLabelsComponent` as ``IMAGE_ENCODED``
PNG labels (label type: ``SEGMENTATION / "panoptic"``). The per-label metadata
includes ``panoptic_label_divisor`` for decoding semantic class and instance IDs
from the combined panoptic map.
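Decoding follows the usual panoptic convention, where the combined map stores ``semantic * divisor + instance``. A minimal sketch (the function name is hypothetical; the divisor comes from the per-label metadata described above):

```python
import numpy as np

def decode_panoptic(panoptic: np.ndarray, panoptic_label_divisor: int):
    """Split a combined panoptic map into (semantic class, instance ID) maps.

    Assumes the common encoding: panoptic = semantic * divisor + instance.
    """
    semantic = panoptic // panoptic_label_divisor
    instance = panoptic % panoptic_label_divisor
    return semantic, instance
```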

LiDAR Sensors
^^^^^^^^^^^^^
Expand All @@ -56,7 +60,7 @@ Conversion
The converter uses NCore V4's component-based architecture. Each sequence is
parsed from ``.tfrecord`` files and written to NCore format via
:class:`~ncore.data.v4.SequenceComponentGroupsWriter` with specialized component
writers for poses, intrinsics, lidar, cameras, masks, and 3D labels.
writers for poses, intrinsics, lidar, cameras, camera labels, masks, and 3D labels.

Usage
^^^^^
@@ -167,6 +171,7 @@ API Reference
- :class:`~ncore.data.v4.IntrinsicsComponent` - Camera and lidar intrinsics
- :class:`~ncore.data.v4.LidarSensorComponent` - Lidar frame data
- :class:`~ncore.data.v4.CameraSensorComponent` - Camera frame data
- :class:`~ncore.data.v4.CameraLabelsComponent` - Per-camera image labels
- :class:`~ncore.data.v4.CuboidsComponent` - 3D cuboid track observations
- :class:`~ncore.data.v4.MasksComponent` - Camera masks

84 changes: 84 additions & 0 deletions docs/data/formats.rst
@@ -57,6 +57,8 @@ types:
annotations
* :class:`~ncore.data.v4.PointCloudsComponent` - Pre-computed point clouds with
optional typed per-point attributes
* :class:`~ncore.data.v4.CameraLabelsComponent` - Per-camera image-aligned
labels (depth, flow, segmentation, masks, normals, features)

The component architecture is extensible, allowing custom component types to be
defined for application-specific data.
@@ -317,6 +319,88 @@ Lidar and radar point clouds can also be accessed through the unified
:class:`~ncore.data.PointCloudsSourceProtocol` via the
:class:`~ncore.data.RayBundleSensorPointCloudsSourceAdapter`.

Camera Labels Component
~~~~~~~~~~~~~~~~~~~~~~~

Per-camera image-aligned labels (depth maps, optical flow, segmentation, masks,
surface normals, material properties, feature embeddings) are stored as
independently-timestamped label instances. Each instance stores labels of
**one type** for **one camera**, enabling sparse coverage and multiple label
sources per camera.

.. code-block:: text

camera_labels/
└── {instance_name}/ (e.g., "depth.z@front_50fov")
├── timestamps_us [N] uint64 (sorted label timestamps)
└── labels/
├── {descriptor}
│ ├── camera_id: str (associated camera identifier)
│ ├── label_type: { (tagged-union type descriptor)
│ │ "category": str, ("DEPTH", "FLOW", ...)
│ │ "qualifier": str, ("z", "optical_forward", ...)
│ │ "unit": str | null ("METERS", "PIXELS", ...)
│ │ }
│ ├── label_source: str ("GT_ANNOTATION", "EXTERNAL", ...)
│ ├── label_schema: { (storage format descriptor)
│ │ "dtype": str, (e.g., "float32", "uint8")
│ │ "shape_suffix": [int, ...], (trailing dims after [H, W])
│ │ "encoding": str, ("RAW" | "IMAGE_ENCODED")
│ │ "encoded_format": str | null, ("png", "jpeg", null)
│ │ "quantization": {...} | null (optional dequant params)
│ │ }
│   └── generic_meta_data: {...} (metadata common to all labels)
└── {timestamp_us}/ (keyed by camera end-of-frame timestamp)
├── data [H, W, ...] or |Sx (label array or encoded bytes)
└── {attrs}
├── generic_meta_data: {...} (per-label metadata)
└── format: str (IMAGE_ENCODED only)

**Label Type System:**

Labels use a *tagged-union* type consisting of a high-level
:class:`~ncore.data.LabelCategory` enum and a free-form qualifier string.
Well-known types are provided as constants (e.g., ``LabelType.DEPTH_Z_M``,
``LabelType.SEGMENTATION_SEMANTIC``), while project-specific labels use custom
qualifiers without any code changes.
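The tagged-union idea can be sketched in plain Python. This is a simplified stand-in, not the actual ``ncore.data`` definitions (the real ``LabelCategory`` and ``LabelType`` live in the library):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class LabelCategory(Enum):
    DEPTH = "DEPTH"
    FLOW = "FLOW"
    SEGMENTATION = "SEGMENTATION"
    MASK = "MASK"
    GEOMETRY = "GEOMETRY"
    MATERIAL = "MATERIAL"
    FEATURE = "FEATURE"
    OTHER = "OTHER"

@dataclass(frozen=True)
class LabelType:
    """Tagged union: a closed category plus a free-form qualifier string."""
    category: LabelCategory
    qualifier: str
    unit: Optional[str] = None

# A well-known constant next to a project-specific custom type; no code
# changes are needed to introduce the latter.
DEPTH_Z_M = LabelType(LabelCategory.DEPTH, "z", "METERS")
custom = LabelType(LabelCategory.FEATURE, "my_backbone_v2")
```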

Supported categories:

* ``DEPTH`` -- Per-pixel distance measures (``"z"``, ``"ray"``, ``"relative"``, ...)
* ``FLOW`` -- Motion displacement fields (``"optical_forward"``, ``"scene_backward"``, ...)
* ``SEGMENTATION`` -- Per-pixel classification (``"semantic"``, ``"instance"``, ``"logits"``)
* ``MASK`` -- Binary or multi-level masks (``"background"``, ``"dynamic"``, ``"ego"``, ...)
* ``GEOMETRY`` -- Per-pixel geometric vectors (``"normal_camera"``, ``"ray_direction"``, ...)
* ``MATERIAL`` -- Surface material properties (``"albedo"``, ``"roughness"``, ...)
* ``FEATURE`` -- Per-pixel feature embeddings (``"dinov2"``, ``"clip"``, ...)
* ``OTHER`` -- Catch-all for uncategorized labels

**Encoding:**

* ``RAW`` -- Numpy array stored as a zarr dataset with regular compression.
  Shape is ``[H, W] + shape_suffix`` (e.g., ``[H, W, 2]`` for optical flow).
  Optional transparent quantization of raw labels is supported
  (e.g., float32 depth quantized to uint16 with scale/offset).
* ``IMAGE_ENCODED`` -- Pre-encoded image bytes (PNG, JPEG) stored as a 1-D
zarr uint8 dataset with no compression. Consumers can call ``get_encoded_data()`` for raw
bytes (GPU-based decoding) or ``get_data()`` for Pillow-decoded numpy arrays.
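The optional quantization amounts to a scale/offset round-trip, sketched below (the actual ``QuantizationParams`` field names may differ):

```python
import numpy as np

def quantize(values: np.ndarray, scale: float, offset: float) -> np.ndarray:
    """Quantize float values to uint16 via q = round((v - offset) / scale)."""
    q = np.round((values - offset) / scale)
    return np.clip(q, 0, np.iinfo(np.uint16).max).astype(np.uint16)

def dequantize(q: np.ndarray, scale: float, offset: float) -> np.ndarray:
    """Invert quantization: v ~ q * scale + offset (error <= scale / 2)."""
    return q.astype(np.float32) * scale + offset

# Example: depth in [0, 120] m stored with roughly 2 mm resolution.
depth = np.array([0.0, 1.5, 119.99], dtype=np.float32)
scale = 120.0 / np.iinfo(np.uint16).max
restored = dequantize(quantize(depth, scale, 0.0), scale, 0.0)
```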

**Instance naming convention:**

Instance names are opaque identifiers. The recommended convention is
``category.qualifier@camera_id`` (e.g., ``depth.z@front_50fov``). The
component does *not* parse or validate instance names.

**Compat layer access:**

Labels are accessed through :class:`~ncore.data.CameraLabelsProtocol` via
:meth:`~ncore.data.SequenceLoaderProtocol.get_camera_labels` (by ID) or
:meth:`~ncore.data.SequenceLoaderProtocol.query_camera_labels` (by camera
and optional type/category filter).
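A toy stand-in illustrates the two access paths (classes here are hypothetical; the real interfaces are ``CameraLabelsProtocol`` and ``SequenceLoaderProtocol``):

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-ins for ID-based vs. filtered label access.
@dataclass
class FakeCameraLabels:
    instance_id: str
    camera_id: str
    category: str  # stands in for LabelCategory

@dataclass
class FakeSequenceLoader:
    instances: List[FakeCameraLabels]

    def get_camera_labels(self, camera_label_id: str) -> FakeCameraLabels:
        # Direct lookup by opaque instance ID.
        return next(i for i in self.instances if i.instance_id == camera_label_id)

    def query_camera_labels(
        self, camera_id: str, label_category: Optional[str] = None
    ) -> List[FakeCameraLabels]:
        # Filter by camera, then optionally by category.
        hits = [i for i in self.instances if i.camera_id == camera_id]
        if label_category is not None:
            hits = [i for i in hits if i.category == label_category]
        return hits

loader = FakeSequenceLoader([
    FakeCameraLabels("depth.z@front", "front", "DEPTH"),
    FakeCameraLabels("seg.semantic@front", "front", "SEGMENTATION"),
    FakeCameraLabels("depth.z@rear", "rear", "DEPTH"),
])
```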

Component Groups
~~~~~~~~~~~~~~~~

36 changes: 36 additions & 0 deletions docs/tools/ncore_vis.rst
@@ -180,6 +180,42 @@ from all cameras in the sequence and shown in a shared dropdown. The opacity
of the tint is adjustable. Masks are boolean images stored per-sensor and
are not per-frame.

Camera Labels Overlay
^^^^^^^^^^^^^^^^^^^^^

When a sequence contains :class:`~ncore.data.v4.CameraLabelsComponent` data,
a **Camera Labels** overlay section appears in the Cameras tab. This allows
visualizing per-frame labels (depth maps, segmentation masks, surface normals,
etc.) on camera images.

Global controls:

- **Show Labels**: master toggle to enable camera label visualization
(disabled by default)
- **Matching**: timestamp matching strategy for sparse labels

- *Closest*: always picks the nearest available label timestamp (default)
- *Exact*: only shows a label if one exists at the camera frame's exact
timestamp; otherwise shows the RGB image

- **Opacity**: blend factor between RGB image and label visualization
(0.0 = full RGB, 1.0 = full label)
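The two matching modes can be sketched as follows (a hypothetical helper, not the viewer's actual code; assumes sorted label timestamps):

```python
from typing import Optional, Sequence

import numpy as np

def match_label_timestamp(
    label_ts_us: Sequence[int], query_ts_us: int, mode: str = "closest"
) -> Optional[int]:
    """Pick a label timestamp for a camera frame, or None if nothing matches."""
    ts = np.asarray(label_ts_us, dtype=np.int64)
    if mode == "exact":
        hits = np.nonzero(ts == query_ts_us)[0]
        return int(ts[hits[0]]) if hits.size else None
    # "closest": binary-search the sorted timestamps, compare both neighbors.
    i = np.searchsorted(ts, query_ts_us)
    candidates = ts[max(i - 1, 0): i + 1]
    return int(candidates[np.argmin(np.abs(candidates - query_ts_us))])
```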

Per-camera controls:

- **Label**: dropdown showing only labels associated with this camera,
populated from :meth:`~ncore.data.SequenceLoaderProtocol.query_camera_labels`.
Defaults to the first available label so that toggling "Show Labels" on
immediately renders something.

Visualization adapts to the label category:

- **DEPTH** -- TURBO colormap with percentile-based normalization
- **SEGMENTATION** -- 20-color class palette
- **MASK** -- green tint overlay (3-channel masks rendered as RGB)
- **GEOMETRY** -- normal vectors mapped to RGB ([-1,1] → [0,255])
- **Other categories** -- grayscale normalization
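The GEOMETRY mapping, for example, is the standard normal-map transform (a sketch, not the viewer's implementation):

```python
import numpy as np

def normals_to_rgb(normals: np.ndarray) -> np.ndarray:
    """Map per-pixel vectors in [-1, 1] to uint8 RGB via v -> (v + 1) / 2 * 255."""
    rgb = (normals * 0.5 + 0.5) * 255.0
    return np.clip(np.round(rgb), 0, 255).astype(np.uint8)
```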

Lidar Point Clouds
^^^^^^^^^^^^^^^^^^

16 changes: 16 additions & 0 deletions ncore/data/__init__.py
@@ -16,6 +16,7 @@
"""Package exposing methods related to NCore's basic data types and abstract APIs"""

from ncore.impl.data.compat import (
CameraLabelsProtocol,
CameraSensorProtocol,
LidarSensorProtocol,
PointCloudsSourceProtocol,
@@ -27,6 +28,7 @@
from ncore.impl.data.types import (
BBox3,
BivariateWindshieldModelParameters,
CameraLabelDescriptor,
ConcreteCameraModelParametersUnion,
ConcreteExternalDistortionParametersUnion,
ConcreteLidarModelParametersUnion,
@@ -35,10 +37,16 @@
EncodedImageHandle,
FrameTimepoint,
FThetaCameraModelParameters,
LabelCategory,
LabelEncoding,
LabelSchema,
LabelSource,
LabelType,
LabelUnit,
OpenCVFisheyeCameraModelParameters,
OpenCVPinholeCameraModelParameters,
PointCloud,
QuantizationParams,
ReferencePolynomial,
RowOffsetStructuredSpinningLidarModelParameters,
ShutterType,
@@ -64,6 +72,13 @@
"ConcreteExternalDistortionParametersUnion",
"ConcreteLidarModelParametersUnion",
"PointCloud",
"LabelCategory",
"LabelUnit",
"LabelEncoding",
"LabelType",
"QuantizationParams",
"LabelSchema",
"CameraLabelDescriptor",
# compat protocols
"SequenceLoaderProtocol",
"SensorProtocol",
@@ -72,4 +87,5 @@
"RadarSensorProtocol",
"PointCloudsSourceProtocol",
"RayBundleSensorPointCloudsSourceAdapter",
"CameraLabelsProtocol",
]
2 changes: 2 additions & 0 deletions ncore/data/v4/__init__.py
@@ -17,6 +17,7 @@

from ncore.impl.data.v4.compat import SequenceLoaderV4
from ncore.impl.data.v4.components import (
CameraLabelsComponent,
CameraSensorComponent,
ComponentReader,
ComponentWriter,
@@ -46,6 +47,7 @@
"RadarSensorComponent",
"CuboidsComponent",
"PointCloudsComponent",
"CameraLabelsComponent",
# compat APIs
"SequenceLoaderV4",
]
91 changes: 90 additions & 1 deletion ncore/impl/data/compat.py
@@ -62,12 +62,15 @@

from ncore.impl.common.transformations import HalfClosedInterval, PoseGraphInterpolator
from ncore.impl.data.types import (
CameraLabelDescriptor,
ConcreteCameraModelParametersUnion,
ConcreteLidarModelParametersUnion,
CuboidTrackObservation,
EncodedImageData,
FrameTimepoint,
JsonLike,
LabelCategory,
LabelType,
PointCloud,
)
from ncore.impl.data.util import closest_index_sorted
@@ -166,6 +169,34 @@ def get_point_clouds_source(self, source_id: str, *, return_index: int = 0) -> P
"""
...

@property
def camera_labels_ids(self) -> List[str]:
"""List of all camera label instance IDs."""
...

def get_camera_labels(self, camera_label_id: str) -> CameraLabelsProtocol:
"""Get a camera label instance by instance ID."""
...

def query_camera_labels(
self,
camera_id: str,
label_type: Optional[LabelType] = None,
label_category: Optional[LabelCategory] = None,
) -> List[CameraLabelsProtocol]:
"""Query camera label instances matching filters.

Parameters
----------
camera_id
Camera ID to match.
label_type
If provided, only return sources with this exact label type.
label_category
If provided, only return sources whose label type category matches.
"""
...


@runtime_checkable
class SensorProtocol(Protocol):
@@ -611,7 +642,7 @@ def get_pc_generic_data(self, pc_index: int, name: str) -> npt.NDArray:
...

def get_pc_generic_meta_data(self, pc_index: int) -> Dict[str, JsonLike]:
"""Return generic JSON metadata associated with the given point cloud."""
"""Returns generic point cloud meta-data for a specific point-cloud."""
...

def get_pc_index_range(
@@ -816,3 +847,61 @@ def get_pc_index_range(
step: Optional[int] = None,
) -> range:
return range(*slice(start, stop, step).indices(self.pcs_count))


@runtime_checkable
class CameraLabelsProtocol(Protocol):
"""Protocol for accessing camera-associated image labels.

Each instance provides labels of one type for one camera, with independently-managed timestamps.
"""

@property
def label_descriptor(self) -> CameraLabelDescriptor:
"""Descriptor of this label instance."""
...

@property
def labels_count(self) -> int:
"""Number of stored labels."""
...

@property
def label_timestamps_us(self) -> npt.NDArray[np.uint64]:
"""Timestamps of all stored labels, sorted ascending."""
...

@property
def labels_generic_meta_data(self) -> Dict[str, JsonLike]:
"""Generic metadata associated with all labels."""
...

@runtime_checkable
class CameraLabelHandleProtocol(Protocol):
"""Protocol for a single camera label at a specific timestamp.

Returned by :meth:`CameraLabelsProtocol.get_label` and provides deferred
access to the label data and metadata.
"""

@property
def timestamp_us(self) -> int:
"""Timestamp of this label in microseconds (usually associated with the camera end-of-frame timestamp)."""
...

@property
def generic_meta_data(self) -> Dict[str, JsonLike]:
"""Per-label generic metadata."""
...

def get_data(self) -> npt.NDArray[Any]:
"""Load and return the label data as a numpy array."""
...

def get_encoded_data(self) -> Optional[bytes]:
"""Return raw encoded bytes for IMAGE_ENCODED labels, or None for RAW labels."""
...

def get_label(self, timestamp_us: int) -> CameraLabelHandleProtocol:
"""Return a lazy handle to the label data at the given timestamp."""
...