Merged
9 changes: 7 additions & 2 deletions docs/conversions/waymo/waymo.rst
@@ -34,7 +34,11 @@ The Waymo camera frame convention is:
**Note:** The converter transforms this to NCore's camera convention (principal
axis +z, x-axis right, y-axis down).
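The note above can be made concrete with a small sketch. Assuming Waymo's documented camera frame (+x forward through the lens, +y left, +z up), the change of basis to NCore's convention (+x right, +y down, +z along the principal axis) is a fixed rotation. The matrix below illustrates that mapping; it is not the converter's actual code.

```python
import numpy as np

# Rows express NCore camera axes (x right, y down, z forward) in terms of
# Waymo camera axes (assumed: x forward through the lens, y left, z up).
WAYMO_CAM_TO_NCORE_CAM = np.array([
    [0.0, -1.0, 0.0],   # NCore x (right)   = -Waymo y (left)
    [0.0, 0.0, -1.0],   # NCore y (down)    = -Waymo z (up)
    [1.0, 0.0, 0.0],    # NCore z (forward) =  Waymo x (forward)
])

def waymo_cam_to_ncore_cam(points: np.ndarray) -> np.ndarray:
    """Re-express camera-frame points in NCore's camera convention."""
    return points @ WAYMO_CAM_TO_NCORE_CAM.T
```

The matrix is a proper rotation (determinant +1), so it preserves handedness.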

Each camera provides panoptic segmentation data with 29 semantic classes.
Each camera provides panoptic segmentation data with 29 semantic classes,
stored via :class:`~ncore.data.v4.CameraLabelsComponent` as ``IMAGE_ENCODED``
PNG labels (label type: ``SEGMENTATION / "panoptic"``). The per-label metadata
includes ``panoptic_label_divisor`` for decoding semantic class and instance IDs
from the combined panoptic map.
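Decoding follows the usual panoptic convention, where the combined map stores ``semantic * divisor + instance``. A minimal sketch (the function name is hypothetical; the divisor comes from the per-label metadata described above):

```python
import numpy as np

def decode_panoptic(panoptic: np.ndarray, panoptic_label_divisor: int):
    """Split a combined panoptic map into (semantic class, instance ID) maps.

    Assumes the common encoding: panoptic = semantic * divisor + instance.
    """
    semantic = panoptic // panoptic_label_divisor
    instance = panoptic % panoptic_label_divisor
    return semantic, instance
```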

LiDAR Sensors
^^^^^^^^^^^^^
Expand All @@ -56,7 +60,7 @@ Conversion
The converter uses NCore V4's component-based architecture. Each sequence is
parsed from ``.tfrecord`` files and written to NCore format via
:class:`~ncore.data.v4.SequenceComponentGroupsWriter` with specialized component
writers for poses, intrinsics, lidar, cameras, masks, and 3D labels.
writers for poses, intrinsics, lidar, cameras, camera labels, masks, and 3D labels.

Usage
^^^^^
@@ -167,6 +171,7 @@ API Reference
- :class:`~ncore.data.v4.IntrinsicsComponent` - Camera and lidar intrinsics
- :class:`~ncore.data.v4.LidarSensorComponent` - Lidar frame data
- :class:`~ncore.data.v4.CameraSensorComponent` - Camera frame data
- :class:`~ncore.data.v4.CameraLabelsComponent` - Per-camera image labels
- :class:`~ncore.data.v4.CuboidsComponent` - 3D cuboid track observations
- :class:`~ncore.data.v4.MasksComponent` - Camera masks

84 changes: 84 additions & 0 deletions docs/data/formats.rst
@@ -57,6 +57,8 @@ types:
annotations
* :class:`~ncore.data.v4.PointCloudsComponent` - Pre-computed point clouds with
optional typed per-point attributes
* :class:`~ncore.data.v4.CameraLabelsComponent` - Per-camera image-aligned
labels (depth, flow, segmentation, masks, normals, features)

The component architecture is extensible, allowing custom component types to be
defined for application-specific data.
@@ -317,6 +319,88 @@ Lidar and radar point clouds can also be accessed through the unified
:class:`~ncore.data.PointCloudsSourceProtocol` via the
:class:`~ncore.data.RayBundleSensorPointCloudsSourceAdapter`.

Camera Labels Component
~~~~~~~~~~~~~~~~~~~~~~~

Per-camera image-aligned labels (depth maps, optical flow, segmentation, masks,
surface normals, material properties, feature embeddings) are stored as
independently-timestamped label instances. Each instance stores labels of
**one type** for **one camera**, enabling sparse coverage and multiple label
sources per camera.

.. code-block:: text

camera_labels/
└── {instance_name}/ (e.g., "depth.z@front_50fov")
├── timestamps_us [N] uint64 (sorted label timestamps)
└── labels/
├── {descriptor}
│ ├── camera_id: str (associated camera identifier)
│ ├── label_type: { (tagged-union type descriptor)
│ │ "category": str, ("DEPTH", "FLOW", ...)
│ │ "qualifier": str, ("z", "optical_forward", ...)
│ │ "unit": str | null ("METERS", "PIXELS", ...)
│ │ }
│ ├── label_source: str ("GT_ANNOTATION", "EXTERNAL", ...)
│ ├── label_schema: { (storage format descriptor)
│ │ "dtype": str, (e.g., "float32", "uint8")
│ │ "shape_suffix": [int, ...], (trailing dims after [H, W])
│ │ "encoding": str, ("RAW" | "IMAGE_ENCODED")
│ │ "encoded_format": str | null, ("png", "jpeg", null)
│ │ "quantization": {...} | null (optional dequant params)
│ │ }
│   └── generic_meta_data: {...} (metadata common to all labels)
└── {timestamp_us}/ (keyed by camera end-of-frame timestamp)
├── data [H, W, ...] or |Sx (label array or encoded bytes)
└── {attrs}
├── generic_meta_data: {...} (per-label metadata)
└── format: str (IMAGE_ENCODED only)

**Label Type System:**

Labels use a *tagged-union* type consisting of a high-level
:class:`~ncore.data.LabelCategory` enum and a free-form qualifier string.
Well-known types are provided as constants (e.g., ``LabelType.DEPTH_Z_M``,
``LabelType.SEGMENTATION_SEMANTIC``), while project-specific labels use custom
qualifiers without any code changes.
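The tagged-union idea can be sketched in plain Python. This is a simplified stand-in, not the actual ``ncore.data`` definitions (the real ``LabelCategory`` and ``LabelType`` live in the library):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class LabelCategory(Enum):
    DEPTH = "DEPTH"
    FLOW = "FLOW"
    SEGMENTATION = "SEGMENTATION"
    MASK = "MASK"
    GEOMETRY = "GEOMETRY"
    MATERIAL = "MATERIAL"
    FEATURE = "FEATURE"
    OTHER = "OTHER"

@dataclass(frozen=True)
class LabelType:
    """Tagged union: a closed category plus a free-form qualifier string."""
    category: LabelCategory
    qualifier: str
    unit: Optional[str] = None

# A well-known constant next to a project-specific custom type; no code
# changes are needed to introduce the latter.
DEPTH_Z_M = LabelType(LabelCategory.DEPTH, "z", "METERS")
custom = LabelType(LabelCategory.FEATURE, "my_backbone_v2")
```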

Supported categories:

* ``DEPTH`` -- Per-pixel distance measures (``"z"``, ``"ray"``, ``"relative"``, ...)
* ``FLOW`` -- Motion displacement fields (``"optical_forward"``, ``"scene_backward"``, ...)
* ``SEGMENTATION`` -- Per-pixel classification (``"semantic"``, ``"instance"``, ``"logits"``)
* ``MASK`` -- Binary or multi-level masks (``"background"``, ``"dynamic"``, ``"ego"``, ...)
* ``GEOMETRY`` -- Per-pixel geometric vectors (``"normal_camera"``, ``"ray_direction"``, ...)
* ``MATERIAL`` -- Surface material properties (``"albedo"``, ``"roughness"``, ...)
* ``FEATURE`` -- Per-pixel feature embeddings (``"dinov2"``, ``"clip"``, ...)
* ``OTHER`` -- Catch-all for uncategorized labels

**Encoding:**

* ``RAW`` -- Numpy array stored as a zarr dataset with regular compression.
  Shape is ``[H, W] + shape_suffix`` (e.g., ``[H, W, 2]`` for optical flow).
  Optional transparent quantization of raw labels is supported
  (e.g., float32 depth quantized to uint16 with scale/offset).
* ``IMAGE_ENCODED`` -- Pre-encoded image bytes (PNG, JPEG) stored as a 1-D
zarr uint8 dataset with no compression. Consumers can call ``get_encoded_data()`` for raw
bytes (GPU-based decoding) or ``get_data()`` for Pillow-decoded numpy arrays.
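The optional quantization amounts to a scale/offset round-trip, sketched below (the actual ``QuantizationParams`` field names may differ):

```python
import numpy as np

def quantize(values: np.ndarray, scale: float, offset: float) -> np.ndarray:
    """Quantize float values to uint16 via q = round((v - offset) / scale)."""
    q = np.round((values - offset) / scale)
    return np.clip(q, 0, np.iinfo(np.uint16).max).astype(np.uint16)

def dequantize(q: np.ndarray, scale: float, offset: float) -> np.ndarray:
    """Invert quantization: v ~ q * scale + offset (error <= scale / 2)."""
    return q.astype(np.float32) * scale + offset

# Example: depth in [0, 120] m stored with roughly 2 mm resolution.
depth = np.array([0.0, 1.5, 119.99], dtype=np.float32)
scale = 120.0 / np.iinfo(np.uint16).max
restored = dequantize(quantize(depth, scale, 0.0), scale, 0.0)
```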

**Instance naming convention:**

Instance names are opaque identifiers. The recommended convention is
``category.qualifier@camera_id`` (e.g., ``depth.z@front_50fov``). The
component does *not* parse or validate instance names.

**Compat layer access:**

Labels are accessed through :class:`~ncore.data.CameraLabelsProtocol` via
:meth:`~ncore.data.SequenceLoaderProtocol.get_camera_labels` (by ID) or
:meth:`~ncore.data.SequenceLoaderProtocol.query_camera_labels` (by camera
and optional type/category filter).
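A toy stand-in illustrates the two access paths (classes here are hypothetical; the real interfaces are ``CameraLabelsProtocol`` and ``SequenceLoaderProtocol``):

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-ins for ID-based vs. filtered label access.
@dataclass
class FakeCameraLabels:
    instance_id: str
    camera_id: str
    category: str  # stands in for LabelCategory

@dataclass
class FakeSequenceLoader:
    instances: List[FakeCameraLabels]

    def get_camera_labels(self, camera_label_id: str) -> FakeCameraLabels:
        # Direct lookup by opaque instance ID.
        return next(i for i in self.instances if i.instance_id == camera_label_id)

    def query_camera_labels(
        self, camera_id: str, label_category: Optional[str] = None
    ) -> List[FakeCameraLabels]:
        # Filter by camera, then optionally by category.
        hits = [i for i in self.instances if i.camera_id == camera_id]
        if label_category is not None:
            hits = [i for i in hits if i.category == label_category]
        return hits

loader = FakeSequenceLoader([
    FakeCameraLabels("depth.z@front", "front", "DEPTH"),
    FakeCameraLabels("seg.semantic@front", "front", "SEGMENTATION"),
    FakeCameraLabels("depth.z@rear", "rear", "DEPTH"),
])
```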

Component Groups
~~~~~~~~~~~~~~~~

36 changes: 36 additions & 0 deletions docs/tools/ncore_vis.rst
@@ -180,6 +180,42 @@ from all cameras in the sequence and shown in a shared dropdown. The opacity
of the tint is adjustable. Masks are boolean images stored per-sensor and
are not per-frame.

Camera Labels Overlay
^^^^^^^^^^^^^^^^^^^^^

When a sequence contains :class:`~ncore.data.v4.CameraLabelsComponent` data,
a **Camera Labels** overlay section appears in the Cameras tab. This allows
visualizing per-frame labels (depth maps, segmentation masks, surface normals,
etc.) on camera images.

Global controls:

- **Show Labels**: master toggle to enable camera label visualization
(disabled by default)
- **Matching**: timestamp matching strategy for sparse labels

- *Closest*: always picks the nearest available label timestamp (default)
- *Exact*: only shows a label if one exists at the camera frame's exact
timestamp; otherwise shows the RGB image

- **Opacity**: blend factor between RGB image and label visualization
(0.0 = full RGB, 1.0 = full label)
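The two matching modes can be sketched as follows (a hypothetical helper, not the viewer's actual code; assumes sorted label timestamps):

```python
from typing import Optional, Sequence

import numpy as np

def match_label_timestamp(
    label_ts_us: Sequence[int], query_ts_us: int, mode: str = "closest"
) -> Optional[int]:
    """Pick a label timestamp for a camera frame, or None if nothing matches."""
    ts = np.asarray(label_ts_us, dtype=np.int64)
    if mode == "exact":
        hits = np.nonzero(ts == query_ts_us)[0]
        return int(ts[hits[0]]) if hits.size else None
    # "closest": binary-search the sorted timestamps, compare both neighbors.
    i = np.searchsorted(ts, query_ts_us)
    candidates = ts[max(i - 1, 0): i + 1]
    return int(candidates[np.argmin(np.abs(candidates - query_ts_us))])
```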

Per-camera controls:

- **Label**: dropdown showing only labels associated with this camera,
populated from :meth:`~ncore.data.SequenceLoaderProtocol.query_camera_labels`.
Defaults to the first available label so that toggling "Show Labels" on
immediately renders something.

Visualization adapts to the label category:

- **DEPTH** -- TURBO colormap with percentile-based normalization
- **SEGMENTATION** -- 20-color class palette
- **MASK** -- green tint overlay (3-channel masks rendered as RGB)
- **GEOMETRY** -- normal vectors mapped to RGB ([-1,1] → [0,255])
- **Other categories** -- grayscale normalization
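The GEOMETRY mapping, for example, is the standard normal-map transform (a sketch, not the viewer's implementation):

```python
import numpy as np

def normals_to_rgb(normals: np.ndarray) -> np.ndarray:
    """Map per-pixel vectors in [-1, 1] to uint8 RGB via v -> (v + 1) / 2 * 255."""
    rgb = (normals * 0.5 + 0.5) * 255.0
    return np.clip(np.round(rgb), 0, 255).astype(np.uint8)
```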

Lidar Point Clouds
^^^^^^^^^^^^^^^^^^

16 changes: 16 additions & 0 deletions ncore/data/__init__.py
@@ -16,6 +16,7 @@
"""Package exposing methods related to NCore's basic data types and abstract APIs"""

from ncore.impl.data.compat import (
CameraLabelsProtocol,
CameraSensorProtocol,
LidarSensorProtocol,
PointCloudsSourceProtocol,
@@ -27,6 +28,7 @@
from ncore.impl.data.types import (
BBox3,
BivariateWindshieldModelParameters,
CameraLabelDescriptor,
ConcreteCameraModelParametersUnion,
ConcreteExternalDistortionParametersUnion,
ConcreteLidarModelParametersUnion,
@@ -35,10 +37,16 @@
EncodedImageHandle,
FrameTimepoint,
FThetaCameraModelParameters,
LabelCategory,
LabelEncoding,
LabelSchema,
LabelSource,
LabelType,
LabelUnit,
OpenCVFisheyeCameraModelParameters,
OpenCVPinholeCameraModelParameters,
PointCloud,
QuantizationParams,
ReferencePolynomial,
RowOffsetStructuredSpinningLidarModelParameters,
ShutterType,
@@ -64,6 +72,13 @@
"ConcreteExternalDistortionParametersUnion",
"ConcreteLidarModelParametersUnion",
"PointCloud",
"LabelCategory",
"LabelUnit",
"LabelEncoding",
"LabelType",
"QuantizationParams",
"LabelSchema",
"CameraLabelDescriptor",
# compat protocols
"SequenceLoaderProtocol",
"SensorProtocol",
@@ -72,4 +87,5 @@
"RadarSensorProtocol",
"PointCloudsSourceProtocol",
"RayBundleSensorPointCloudsSourceAdapter",
"CameraLabelsProtocol",
]
2 changes: 2 additions & 0 deletions ncore/data/v4/__init__.py
@@ -17,6 +17,7 @@

from ncore.impl.data.v4.compat import SequenceLoaderV4
from ncore.impl.data.v4.components import (
CameraLabelsComponent,
CameraSensorComponent,
ComponentReader,
ComponentWriter,
@@ -46,6 +47,7 @@
"RadarSensorComponent",
"CuboidsComponent",
"PointCloudsComponent",
"CameraLabelsComponent",
# compat APIs
"SequenceLoaderV4",
]
91 changes: 90 additions & 1 deletion ncore/impl/data/compat.py
@@ -62,12 +62,15 @@

from ncore.impl.common.transformations import HalfClosedInterval, PoseGraphInterpolator
from ncore.impl.data.types import (
CameraLabelDescriptor,
ConcreteCameraModelParametersUnion,
ConcreteLidarModelParametersUnion,
CuboidTrackObservation,
EncodedImageData,
FrameTimepoint,
JsonLike,
LabelCategory,
LabelType,
PointCloud,
)
from ncore.impl.data.util import closest_index_sorted
@@ -166,6 +169,34 @@ def get_point_clouds_source(self, source_id: str, *, return_index: int = 0) -> P
"""
...

@property
def camera_labels_ids(self) -> List[str]:
"""List of all camera label instance IDs."""
...

def get_camera_labels(self, camera_label_id: str) -> CameraLabelsProtocol:
"""Get a camera label instance by instance ID."""
...

def query_camera_labels(
self,
camera_id: str,
label_type: Optional[LabelType] = None,
label_category: Optional[LabelCategory] = None,
) -> List[CameraLabelsProtocol]:
"""Query camera label instances matching filters.

Parameters
----------
camera_id
Camera ID to match.
label_type
If provided, only return sources with this exact label type.
label_category
If provided, only return sources whose label type category matches.
"""
...


@runtime_checkable
class SensorProtocol(Protocol):
@@ -611,7 +642,7 @@ def get_pc_generic_data(self, pc_index: int, name: str) -> npt.NDArray:
...

def get_pc_generic_meta_data(self, pc_index: int) -> Dict[str, JsonLike]:
"""Return generic JSON metadata associated with the given point cloud."""
"""Returns generic point cloud meta-data for a specific point-cloud."""
...

def get_pc_index_range(
@@ -816,3 +847,61 @@ def get_pc_index_range(
step: Optional[int] = None,
) -> range:
return range(*slice(start, stop, step).indices(self.pcs_count))


@runtime_checkable
class CameraLabelsProtocol(Protocol):
"""Protocol for accessing camera-associated image labels.

Each instance provides labels of one type for one camera, with independently-managed timestamps.
"""

@property
def label_descriptor(self) -> CameraLabelDescriptor:
"""Descriptor of this label instance."""
...

@property
def labels_count(self) -> int:
"""Number of stored labels."""
...

@property
def label_timestamps_us(self) -> npt.NDArray[np.uint64]:
"""Timestamps of all stored labels, sorted ascending."""
...

@property
def labels_generic_meta_data(self) -> Dict[str, JsonLike]:
"""Generic metadata associated with all labels."""
...

@runtime_checkable
class CameraLabelHandleProtocol(Protocol):
"""Protocol for a single camera label at a specific timestamp.

Returned by :meth:`CameraLabelsProtocol.get_label` and provides deferred
access to the label data and metadata.
"""

@property
def timestamp_us(self) -> int:
"""Timestamp of this label in microseconds (usually associated with the camera end-of-frame timestamp)."""
...

@property
def generic_meta_data(self) -> Dict[str, JsonLike]:
"""Per-label generic metadata."""
...

def get_data(self) -> npt.NDArray[Any]:
"""Load and return the label data as a numpy array."""
...

def get_encoded_data(self) -> Optional[bytes]:
"""Return raw encoded bytes for IMAGE_ENCODED labels, or None for RAW labels."""
...

def get_label(self, timestamp_us: int) -> CameraLabelHandleProtocol:
"""Return a lazy handle to the label data at the given timestamp."""
...