Merged
57 changes: 50 additions & 7 deletions README.md
@@ -1,6 +1,6 @@
# ZedProfiler

[![Coverage](https://img.shields.io/badge/coverage-96%25-brightgreen)](#quality-gates)
[![Coverage](https://img.shields.io/badge/coverage-94%25-brightgreen)](#quality-gates)

CPU-first 3D image feature extraction toolkit for high-content and high-throughput image-based profiling.

@@ -18,12 +18,55 @@ just env
## Quick usage (API shape)

```python
from zedprofiler import colocalization # promoted convenience import
from zedprofiler.featurization import texture # stays in feature namespace

# Implementation is scaffolded in PR 1 and will be completed in module PRs.
result = colocalization.compute()
texture_result = texture.compute()
import os
import pathlib
import pandas as pd

from zedprofiler.IO.loading_classes import ImageSetConfig, ImageSetLoader, ObjectLoader
from zedprofiler.featurization.volumesizeshape import compute as compute_volumesizeshape

# Resolve the raw image and mask directories; strict=True fails early
# if either path does not exist.
image_set_path = pathlib.Path(
    os.path.expanduser(
        "~/mnt/bandicoot/NF1_organoid_data/data/NF0014_T1/zstack_images/C2-1/"
    )
).resolve(strict=True)
mask_set_path = pathlib.Path(
    os.path.expanduser(
        "~/mnt/bandicoot/NF1_organoid_data/data/NF0014_T1/segmentation_masks/C2-1/"
    )
).resolve(strict=True)
image_set_config = ImageSetConfig(
    image_set_name="test_set",
    raw_image_key_name=["AGP"],
    mask_key_name=["Nuclei"],
)
image_set_loader = ImageSetLoader(
    image_set_path=image_set_path,
    mask_set_path=mask_set_path,
    anisotropy_spacing=(1, 0.1, 0.1),  # voxel size in (z, y, x) order
    channel_mapping={
        "DNA": 405,
        "ER": 488,
        "AGP": 555,
        "Mito": 640,
        "Organoid": "organoid_",
        "Cell": "cell_",
        "Nuclei": "nuclei_",
        "Cytoplasm": "cytoplasm_",
    },
    config=image_set_config,
)

object_loader = ObjectLoader(
    image_set_loader=image_set_loader,
    channel_name=image_set_config.raw_image_key_name[0],
    compartment_name=image_set_config.mask_key_name[0],
)
# compute() returns a dict of equal-length lists, one entry per object.
area_dict_df = pd.DataFrame(
    compute_volumesizeshape(
        image_set_loader=image_set_loader,
        object_loader=object_loader,
    )
)
area_dict_df
```

## Data Contract
9 changes: 7 additions & 2 deletions ROADMAP.md
@@ -9,7 +9,7 @@ The roadmap is intended to be a living document and may be updated as needed.
### In scope

- [ ] Handcrafted featurization modules:
  - [ ] AreaSizeShape
  - [ ] volumesizeshape
  - [ ] Colocalization
  - [ ] Intensity
  - [ ] Granularity
@@ -51,7 +51,7 @@ The roadmap is intended to be a living document and may be updated as needed.

### Phase 2: Feature modules and tests (PR 4-9)

4. PR 4: AreaSizeShape module and tests
4. PR 4: volumesizeshape module and tests

- [ ] CPU implementation, anisotropy handling, edge cases.
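
As a hedged illustration of what anisotropy handling can involve: a voxel count only becomes a physical volume once it is scaled by the size of one voxel. The sketch below assumes the spacing is given in (z, y, x) order, as in the README example. `voxel_count_to_volume` is a hypothetical helper and is not part of ZedProfiler.

```python
import math


def voxel_count_to_volume(
    voxel_count: int,
    spacing: tuple[float, float, float],
) -> float:
    """Scale a voxel count by the physical volume of a single voxel.

    `spacing` is assumed to be (z, y, x) voxel size in physical units.
    """
    if any(s <= 0 for s in spacing):
        raise ValueError("spacing values must be positive")
    return voxel_count * math.prod(spacing)


# 200 voxels at 1.0 x 0.1 x 0.1 units per voxel.
physical_volume = voxel_count_to_volume(200, (1.0, 0.1, 0.1))
```

Whether the module should report voxel counts, physical units, or both is a design decision this sketch does not settle.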

@@ -93,6 +93,11 @@ The roadmap is intended to be a living document and may be updated as needed.

- [ ] Changelog, semantic tag, PyPI publish automation, README install updates.

### Phase 4: Post-release improvements (PR 14+)

14. Add the ability to load images from arrays instead of file paths,
    for increased flexibility in data sources and testing.
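
One possible shape for array-based loading, sketched under assumptions: the `volumesizeshape` module in this PR reads only `anisotropy_spacing` from the image-set loader and `label_image` plus `object_ids` from the object loader, so small in-memory stand-ins that expose those attributes might be enough for tests. The class names `ArrayImageSetLoader` and `ArrayObjectLoader` are hypothetical and do not exist in ZedProfiler.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class ArrayImageSetLoader:
    """Hypothetical stand-in exposing only the voxel spacing (z, y, x)."""

    anisotropy_spacing: tuple[float, float, float] = (1.0, 1.0, 1.0)


@dataclass
class ArrayObjectLoader:
    """Hypothetical stand-in wrapping an in-memory 3D label image."""

    label_image: np.ndarray
    object_ids: list[int] = field(init=False)

    def __post_init__(self) -> None:
        if self.label_image.ndim != 3:
            raise ValueError("label_image must be 3D (z, y, x)")
        # Collect every label present; 0 is treated as background.
        labels = np.unique(self.label_image)
        self.object_ids = [int(v) for v in labels if v != 0]


# Synthetic label image with two box-shaped objects.
labels = np.zeros((4, 8, 8), dtype=np.int32)
labels[1:3, 1:4, 1:4] = 1
labels[1:3, 5:7, 5:7] = 2

image_set_loader = ArrayImageSetLoader(anisotropy_spacing=(1.0, 0.1, 0.1))
object_loader = ArrayObjectLoader(label_image=labels)
```

Stand-ins like these would let unit tests run without any files on disk, which is the flexibility this roadmap item describes.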

## Verification Gates

- [ ] Run full unit and integration tests on Linux with coverage >=85%.
6 changes: 3 additions & 3 deletions docs/src/Feature_Naming_Convention.md
@@ -87,7 +87,7 @@ The `<channel>` component:
- MAY use pascalcase capitalization
- MAY use hyphen-separated channel combinations for colocalization features (e.g., `DNA-mito`)
- MUST list channels in alphabetical order when combined (e.g., `DNA-mito` not `mito-DNA`)
- MUST be set to `NoChannel` for channel-independent features (e.g., areasizeshape)
- MUST be set to `NoChannel` for channel-independent features (e.g., volumesizeshape)

**Example:** `DNA`, `Mito`, `DNA-Mito`
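
The channel rules above can be sketched as a small helper. This is an illustration only: `channel_component` and `feature_name` are hypothetical names, case-insensitive sorting is an assumption about what "alphabetical order" means here, and the four-part underscore layout is read off the valid feature names listed later in this specification.

```python
def channel_component(channels: list[str]) -> str:
    """Build the <channel> component from zero or more channel names."""
    if not channels:
        # Channel-independent features use the NoChannel placeholder.
        return "NoChannel"
    # Assumption: alphabetical order is compared case-insensitively.
    return "-".join(sorted(channels, key=str.lower))


def feature_name(
    compartment: str,
    channels: list[str],
    featuretype: str,
    measurement: str,
) -> str:
    """Join the four components with underscores."""
    return "_".join(
        [compartment, channel_component(channels), featuretype, measurement]
    )


colocalization_name = feature_name(
    "Cell", ["Mito", "DNA"], "Colocalization", "Correlation"
)
shape_name = feature_name("Organoid", [], "volumesizeshape", "Volume")
```

Sorting inside the helper means callers cannot produce `Mito-DNA` by accident, whichever order they pass the channels in.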

@@ -97,7 +97,7 @@ The `<featuretype>` component:

- MUST identify the category or method of feature extraction
- MUST be one of the following enumerated values:
- `Areasizeshape` - morphological measurements (area, volume, shape descriptors)
- `volumesizeshape` - morphological measurements (area, volume, shape descriptors)
- `Colocalization` - channel colocalization metrics
- `Granularity` - granular spectrum and texture-at-scale features
- `Intensity` - pixel intensity statistics
@@ -138,7 +138,7 @@ Valid feature names conforming to this specification:
Nuclei_DNA_Intensity_MeanIntensity
Cytoplasm_Mito_Texture_Entropy-256-3
Cell_DNA-Mito_Colocalization_Correlation
Organoid_NoChannel_AreaSizeShape_Volume
Organoid_NoChannel_volumesizeshape_Volume
Nuclei_NoChannel_Neighbors_AdjacentCount
Cell_Mito_Granularity_Spectrum-10
Nuclei_DNA_SAMMed3D_CLSFeature-512
6 changes: 3 additions & 3 deletions pyproject.toml
@@ -22,10 +22,10 @@ dependencies = [
"bioio-tifffile>=1.3",
"fire>=0.7.1",
"jinja2>=3.1.6",
"numpy>=2.2",
"numpy>=1.26",
"pandas>=3.0.2",
"scikit-image>=0.25",
"scipy>=1.15",
"pyarrow>=16",
"scikit-image>=0.24",
]
scripts.ZedProfiler = "ZedProfiler.cli:trigger"

2 changes: 1 addition & 1 deletion src/zedprofiler/__init__.py
@@ -10,10 +10,10 @@
# This keeps imports ergonomic while preserving the canonical nested namespace:
# `zedprofiler.featurization.<module>`.
from zedprofiler.featurization import (
    areasizeshape,
    colocalization,
    granularity,
    intensity,
    neighbors,
    texture,
    volumesizeshape,
)
9 changes: 8 additions & 1 deletion src/zedprofiler/featurization/__init__.py
@@ -4,4 +4,11 @@
modules to be promoted at the package top-level.
"""

from . import areasizeshape, colocalization, granularity, intensity, neighbors, texture
from . import (
    colocalization,
    granularity,
    intensity,
    neighbors,
    texture,
    volumesizeshape,
)
10 changes: 0 additions & 10 deletions src/zedprofiler/featurization/areasizeshape.py

This file was deleted.

177 changes: 177 additions & 0 deletions src/zedprofiler/featurization/volumesizeshape.py
Comment thread
MikeLippincott marked this conversation as resolved.
@@ -0,0 +1,177 @@
"""Volume, size, and shape features for 3D objects."""

from __future__ import annotations

from collections.abc import Sequence
from importlib import import_module
from typing import Protocol

import numpy as np

from zedprofiler.exceptions import ZedProfilerError


class SupportsImageSetLoader(Protocol):
    """Minimal image-set loader interface required by this module."""

    # Voxel size in (z, y, x) order.
    anisotropy_spacing: tuple[float, float, float]


class SupportsObjectLoader(Protocol):
    """Minimal object loader interface required by this module."""

    # Label image: each object is represented by a unique integer label
    # (0 is typically reserved for background).
    label_image: np.ndarray
    object_ids: Sequence[int]


def _empty_feature_result() -> dict[str, list[float]]:
    """Return deterministic empty output schema for volume/size/shape features."""
    return {
        "object_id": [],
        "Volume": [],
        "CenterX": [],
        "CenterY": [],
        "CenterZ": [],
        "BboxVolume": [],
        "MinX": [],
        "MaxX": [],
        "MinY": [],
        "MaxY": [],
        "MinZ": [],
        "MaxZ": [],
        "Extent": [],
        "EulerNumber": [],
        "EquivalentDiameter": [],
        "SurfaceArea": [],
    }


def compute(
    image_set_loader: SupportsImageSetLoader | None = None,
    object_loader: SupportsObjectLoader | None = None,
) -> dict[str, list[float]]:
    """Compute volume/size/shape features for one object loader.

    This supports two invocation modes:
    - no arguments: returns an empty deterministic schema so dispatchers can
      call the function without crashing.
    - both loaders provided: executes feature extraction.
    """
    if image_set_loader is None and object_loader is None:
        return _empty_feature_result()
    if image_set_loader is None or object_loader is None:
        raise ZedProfilerError(
            "volumesizeshape.compute requires both image_set_loader and "
            "object_loader for execution."
        )

    return measure_3D_volume_size_shape(
        image_set_loader=image_set_loader,
        object_loader=object_loader,
    )


def _get_skimage_measure() -> object:
    """Return `skimage.measure` or raise a user-facing dependency error."""
    try:
        return import_module("skimage.measure")
    except ImportError as exc:
        raise ZedProfilerError(
            "volumesizeshape requires scikit-image for area/size/shape computation."
        ) from exc


def calculate_surface_area(
    label_object: np.ndarray,
    props: dict[str, np.ndarray],
    spacing: tuple[float, float, float],
) -> float:
    """Calculate surface area for one labeled object using marching cubes."""
    measure = _get_skimage_measure()

    # Crop to the object's bounding box, clamped to the image extent.
    volume = label_object[
        max(props["bbox-0"][0], 0) : min(props["bbox-3"][0], label_object.shape[0]),
        max(props["bbox-1"][0], 0) : min(props["bbox-4"][0], label_object.shape[1]),
        max(props["bbox-2"][0], 0) : min(props["bbox-5"][0], label_object.shape[2]),
    ]
    # Binarize: every non-zero voxel inside the crop counts as object.
    volume_truths = volume > 0
    verts, faces, _normals, _values = measure.marching_cubes(
        volume_truths,
        method="lewiner",
        spacing=spacing,
        level=0,
    )
    return measure.mesh_surface_area(verts, faces)


def measure_3D_volume_size_shape(
    image_set_loader: SupportsImageSetLoader,
    object_loader: SupportsObjectLoader,
) -> dict[str, list[float]]:
    """Measure volume/size/shape features for each non-zero label object."""
    measure = _get_skimage_measure()

    label_object = object_loader.label_image
    spacing = image_set_loader.anisotropy_spacing
    unique_objects = object_loader.object_ids

    features_to_record = _empty_feature_result()

    desired_properties = [
        "area",  # for 3D this is the volume; skimage keeps the "area" name
        "bbox",
        "centroid",
        "bbox_area",
        "extent",
        "euler_number",
        "equivalent_diameter",
    ]
    for label in unique_objects:
        # Skip label 0, which is the background and not an object.
        if label == 0:
            continue
        subset_lab_object = label_object.copy()
        # "Subset" here means zeroing out every object except the one being
        # measured, so that skimage's regionprops_table computes features
        # for that object only.
        subset_lab_object[subset_lab_object != label] = 0
        props = measure.regionprops_table(
            subset_lab_object,
            properties=desired_properties,
        )

        features_to_record["object_id"].append(label)
        features_to_record["Volume"].append(props["area"].item())
        features_to_record["CenterX"].append(props["centroid-2"].item())
        features_to_record["CenterY"].append(props["centroid-1"].item())
        features_to_record["CenterZ"].append(props["centroid-0"].item())
        features_to_record["BboxVolume"].append(props["bbox_area"].item())
        features_to_record["MinX"].append(props["bbox-2"].item())
        features_to_record["MaxX"].append(props["bbox-5"].item())
        features_to_record["MinY"].append(props["bbox-1"].item())
        features_to_record["MaxY"].append(props["bbox-4"].item())
        features_to_record["MinZ"].append(props["bbox-0"].item())
        features_to_record["MaxZ"].append(props["bbox-3"].item())
        features_to_record["Extent"].append(props["extent"].item())
        features_to_record["EulerNumber"].append(props["euler_number"].item())
        features_to_record["EquivalentDiameter"].append(
            props["equivalent_diameter"].item()
        )

        try:
            features_to_record["SurfaceArea"].append(
                calculate_surface_area(
                    # Pass the single-object image so that neighbouring labels
                    # inside the bounding box are not meshed as well.
                    label_object=subset_lab_object,
                    props=props,
                    spacing=spacing,
                )
            )
        except (RuntimeError, ValueError):
            # Record NaN when the surface cannot be meshed.
            features_to_record["SurfaceArea"].append(np.nan)

    return features_to_record
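
To make a few of the recorded features concrete, here is a NumPy-only sketch that recomputes `Volume`, the centroid, `BboxVolume`, `Extent`, and `EquivalentDiameter` for one label, in voxel units. It is not the module's implementation, which delegates to `skimage.measure.regionprops_table`. The half-open bounding box and the sphere-equivalent diameter `(6 * V / pi) ** (1 / 3)` are assumptions about the intended definitions, and `basic_shape_features` is a hypothetical name.

```python
import numpy as np


def basic_shape_features(label_image: np.ndarray, label: int) -> dict[str, float]:
    """Compute voxel-unit size/shape features for one label with NumPy only."""
    mask = label_image == label
    if not mask.any():
        raise ValueError(f"label {label} not found in label_image")
    z, y, x = np.nonzero(mask)
    volume = int(mask.sum())
    # Half-open bounding box: each max is one past the last voxel index.
    mins = (int(z.min()), int(y.min()), int(x.min()))
    maxs = (int(z.max()) + 1, int(y.max()) + 1, int(x.max()) + 1)
    bbox_volume = int(np.prod([hi - lo for lo, hi in zip(mins, maxs)]))
    return {
        "Volume": volume,
        "CenterZ": float(z.mean()),
        "CenterY": float(y.mean()),
        "CenterX": float(x.mean()),
        "BboxVolume": bbox_volume,
        # Fraction of the bounding box occupied by the object.
        "Extent": volume / bbox_volume,
        # Diameter of a sphere with the same volume as the object.
        "EquivalentDiameter": (6.0 * volume / np.pi) ** (1.0 / 3.0),
    }


# A 4 x 4 x 4 cube of label 1 inside a 6 x 6 x 6 image.
labels = np.zeros((6, 6, 6), dtype=np.int32)
labels[1:5, 1:5, 1:5] = 1
features = basic_shape_features(labels, 1)
```

For this cube the object fills its bounding box exactly, so `Extent` is 1.0 and `Volume` equals `BboxVolume`.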