Add sample() method for efficient point extraction from scattered coordinates

## The problem

`MultiBandStacBackendArray` declares `IndexingSupport.BASIC`, so xarray collapses any vectorized point selection into a bounding-box slice before calling `_raw_getitem`. If you do:

```python
da.sel(x=xr.DataArray(xs, dims="points"), y=xr.DataArray(ys, dims="points"))
```

lazycogs receives a single slice spanning the bounding rectangle of all your points, opens every COG overlapping that rectangle, and reads every pixel inside it. For sparse points over a large area this is very wasteful.

The per-point workaround — iterating and calling `.sel()` once per coordinate — is safe (each call resolves to a 1×1 spatial window) but doesn't batch COG opens. If N points land in the same source tile, that tile is opened N times.

This is a real gap for common workflows: training data extraction at labeled sites, validation against field measurements, time series at monitoring station locations.

## Proposed solution

A `lazycogs.sample()` function (and/or a `.lazycogs.sample()` accessor method) that bypasses xarray's indexing machinery and batches reads by item footprint:

1. Accept arrays of x/y coordinates in the array's CRS.
2. Per time step: one DuckDB query using the bbox of all points combined.
3. Per returned item: find which of the N points fall within its footprint, open the COG **once**, read the minimal window spanning those points, extract the specific pixel values.
4. Apply the mosaic method across items at each point (same priority ordering as the regular mosaic).

Return a `DataArray` with a `points` dimension, shape `(band, time, points)`.

## Implementation notes

The main piece of new logic is `_points_in_item`: given a STAC item and a set of pixel coordinates, determine which points fall within the item's spatial footprint. Item geometries are available in the parquet via DuckDB — no extra I/O needed.

The COG read per item would use `async_geotiff`'s existing window read: compute the bounding pixel window for all matching points within that item, read it once, then index out the specific pixel values using local offsets. This reuses `_read_item_bands` and the warp cache logic with minimal changes.

The `FirstMethod` early-exit optimization applies here too: once all N points have a valid value, remaining items are skipped.

## Risks and unknowns

- Items with irregular or non-rectangular footprints (e.g., scene edge polygons) need a proper point-in-polygon test, not just a bbox check. Need to decide whether to use the item geometry from the parquet or fall back to bbox.
- Reprojection for points in a different CRS than the source COG adds complexity. The current `reproject_array` approach works on rectangular windows — scattered pixel extraction in native coordinates may need a different path.
- The interface for dask parallelism over time steps is not yet designed. Simplest first cut is synchronous over time steps; dask support can come later.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add sample() method for efficient point extraction from scattered coordinates #16

The problem

Proposed solution

Implementation notes

Risks and unknowns

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Add sample() method for efficient point extraction from scattered coordinates #16

Description

The problem

Proposed solution

Implementation notes

Risks and unknowns

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions