Coalesce HTTP COG range reads and parse IFDs once per dask graph #1534

Open
brendancol wants to merge 1 commit into xarray-contrib:main from brendancol:perf/http-cog-range-coalescing
@brendancol
Contributor

Summary

Two related fixes for the HTTP COG read path, surfaced by the recent geotiff performance audit (P2 + P5).

P2 -- merge adjacent tile ranges into fewer GETs. _read_cog_http previously fired one GET Range: per tile through an 8-worker pool, so wall time scaled as ceil(N_tiles / 8) * RTT. COG tiles sit sequentially in the file, so a new coalesce_ranges helper merges adjacent (offset, length) ranges whose gap is below a threshold (default 1 MB, configurable via XRSPATIAL_COG_COALESCE_GAP). split_coalesced_bytes slices the returned bytes back per-tile, and _HTTPSource.read_ranges_coalesced wraps the existing read_ranges so the threadpool model is unchanged.
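
The merge-then-split idea can be sketched in a few lines. This is an illustrative stand-in, not the code in this PR: the function names match the description above, but the signatures, the return shape, and the use of `<=` for the gap comparison are assumptions.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length)


def coalesce_ranges(ranges: List[Range], max_gap: int = 1 << 20):
    """Merge (offset, length) ranges separated by at most max_gap bytes.

    Returns (merged_offset, merged_length, members) tuples in offset order,
    where members lists the original ranges covered by that merged read.
    A negative max_gap disables merging (assumed convention).
    """
    merged = []
    for off, length in sorted(ranges):
        if merged and max_gap >= 0:
            m_off, m_len, members = merged[-1]
            gap = off - (m_off + m_len)
            if gap <= max_gap:
                # Extend the previous merged read to cover this range too.
                new_end = max(m_off + m_len, off + length)
                merged[-1] = (m_off, new_end - m_off, members + [(off, length)])
                continue
        merged.append((off, length, [(off, length)]))
    return merged


def split_coalesced_bytes(merged_offset: int, data: bytes, members: List[Range]):
    """Slice one merged response back into per-range byte strings."""
    return [
        data[off - merged_offset: off - merged_offset + length]
        for off, length in members
    ]
```

With the default threshold, three nearby ranges and one distant range collapse into two reads; the bytes in any gap are fetched and then discarded by the split step.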

The 1 MB gap threshold is chosen empirically: most compressed COG tiles are well under 1 MB and tile rows are stored back-to-back, so the threshold tolerates small interleaved metadata without dragging in unrelated overview data. Set the env var to -1 to disable merging entirely (the legacy behaviour) -- this is what the new perf test uses to measure the baseline.
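
As a rough sketch of how such a threshold could be read from the environment: only the variable name, the 1 MB default, and the meaning of -1 come from the description above. The helper name, the exact byte constant, and the fallback on unparsable values are assumptions.

```python
import os

# "1 MB" per the PR text; whether the real constant is 2**20 or 10**6
# is not stated, so this value is an assumption.
DEFAULT_GAP = 1 << 20


def coalesce_gap_from_env(environ=os.environ) -> int:
    """Return the coalescing gap in bytes (hypothetical helper).

    A negative result (for example -1) means "do not merge ranges".
    """
    raw = environ.get("XRSPATIAL_COG_COALESCE_GAP")
    if raw is None or raw.strip() == "":
        return DEFAULT_GAP
    try:
        return int(raw)
    except ValueError:
        # Falling back to the default on bad input is an assumption.
        return DEFAULT_GAP
```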

P5 -- parse IFDs once per dask graph. read_geotiff_dask against an HTTP URL used to call read_to_array per chunk, which routed to _read_cog_http and fired a fresh 16 KB header GET every time. _read_cog_http is now split into _parse_cog_http_meta (one IFD parse) and _fetch_decode_cog_http_tiles (window-aware tile fetch + decode). read_geotiff_dask parses metadata once before constructing the graph and threads the parsed (TIFFHeader, IFD) into the delayed tasks via a new internal http_meta kwarg.
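
The effect of threading pre-parsed metadata into each task can be illustrated with the standard library alone. In this sketch `functools.partial` stands in for dask's delayed tasks, and `FakeHTTPSource`, `read_chunk`, and `build_tasks` are hypothetical names; only the `http_meta` kwarg comes from the description above.

```python
from functools import partial


class FakeHTTPSource:
    """Counts header reads; stands in for a real HTTP source."""

    def __init__(self):
        self.header_gets = 0

    def read_header(self):
        self.header_gets += 1
        return {"tile_size": 256}  # placeholder for (TIFFHeader, IFD)


def read_chunk(source, window, http_meta=None):
    # Legacy behaviour: re-fetch the header when no metadata is passed in.
    meta = http_meta if http_meta is not None else source.read_header()
    return (window, meta["tile_size"])


def build_tasks(source, windows, reuse_meta):
    meta = source.read_header()  # eager read, needed to lay out the graph
    kwargs = {"http_meta": meta} if reuse_meta else {}
    return [partial(read_chunk, source, w, **kwargs) for w in windows]


def count_header_gets(reuse_meta, n_chunks=16):
    source = FakeHTTPSource()
    for task in build_tasks(source, range(n_chunks), reuse_meta):
        task()
    return source.header_gets
```

Under these assumptions, `count_header_gets(False)` yields 17 (one eager read plus one per chunk) and `count_header_gets(True)` yields 1.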

Public API is unchanged. read_geotiff_dask, _HTTPSource.read_ranges, and read_to_array all keep their existing signatures.

Reproduction numbers

Mocked 50 ms RTT, 64-tile COG:

configuration                              wall time   range GETs
baseline (XRSPATIAL_COG_COALESCE_GAP=-1)   ~450 ms     65
coalesced (default 1 MB)                   ~100 ms     2
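
These figures are consistent with a simple latency model. The sketch below assumes that the header GET completes before any tile GET starts, that the GET counts above include that header request, and that transfer and decode time are negligible next to the RTT; none of this is stated in the PR beyond the ceil(N_tiles / 8) * RTT scaling.

```python
import math


def modelled_wall_ms(n_tile_gets, rtt_ms=50, workers=8, header_gets=1):
    """Latency-only estimate: sequential header reads, then pooled tile reads."""
    header_ms = header_gets * rtt_ms
    tile_ms = math.ceil(n_tile_gets / workers) * rtt_ms
    return header_ms + tile_ms


baseline_ms = modelled_wall_ms(64)   # 64 per-tile GETs through 8 workers
coalesced_ms = modelled_wall_ms(1)   # one merged GET
```

The model gives 450 ms for the baseline and 100 ms for the coalesced path, in line with the measured values.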

16-chunk dask graph against the same mocked HTTP COG:

  • before P5: 17 header GETs (one per chunk + one for the eager metadata read)
  • after P5: 1 header GET

Test plan

  • pytest xrspatial/geotiff/tests/test_http_cog_coalesce.py -x -q (11 new tests, all pass)
  • pytest xrspatial/geotiff/tests/test_cog.py xrspatial/geotiff/tests/test_cog_http_concurrent.py xrspatial/geotiff/tests/test_sparse_cog.py -x -q (36 tests, all pass)
  • Full geotiff test suite: 795 pass, 4 skipped. The only failures are 3 tests in test_features.py::TestPalette, which predate this change (recursion in palette plotting).

Two related performance fixes for the HTTP COG read path.

P2 -- range coalescing. _read_cog_http used to fire one GET Range:
request per tile through an 8-worker pool, so wall time scaled as
ceil(N_tiles / 8) * RTT. COG tiles are stored sequentially, so a new
helper coalesce_ranges merges adjacent (offset, length) entries whose
gap is below a threshold (default 1 MB, configurable via
XRSPATIAL_COG_COALESCE_GAP) and split_coalesced_bytes slices the
returned bytes back per-tile. _HTTPSource grows a read_ranges_coalesced
wrapper that calls the existing read_ranges underneath. On a 50 ms
RTT mocked link with 64 tiles the un-coalesced path takes ~450 ms; the
coalesced path takes ~100 ms and issues 2 GETs instead of 65.

P5 -- once-per-graph IFD parsing. read_geotiff_dask used to call
read_to_array per chunk, which on HTTP routed through _read_cog_http
and fired a fresh 16 KB header GET each time. The IFD parse is now
factored into _parse_cog_http_meta and the tile fetch into
_fetch_decode_cog_http_tiles (which honours a window). read_geotiff_dask
parses metadata once before constructing the dask graph and threads the
parsed (header, ifd) into delayed tasks via an http_meta kwarg. A
16-chunk dask graph now issues a single 16 KB header GET instead of
one per chunk.

Public API is unchanged. Existing test_cog_http_concurrent suite still
passes; new tests live in test_http_cog_coalesce.py.
github-actions bot added the "performance" label (PR touches performance-sensitive code) on May 8, 2026