Automate dataset download in demo_sarxarray.ipynb #89

@gmarupilla

Description

docs/notebooks/demo_sarxarray.ipynb assumes the nl_amsterdam_s1_asc_t088/ directory is already present in the working directory. The notebook's setup markdown points to the figshare landing page and asks the reader to download and unzip manually.

If the dataset hasn't been downloaded, path.rglob('*_cint_srd.raw') finds no files, the cell completes without any warning, and the next cell crashes with a confusing xarray error (see #88 for that side of it).
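
To make the silent failure concrete, here is a minimal sketch of a guard for the file-discovery cell. It assumes the notebook collects the files with `rglob('*_cint_srd.raw')` as described above; the helper name `find_raw_files` and the error wording are hypothetical, not part of the notebook or of sarxarray.

```python
from pathlib import Path


def find_raw_files(path, pattern="*_cint_srd.raw"):
    """Return the sorted matches under `path`, raising if there are none."""
    files = sorted(Path(path).rglob(pattern))
    if not files:
        # Fail in the discovery cell instead of in a later xarray call.
        raise FileNotFoundError(
            f"No files matching {pattern!r} under {Path(path).resolve()}. "
            "Download and unzip the dataset first."
        )
    return files
```

With a guard like this, a missing dataset is reported in the cell that looks for it rather than one cell later.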

A small download cell near the top of the notebook would make the demo one-click reproducible:

import pathlib
import urllib.request
import zipfile

url = "https://api.figshare.com/v2/file/download/41012180"
zip_path = pathlib.Path("nl_amsterdam.zip")
# Download and unpack only when the dataset directory is missing.
if not pathlib.Path("nl_amsterdam_s1_asc_t088").exists():
    urllib.request.urlretrieve(url, zip_path)
    with zipfile.ZipFile(zip_path) as zf:  # close the archive before deleting it
        zf.extractall(".")
    zip_path.unlink()

Or, if you'd rather not pull a 599 MB file unconditionally, pooch handles caching + checksum verification nicely and is widely used in the Pangeo ecosystem.
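
For comparison, the caching + checksum idea can be sketched with the standard library alone. This is not pooch and does not use its API; the function names `sha256_of` and `fetch_and_extract` are hypothetical, and the expected SHA-256 would have to be supplied by the maintainers, since none is given here.

```python
import hashlib
import urllib.request
import zipfile
from pathlib import Path


def sha256_of(file_path, chunk_size=1 << 20):
    """Hash a file in chunks so a large archive is never fully in memory."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_and_extract(url, zip_path, extract_dir, expected_sha256=None):
    """Download `url` to `zip_path` unless cached, verify it, then unzip it."""
    zip_path = Path(zip_path)
    if not zip_path.exists():  # reuse the cached archive on later runs
        urllib.request.urlretrieve(url, zip_path)
    # The check is skipped when no expected hash is provided.
    if expected_sha256 is not None and sha256_of(zip_path) != expected_sha256:
        raise ValueError(f"Checksum mismatch for {zip_path}")
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(extract_dir)
    return Path(extract_dir)
```

Unlike the snippet above, this keeps the zip on disk as the cache, so re-running the notebook does not download the archive again.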

Tested locally: the notebook runs end-to-end on Python 3.14.4, xarray 2026.4.0, dask 2026.3.0 once the dataset is in place.

(Filed in the context of JOSS review openjournals/joss-reviews#10492.)
