Commit 4e4307a

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
1 parent 1754d44 commit 4e4307a

File tree

1 file changed

+10
-10
lines changed

src/posts/flexible-indexing/index.md

Lines changed: 10 additions & 10 deletions
@@ -17,20 +17,20 @@ _TLDR: Xarray>2025.6 has been through a major refactoring of its internals that
1717

1818
# Exciting new ways to slice and dice your data with Xarray!
1919

20-
First things first, *what is an `index` and why is it helpful?*
20+
First things first, _what is an `index` and why is it helpful?_
2121

22-
> An *index* makes repeated subsetting and selection of data more efficient.
22+
> An _index_ makes repeated subsetting and selection of data more efficient.
2323
2424
Examples of indexes are all around you and are a fundamental way to organize and simplify access to information. If you want a book about Natural Sciences, you can go to your local library branch and head straight to section `500`. Or if you're in the mood for a good novel go to section `800` thanks to the Dewey Decimal System [(credit to Dewey, 1876)](https://en.wikipedia.org/wiki/Dewey_Decimal_Classification)!
2525

26-
Some indexes are less universal and more multi-dimensional: In my local grocery store I know that aisle 12, top shelf has the best cereal. And the second shelf on aisle 1 has the yogurt. In this example, *aisles 1-12 and shelves 1-5 are the coordinates* of our grocery, but the more informative *aisle content labels* are the indexes. Once you've mentally assigned labels to your grocery, you can get what you want quickly without needing to wander around!
26+
Some indexes are less universal and more multi-dimensional: In my local grocery store I know that aisle 12, top shelf has the best cereal. And the second shelf on aisle 1 has the yogurt. In this example, _aisles 1-12 and shelves 1-5 are the coordinates_ of our grocery, but the more informative _aisle content labels_ are the indexes. Once you've mentally assigned labels to your grocery, you can get what you want quickly without needing to wander around!
2727

28-
The same efficiencies arise in computing. Consider a simple 1D dataset consisting of measurements `Y=[10,20,30,40,50,60]` at six positions `X=[1, 2, 4, 8, 16, 32]`. *What was our measurement at `X=8`?*
28+
The same efficiencies arise in computing. Consider a simple 1D dataset consisting of measurements `Y=[10,20,30,40,50,60]` at six positions `X=[1, 2, 4, 8, 16, 32]`. _What was our measurement at `X=8`?_
2929

30-
To extract the answer in code we can loop over *all* the values of `X` to find `X=8`. In Python conventions we find it at position 3, then use that to get our answer `Y[3]=40`.
30+
To extract the answer in code we can loop over _all_ the values of `X` to find `X=8`. In Python conventions we find it at position 3, then use that to get our answer `Y[3]=40`.
3131

3232
> 💡 **Note:**
33-
With only 6 coordinates, we easily see `X[3]=8`, but for large datasets we should loop over *all* the coordinates to ensure there are no repeated values! This initial pass over all the coordinates to build an *index* takes some time and may not always be desirable.
33+
> With only 6 coordinates, we easily see `X[3]=8`, but for large datasets we should loop over _all_ the coordinates to ensure there are no repeated values! This initial pass over all the coordinates to build an _index_ takes some time and may not always be desirable.
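
The lookup described above can be sketched in plain Python. This is only an illustration: the function names `lookup_by_scan` and `build_index` are made up here, and a `dict` stands in for an index. It is not how pandas or Xarray implement theirs.

```python
X = [1, 2, 4, 8, 16, 32]
Y = [10, 20, 30, 40, 50, 60]


def lookup_by_scan(x_value):
    """Walk over X until x_value is found, then read Y at that position."""
    for position, x in enumerate(X):
        if x == x_value:
            return Y[position]
    raise KeyError(x_value)


def build_index(coords):
    """One pass over all coordinates, mapping each value to its position.

    The pass also checks that no coordinate value is repeated.
    """
    index = {}
    for position, x in enumerate(coords):
        if x in index:
            raise ValueError(f"repeated coordinate value: {x}")
        index[x] = position
    return index


# Every call to lookup_by_scan repeats the walk over X.
print(lookup_by_scan(8))  # 40

# Building the index costs one pass up front; later lookups reuse it.
x_index = build_index(X)
print(Y[x_index[8]])  # 40
```

The trade-off is the one the note describes: the index pays for a full pass over the coordinates once, which only makes sense when selections are repeated.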
3434
3535
## Pandas.Index
3636

@@ -77,9 +77,9 @@ A lot of work over the last several years has gone into the nuts and bolts of Xa
7777

7878
> real-world datasets are usually more than just raw numbers; they have labels which encode information about how the array values map to locations in space, time, etc. [Xarray Docs](https://docs.xarray.dev/en/stable/getting-started-guide/why-xarray.html#what-labels-enable)
7979
80-
We often think about metadata providing context for *measurement values* but metadata is also critical for coordinates! In particular, to align two different datasets we must ask if the coordinates are in the same coordinate system. In other words, do they share the same origin and scale?
80+
We often think about metadata providing context for _measurement values_ but metadata is also critical for coordinates! In particular, to align two different datasets we must ask if the coordinates are in the same coordinate system. In other words, do they share the same origin and scale?
8181

82-
There are currently over 7000 commonly used [Coordinate Reference Systems (CRS)](https://spatialreference.org/ref/epsg/) for geospatial data in the authoritative EPSG database! And of course an infinite number of custom-defined CRSs. [xproj.CRSIndex](https://xproj.readthedocs.io/en/latest/) gives Xarray objects an automatic awareness of the coordinate reference system, so operations like `xr.align()` no longer succeed when they should raise an error:
82+
There are currently over 7000 commonly used [Coordinate Reference Systems (CRS)](https://spatialreference.org/ref/epsg/) for geospatial data in the authoritative EPSG database! And of course an infinite number of custom-defined CRSs. [xproj.CRSIndex](https://xproj.readthedocs.io/en/latest/) gives Xarray objects an automatic awareness of the coordinate reference system, so operations like `xr.align()` no longer succeed when they should raise an error:
8383

8484
```python
8585
from xproj import CRSIndex
@@ -96,7 +96,7 @@ MergeError: conflicting values/indexes on objects to be combined for coordinate
9696

9797
### Rasterix RasterIndex
9898

99-
Earlier we mentioned that coordinates often have a *functional representation*. For 2D geospatial raster images, this function often takes the form of an [Affine Transform](https://en.wikipedia.org/wiki/Affine_transformation). This is how the [rasterix RasterIndex](https://github.com/xarray-contrib/rasterix) computes coordinates rather than storing them all in memory. Also, alignment by comparing transforms minimizes common errors due to floating-point mismatches.
99+
Earlier we mentioned that coordinates often have a _functional representation_. For 2D geospatial raster images, this function often takes the form of an [Affine Transform](https://en.wikipedia.org/wiki/Affine_transformation). This is how the [rasterix RasterIndex](https://github.com/xarray-contrib/rasterix) computes coordinates rather than storing them all in memory. Also, alignment by comparing transforms minimizes common errors due to floating-point mismatches.
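
To make the idea of a functional representation concrete, here is a minimal sketch in plain Python of computing coordinates from six affine coefficients instead of storing them. It is not the rasterix implementation: the helper names `pixel_to_coords` and `slice_transform`, the coefficient ordering `(a, b, c, d, e, f)`, and the grid values are all assumptions chosen for illustration.

```python
def pixel_to_coords(col, row, transform):
    """Map a pixel (col, row) to (x, y) using an affine transform.

    Assumed convention: x = a*col + b*row + c and y = d*col + e*row + f.
    """
    a, b, c, d, e, f = transform
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


def slice_transform(transform, col_start, row_start):
    """Transform for a subset whose pixel (0, 0) is (col_start, row_start).

    Only the offsets (c, f) change; pixel size and rotation stay the same.
    """
    a, b, _, d, e, _ = transform
    new_c, new_f = pixel_to_coords(col_start, row_start, transform)
    return (a, b, new_c, d, e, new_f)


# Hypothetical grid: 10-unit pixels, origin at x=500000, y=4000000, y decreasing.
transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)

print(pixel_to_coords(100, 50, transform))  # (501000.0, 3999500.0)

# Slicing never touches a coordinate array; it only yields a new transform.
subset_transform = slice_transform(transform, 100, 50)
print(pixel_to_coords(0, 0, subset_transform))  # (501000.0, 3999500.0)
```

Under this sketch, two grids can be compared through their six coefficients rather than through millions of computed coordinate values, which is the kind of comparison the paragraph above credits with avoiding floating-point mismatches.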
100100

101101
Below is a simple example of slicing a large mosaic of GeoTiffs without ever loading the coordinates into memory. Note that a new Affine is defined after the slicing operation:
102102

@@ -171,14 +171,14 @@ subset['population'].xvec.plot(col='year');
171171

172172
![xvec figure](/posts/flexible-indexes/xvecfig.png)
173173

174-
175174
## What's next?
176175

177176
While we're extremely excited about what can _already_ be accomplished with the new indexing capabilities, there are plenty of exciting ideas for future work.
178177

179178
Have an idea for your own custom index? Check out [this section of the Xarray documentation](https://docs.xarray.dev/en/stable/internals/how-to-create-custom-index.html). Also check out [A Gallery of Custom Index Examples](https://xarray-indexes.readthedocs.io)!
180179

181180
There are a few new indexes that will soon become part of the Xarray codebase!
181+
182182
- [IntervalIndex](https://github.com/pydata/xarray/pull/10296)
183183
- [NDPointIndex (KDTree)](https://github.com/pydata/xarray/pull/10478) # This is done?!
184184
