diff --git a/docs/contributing.md b/docs/contributing.md index 4bba874188..b9c7aa1aa2 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -255,7 +255,7 @@ print("Hello world") #### Building documentation without executing code blocks -Sometimes, you may want the documentation to build quicker. You can disable code block execution by commenting out the [markdown-exec](https://github.com/zarr-developers/zarr-python/blob/884a8c91afcc3efe28b3da952be3b85125c453cb/mkdocs.yml#L132 plugin in the mkdocs configuration file). This will make code blocks and cross references render incorrectly (i.e., expect build warnings), but also reduces build time by ~3x. Be sure to undo the commenting out before opening your pull request. +Sometimes, you may want the documentation to build more quickly. You can disable code block execution by commenting out the [markdown-exec plugin](https://github.com/zarr-developers/zarr-python/blob/884a8c91afcc3efe28b3da952be3b85125c453cb/mkdocs.yml#L132) in the mkdocs configuration file. This will make code blocks and cross references render incorrectly (i.e., expect build warnings), but also reduces build time by ~3x. Be sure to revert this change before opening your pull request. ### Changelog diff --git a/docs/user-guide/arrays.md b/docs/user-guide/arrays.md index 6a0ecfda92..4b52629645 100644 --- a/docs/user-guide/arrays.md +++ b/docs/user-guide/arrays.md @@ -512,7 +512,7 @@ print(np.all(z.oindex[[0, 2], :] == z[[0, 2], :])) ### Block Indexing -Zarr also support block indexing, which allows selections of whole chunks based on their +Zarr also supports block indexing, which allows selections of whole chunks based on their logical indices along each dimension of an array. For example, this allows selecting a subset of chunk aligned rows and/or columns from a 2-dimensional array. 
E.g.: diff --git a/docs/user-guide/consolidated_metadata.md b/docs/user-guide/consolidated_metadata.md index c5cd31e5fc..d8066b6846 100644 --- a/docs/user-guide/consolidated_metadata.md +++ b/docs/user-guide/consolidated_metadata.md @@ -17,7 +17,7 @@ If consolidated metadata is present in a Zarr Group's metadata then it is used by default. The initial read to open the group will need to communicate with the store (reading from a file for a [`zarr.storage.LocalStore`][], making a network request for a [`zarr.storage.FsspecStore`][]). After that, any subsequent -metadata reads get child Group or Array nodes will *not* require reads from the store. +metadata reads to get child Group or Array nodes will *not* require reads from the store. In Python, the consolidated metadata is available on the `.consolidated_metadata` attribute of the `GroupMetadata` object. @@ -49,7 +49,7 @@ print(result) ``` If we open that group, the Group's metadata has a `zarr.core.group.ConsolidatedMetadata` -that can be used.: +that can be used: ```python exec="true" session="consolidated_metadata" source="above" result="ansi" from pprint import pprint @@ -58,19 +58,19 @@ import io consolidated = zarr.open_group(store="memory://consolidated-metadata-demo") consolidated_metadata = consolidated.metadata.consolidated_metadata.metadata -# Note: pprint can be users without capturing the output regularly +# Note: normally, pprint can be used without capturing the output output = io.StringIO() pprint(dict(sorted(consolidated_metadata.items())), stream=output, width=60) print(output.getvalue()) ``` -Operations on the group to get children automatically use the consolidated metadata.: +Operations on the group to get children automatically use the consolidated metadata: ```python exec="true" session="consolidated_metadata" source="above" result="ansi" print(consolidated['a']) # no read / HTTP request to the Store is required ``` -With nested groups, the consolidated metadata is available on the children, 
recursively.: +With nested groups, the consolidated metadata is available on the children, recursively: ```python exec="true" session="consolidated_metadata" source="above" result="ansi" child = group.create_group('child', attributes={'kind': 'child'}) diff --git a/docs/user-guide/data_types.md b/docs/user-guide/data_types.md index 3cdafb5f28..3e10845979 100644 --- a/docs/user-guide/data_types.md +++ b/docs/user-guide/data_types.md @@ -262,7 +262,7 @@ here, it's possible to create it yourself: see [Adding New Data Types](#adding-n ### Example Usage -This section will demonstrates the basic usage of Zarr data types. +This section will demonstrate the basic usage of Zarr data types. Create a `ZDType` from a native data type: diff --git a/docs/user-guide/experimental.md b/docs/user-guide/experimental.md index c7f86337a3..1c6d952c7c 100644 --- a/docs/user-guide/experimental.md +++ b/docs/user-guide/experimental.md @@ -4,7 +4,7 @@ This section contains documentation for experimental Zarr Python features. The f ## `CacheStore` -Zarr Python 3.1.4 adds [`zarr.experimental.cache_store.CacheStore`][] provides a dual-store caching implementation +Zarr Python 3.1.4 adds [`zarr.experimental.cache_store.CacheStore`][], which provides a dual-store caching implementation that can be wrapped around any Zarr store to improve performance for repeated data access. This is particularly useful when working with remote stores (e.g., S3, HTTP) where network latency can significantly impact data access speed. diff --git a/docs/user-guide/extending.md b/docs/user-guide/extending.md index 39444135df..cefeb114c5 100644 --- a/docs/user-guide/extending.md +++ b/docs/user-guide/extending.md @@ -78,11 +78,11 @@ Coming soon. ## Custom array buffers -Zarr-python provides control over where and how arrays stored in memory through +Zarr-python provides control over where and how arrays are stored in memory through [`zarr.abc.buffer.Buffer`][]. 
Currently both CPU (the default) and GPU implementations are provided (see [Using GPUs with Zarr](gpu.md) for more information). You can implement your own buffer classes by implementing the interface defined in [`zarr.abc.buffer.BufferPrototype`][]. ## Other extensions -In the future, Zarr will support writing custom custom data types and chunk grids. +In the future, Zarr will support writing custom data types and chunk grids. diff --git a/docs/user-guide/gpu.md b/docs/user-guide/gpu.md index ff86263cf0..6189f39d3d 100644 --- a/docs/user-guide/gpu.md +++ b/docs/user-guide/gpu.md @@ -7,7 +7,7 @@ Zarr can use GPUs to accelerate your workload by running `zarr.Config.enable_gpu memory as the final stage of the codec pipeline. Data will still be read into or copied to host (CPU) memory for encoding and decoding. - In the future, codecs will be available compressing and decompressing data on + In the future, codecs will be available for compressing and decompressing data on the GPU, avoiding the need to move data between the host and device for compression and decompression. diff --git a/docs/user-guide/groups.md b/docs/user-guide/groups.md index 7664e4c1dc..5faa26a281 100644 --- a/docs/user-guide/groups.md +++ b/docs/user-guide/groups.md @@ -69,7 +69,7 @@ For more information on groups see the [`zarr.Group` API docs](../api/zarr/group ## Batch Group Creation You can also create multiple groups concurrently with a single function call. 
[`zarr.create_hierarchy`][] takes -a [`zarr Storage instance`](../api/zarr/storage.md) instance and a dict of `key : metadata` pairs, parses that dict, and +a [`zarr Storage`](../api/zarr/storage.md) instance and a dict of `key : metadata` pairs, parses that dict, and writes metadata documents to storage: ```python exec="true" session="groups" source="above" result="ansi" diff --git a/docs/user-guide/performance.md b/docs/user-guide/performance.md index 6d3ec063d2..6514b35a30 100644 --- a/docs/user-guide/performance.md +++ b/docs/user-guide/performance.md @@ -296,7 +296,7 @@ Zarr arrays and groups can be pickled, as long as the underlying store object ca pickled. With the exception of the `zarr.storage.MemoryStore`, any of the storage classes provided in the `zarr.storage` module can be pickled. -If an array or group is backed by a persistent store such as the a `zarr.storage.LocalStore`, +If an array or group is backed by a persistent store such as a `zarr.storage.LocalStore`, `zarr.storage.ZipStore` or `zarr.storage.FsspecStore` then the store data **are not** pickled. The only thing that is pickled is the necessary parameters to allow the store to re-open any underlying files or databases upon being unpickled. diff --git a/docs/user-guide/storage.md b/docs/user-guide/storage.md index e75cd21381..d5f840ab4b 100644 --- a/docs/user-guide/storage.md +++ b/docs/user-guide/storage.md @@ -119,7 +119,7 @@ print(array) ### Remote Store -The [`zarr.storage.FsspecStore`][] stores the contents of a Zarr hierarchy in following the same +The [`zarr.storage.FsspecStore`][] stores the contents of a Zarr hierarchy following the same logical layout as the [`LocalStore`][zarr.storage.LocalStore], except the store is assumed to be on a remote storage system such as cloud object storage (e.g. AWS S3, Google Cloud Storage, Azure Blob Store). 
The [`zarr.storage.FsspecStore`][] is backed by [fsspec](https://filesystem-spec.readthedocs.io) and can support any backend @@ -154,7 +154,7 @@ print(store) ### Memory Store -The [`zarr.storage.MemoryStore`][] an in-memory store that allows for serialization of +The [`zarr.storage.MemoryStore`][] is an in-memory store that allows for serialization of Zarr data (metadata and chunks) to a dictionary: ```python exec="true" session="storage" source="above" result="ansi" diff --git a/docs/user-guide/v3_migration.md b/docs/user-guide/v3_migration.md index b700bf7a25..21386c1522 100644 --- a/docs/user-guide/v3_migration.md +++ b/docs/user-guide/v3_migration.md @@ -126,7 +126,7 @@ The following sections provide details on breaking changes in Zarr-Python 3. ### The Store class -The Store API has changed significant in Zarr-Python 3. +The Store API has changed significantly in Zarr-Python 3. #### The base store class diff --git a/src/zarr/api/asynchronous.py b/src/zarr/api/asynchronous.py index 19b3c0c3fa..d5e8e64f9c 100644 --- a/src/zarr/api/asynchronous.py +++ b/src/zarr/api/asynchronous.py @@ -933,10 +933,9 @@ async def create( and these values must be instances of [`zarr.abc.codec.ArrayArrayCodec`][], or a dict representations of [`zarr.abc.codec.ArrayArrayCodec`][]. - For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the - the order if your filters is consistent with the behavior of each filter. + For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the order of your filters is consistent with the behavior of each filter. - The default value of ``"auto"`` instructs Zarr to use a default used based on the data + The default value of ``"auto"`` instructs Zarr to use a default based on the data type of the array and the Zarr format specified. For all data types in Zarr V3, and most data types in Zarr V2, the default filters are empty. 
The only cases where default filters are not empty is when the Zarr format is 2, and the data type is a variable-length data type like diff --git a/src/zarr/core/array.py b/src/zarr/core/array.py index b7e7fa1b35..2e8ca5445d 100644 --- a/src/zarr/core/array.py +++ b/src/zarr/core/array.py @@ -181,7 +181,7 @@ class DefaultFillValue: A string like ``"auto"`` is ambiguous because such a string is a valid fill value for an array with a string data type. An instance of this class lies outside the space of valid fill values, which means it can - umambiguously express that the default fill value should be used. + unambiguously express that the default fill value should be used. """ @@ -1339,7 +1339,7 @@ def _iter_shard_keys( ------ key: str The storage key of each shard in the selection or in case of no shard - present of each chunk although the latter case as technically incorrect. + present of each chunk although the latter case is technically incorrect. """ # Iterate over the coordinates of chunks in chunk grid space. return _iter_shard_keys( @@ -4079,7 +4079,7 @@ async def from_array( dict representations of [`zarr.abc.codec.ArrayArrayCodec`][]. For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the - the order if your filters is consistent with the behavior of each filter. + order of your filters is consistent with the behavior of each filter. The default value of ``"keep"`` instructs Zarr to infer ``filters`` from ``data``. If that inference is not possible, Zarr will fall back to the behavior specified by ``"auto"``, @@ -4096,7 +4096,7 @@ async def from_array( filters are applied (if any are specified) and the data is serialized into bytes. For Zarr format 3, a "compressor" is a codec that takes a bytestream, and - returns another bytestream. Multiple compressors my be provided for Zarr format 3. + returns another bytestream. Multiple compressors may be provided for Zarr format 3. 
For Zarr format 2, a "compressor" can be any numcodecs codec. Only a single compressor may be provided for Zarr format 2. @@ -4124,7 +4124,7 @@ async def from_array( Fill value for the array. If not specified, defaults to the fill value of the data array. order : {"C", "F"}, optional - The memory of the array (default is "C"). + The memory order of the array (default is "C"). For Zarr format 2, this parameter sets the memory order of the array. For Zarr format 3, this parameter is deprecated, because memory order is a runtime parameter for Zarr format 3 arrays. The recommended way to specify the memory @@ -4326,9 +4326,9 @@ async def init_array( dict representations of [`zarr.abc.codec.ArrayArrayCodec`][]. For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the - the order if your filters is consistent with the behavior of each filter. + order of your filters is consistent with the behavior of each filter. - The default value of ``"auto"`` instructs Zarr to use a default used based on the data + The default value of ``"auto"`` instructs Zarr to use a default based on the data type of the array and the Zarr format specified. For all data types in Zarr V3, and most data types in Zarr V2, the default filters are empty. The only cases where default filters are not empty is when the Zarr format is 2, and the data type is a variable-length data type like @@ -4354,7 +4354,7 @@ async def init_array( fill_value : Any, optional Fill value for the array. order : {"C", "F"}, optional - The memory of the array (default is "C"). + The memory order of the array (default is "C"). For Zarr format 2, this parameter sets the memory order of the array. For Zarr format 3, this parameter is deprecated, because memory order is a runtime parameter for Zarr format 3 arrays. The recommended way to specify the memory @@ -4576,9 +4576,9 @@ async def create_array( dict representations of [`zarr.abc.codec.ArrayArrayCodec`][]. 
For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the - the order if your filters is consistent with the behavior of each filter. + order of your filters is consistent with the behavior of each filter. - The default value of ``"auto"`` instructs Zarr to use a default used based on the data + The default value of ``"auto"`` instructs Zarr to use a default based on the data type of the array and the Zarr format specified. For all data types in Zarr V3, and most data types in Zarr V2, the default filters are empty. The only cases where default filters are not empty is when the Zarr format is 2, and the data type is a variable-length data type like @@ -4591,7 +4591,7 @@ async def create_array( filters are applied (if any are specified) and the data is serialized into bytes. For Zarr format 3, a "compressor" is a codec that takes a bytestream, and - returns another bytestream. Multiple compressors my be provided for Zarr format 3. + returns another bytestream. Multiple compressors may be provided for Zarr format 3. If no ``compressors`` are provided, a default set of compressors will be used. These defaults can be changed by modifying the value of ``array.v3_default_compressors`` in [`zarr.config`][zarr.config]. @@ -4611,7 +4611,7 @@ async def create_array( fill_value : Any, optional Fill value for the array. order : {"C", "F"}, optional - The memory of the array (default is "C"). + The memory order of the array (default is "C"). For Zarr format 2, this parameter sets the memory order of the array. For Zarr format 3, this parameter is deprecated, because memory order is a runtime parameter for Zarr format 3 arrays. The recommended way to specify the memory diff --git a/src/zarr/core/group.py b/src/zarr/core/group.py index 83a9cc6000..cbce1c1ed0 100644 --- a/src/zarr/core/group.py +++ b/src/zarr/core/group.py @@ -1062,9 +1062,9 @@ async def create_array( dict representations of [`zarr.abc.codec.ArrayArrayCodec`][]. 
For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the - the order if your filters is consistent with the behavior of each filter. + order of your filters is consistent with the behavior of each filter. - The default value of ``"auto"`` instructs Zarr to use a default used based on the data + The default value of ``"auto"`` instructs Zarr to use a default based on the data type of the array and the Zarr format specified. For all data types in Zarr V3, and most data types in Zarr V2, the default filters are empty. The only cases where default filters are not empty is when the Zarr format is 2, and the data type is a variable-length data type like @@ -1077,7 +1077,7 @@ async def create_array( filters are applied (if any are specified) and the data is serialized into bytes. For Zarr format 3, a "compressor" is a codec that takes a bytestream, and - returns another bytestream. Multiple compressors my be provided for Zarr format 3. + returns another bytestream. Multiple compressors may be provided for Zarr format 3. If no ``compressors`` are provided, a default set of compressors will be used. These defaults can be changed by modifying the value of ``array.v3_default_compressors`` in [`zarr.config`][zarr.config]. @@ -1099,7 +1099,7 @@ async def create_array( fill_value : Any, optional Fill value for the array. order : {"C", "F"}, optional - The memory of the array (default is "C"). + The memory order of the array (default is "C"). For Zarr format 2, this parameter sets the memory order of the array. For Zarr format 3, this parameter is deprecated, because memory order is a runtime parameter for Zarr format 3 arrays. The recommended way to specify the memory @@ -2445,9 +2445,9 @@ def create( dict representations of [`zarr.abc.codec.ArrayArrayCodec`][]. For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the - the order if your filters is consistent with the behavior of each filter. 
+ order of your filters is consistent with the behavior of each filter. - The default value of ``"auto"`` instructs Zarr to use a default used based on the data + The default value of ``"auto"`` instructs Zarr to use a default based on the data type of the array and the Zarr format specified. For all data types in Zarr V3, and most data types in Zarr V2, the default filters are empty. The only cases where default filters are not empty is when the Zarr format is 2, and the data type is a variable-length data type like @@ -2460,7 +2460,7 @@ def create( filters are applied (if any are specified) and the data is serialized into bytes. For Zarr format 3, a "compressor" is a codec that takes a bytestream, and - returns another bytestream. Multiple compressors my be provided for Zarr format 3. + returns another bytestream. Multiple compressors may be provided for Zarr format 3. If no ``compressors`` are provided, a default set of compressors will be used. These defaults can be changed by modifying the value of ``array.v3_default_compressors`` in [`zarr.config`][]. @@ -2482,7 +2482,7 @@ def create( fill_value : Any, optional Fill value for the array. order : {"C", "F"}, optional - The memory of the array (default is "C"). + The memory order of the array (default is "C"). For Zarr format 2, this parameter sets the memory order of the array. For Zarr format 3, this parameter is deprecated, because memory order is a runtime parameter for Zarr format 3 arrays. The recommended way to specify the memory @@ -2589,9 +2589,9 @@ def create_array( dict representations of [`zarr.abc.codec.ArrayArrayCodec`][]. For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the - the order if your filters is consistent with the behavior of each filter. + order of your filters is consistent with the behavior of each filter. 
- The default value of ``"auto"`` instructs Zarr to use a default used based on the data + The default value of ``"auto"`` instructs Zarr to use a default based on the data type of the array and the Zarr format specified. For all data types in Zarr V3, and most data types in Zarr V2, the default filters are empty. The only cases where default filters are not empty is when the Zarr format is 2, and the data type is a variable-length data type like @@ -2604,7 +2604,7 @@ def create_array( filters are applied (if any are specified) and the data is serialized into bytes. For Zarr format 3, a "compressor" is a codec that takes a bytestream, and - returns another bytestream. Multiple compressors my be provided for Zarr format 3. + returns another bytestream. Multiple compressors may be provided for Zarr format 3. If no ``compressors`` are provided, a default set of compressors will be used. These defaults can be changed by modifying the value of ``array.v3_default_compressors`` in [`zarr.config`][zarr.config]. @@ -2626,7 +2626,7 @@ def create_array( fill_value : Any, optional Fill value for the array. order : {"C", "F"}, optional - The memory of the array (default is "C"). + The memory order of the array (default is "C"). For Zarr format 2, this parameter sets the memory order of the array. For Zarr format 3, this parameter is deprecated, because memory order is a runtime parameter for Zarr format 3 arrays. 
The recommended way to specify the memory diff --git a/src/zarr/core/sync.py b/src/zarr/core/sync.py index 7bcb0bf034..260d4ad841 100644 --- a/src/zarr/core/sync.py +++ b/src/zarr/core/sync.py @@ -198,7 +198,7 @@ def collect_aiterator[T](data: AsyncIterator[T]) -> tuple[T, ...]: class SyncMixin: def _sync[T](self, coroutine: Coroutine[Any, Any, T]) -> T: - # TODO: refactor this to to take *args and **kwargs and pass those to the method + # TODO: refactor this to take *args and **kwargs and pass those to the method # this should allow us to better type the sync wrapper return sync( coroutine,