From 0e0ccc32a5e8fa822003f88b7a10981c0ccb0f24 Mon Sep 17 00:00:00 2001
From: Sean Arms <67096+lesserwhirls@users.noreply.github.com>
Date: Fri, 15 Aug 2025 09:58:24 -0600
Subject: [PATCH 1/2] Use v0.0.6 of the unidata-jekyll-theme

---
 docs/build.gradle                    | 2 +-
 project-files/jenkins/pipelines/docs | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/build.gradle b/docs/build.gradle
index c3fd43dee1..323f30e337 100644
--- a/docs/build.gradle
+++ b/docs/build.gradle
@@ -115,7 +115,7 @@ gradle.projectsEvaluated { // Several statements below rely upon all subproject
 apply from: "$rootDir/gradle/any/properties.gradle" // For Nexus credential properties.

-String docTheme = "unidata-jekyll-docs:0.0.5"
+String docTheme = "unidata-jekyll-docs:0.0.6"
 boolean isGitHub = System.getenv('GITHUB_ACTIONS') as boolean
 String imageBaseUrl = "docker.unidata.ucar.edu"

diff --git a/project-files/jenkins/pipelines/docs b/project-files/jenkins/pipelines/docs
index c16e4795be..3caf5c61a2 100644
--- a/project-files/jenkins/pipelines/docs
+++ b/project-files/jenkins/pipelines/docs
@@ -8,7 +8,7 @@ pipeline {
            -e DOCS_UID=$(id -u) \
            -v .:/netcdf-java \
            -v ./docs/build/site:/site \
-           docker.unidata.ucar.edu/unidata-jekyll-docs:0.0.5 build
+           docker.unidata.ucar.edu/unidata-jekyll-docs:0.0.6 build
         '''
       }
   }

From 9d06ccdb378fffd23fcfb0f93abe205945ec131e Mon Sep 17 00:00:00 2001
From: Sean Arms <67096+lesserwhirls@users.noreply.github.com>
Date: Fri, 15 Aug 2025 13:49:29 -0600
Subject: [PATCH 2/2] Update docs to use annotated code blocks

---
 .../pages/netcdfJava/developer/DiskCaching.md | 135 +++---
 .../netcdfJava/gribfeaturecollections.md      | 430 +++++++++---------
 .../runtime/runtimeloading.md                 |  57 +--
 3 files changed, 324 insertions(+), 298 deletions(-)

diff --git a/docs/src/site/pages/netcdfJava/developer/DiskCaching.md b/docs/src/site/pages/netcdfJava/developer/DiskCaching.md
index 3c985a3820..a1c27782f3 100644
--- a/docs/src/site/pages/netcdfJava/developer/DiskCaching.md
+++ b/docs/src/site/pages/netcdfJava/developer/DiskCaching.md
@@ -1,6 +1,6 @@
 ---
 title: Disk Caching
-last_updated: 2018-10-10
+last_updated: 2025-08-15
 sidebar: netcdfJavaTutorial_sidebar
 toc: false
 permalink: disk_caching.html
@@ -12,72 +12,73 @@ permalink: disk_caching.html

#### Writing temporary files using DiskCache

-There are a number of places where the CDM library needs to write temporary files to disk. If you end up using the file more than once, its useful to save these files. The CDM uses static methods in _ucar.nc2.util.DiskCache_ to manage how the temporary files are managed.
+There are a number of places where the CDM library needs to write temporary files to disk. If you end up using the file more than once, it's useful to save these files. The CDM uses static methods in `ucar.nc2.util.DiskCache` to manage these temporary files.

Before the CDM writes the temporary file, it looks to see if it already exists.

-1. If a filename ends with _".Z", ".zip", ".gzip", ".gz", or ".bz2", NetcdfFile.open_ will write an uncompressed file of the same name, but without the suffix.
+1. If a filename ends with `.Z`, `.zip`, `.gzip`, `.gz`, or `.bz2`, `NetcdfFile.open` will write an uncompressed file of the same name, but without the suffix.

-2. _Nexrad2_, _Cinrad2_ files that are compressed will be uncompressed to a file with an _.uncompress_ prefix.
-By default, DiskCache prefers to place the temporary file in the same directory as the original file.
If it does not have write permission in that directory, by default it will use the directory _${user_home}/.unidata/cache/_. You can change the directory by calling
+2. `Nexrad2`, `Cinrad2` files that are compressed will be uncompressed to a file with an `.uncompress` prefix.
+By default, DiskCache prefers to place the temporary file in the same directory as the original file. If it does not have write permission in that directory, by default it will use the directory `${user_home}/.unidata/cache/`. You can change the directory by calling

-_ucar.nc2.util.DiskCache.setRootDirectory(rootDirectory)._
+`ucar.nc2.util.DiskCache.setRootDirectory(rootDirectory)`.

You might want to always write temporary files to the cache directory, in order to manage them in a central place. To do so, call

-_ucar.nc2.util.DiskCache.setCachePolicy( boolean alwaysInCache)_ with parameter _alwaysInCache = true_.
+`ucar.nc2.util.DiskCache.setCachePolicy(boolean alwaysInCache)` with parameter `alwaysInCache = true`.

-You may want to limit the amount of space the disk cache uses (unless you always have data in writeable directories, so that the disk cache is never used). To scour the cache, call _DiskCache.cleanCache()_. There are several variations of the cleanup:
+You may want to limit the amount of space the disk cache uses (unless you always have data in writeable directories, so that the disk cache is never used). To scour the cache, call `DiskCache.cleanCache()`. There are several variations of the cleanup:

-* _DiskCache.cleanCache(Date cutoff, StringBuilder sbuff)_ will delete files older than the cutoff date.
-* _DiskCache.cleanCache(long maxBytes, StringBuilder sbuff)_ will retain maxBytes bytes, deleting oldest files first.
-* _DiskCache.cleanCache(long maxBytes, Comparator fileComparator, StringBuilder sbuff)_ will retain maxBytes bytes, deleting files in the order defined by your Comparator.
+* `DiskCache.cleanCache(Date cutoff, StringBuilder sbuff)` will delete files older than the cutoff date.
+* `DiskCache.cleanCache(long maxBytes, StringBuilder sbuff)` will retain maxBytes bytes, deleting oldest files first.
+* `DiskCache.cleanCache(long maxBytes, Comparator fileComparator, StringBuilder sbuff)` will retain maxBytes bytes, deleting files in the order defined by your Comparator.

-For long running application, you might want to do this periodically in a background timer thread, as in the following example.
+For a long-running application, you might want to do this periodically in a background timer thread, as in the following example.

-~~~
-1) Calendar c = Calendar.getInstance(); // contains current startup time
-   c.add( Calendar.MINUTE, 30); // add 30 minutes to current time // run task every 60 minutes, starting 30 minutes from now
-2) java.util.Timer timer = new Timer();
-   timer.scheduleAtFixedRate( new CacheScourTask(), c.getTime(), (long) 1000 * 60 * 60 );
-
-3) private class CacheScourTask extends java.util.TimerTask {
-    public void run() {
-      StringBuffer sbuff = new StringBuffer();
-4)    DiskCache.cleanCache(100 * 1000 * 1000, sbuff); // 100 Mbytes
-      sbuff.append("----------------------\n");
-5)    log.info(sbuff.toString());
-    }
-  }
-  ...
-  // upon exiting
-6) timer.cancel();
-~~~
-1. Get the current time and add 30 minutes to it
-2. Start up a timer that executes every 60 minutes, starting in 30 minutes
-3. Your class must extend TimerTask, the run method is called by the Timer
-4. Scour the cache, allowing 100 Mbytes of space to be retained
-5. Optionally log a message with the results of the scour.
-6. Make sure you cancel the timer before your application exits, or else the process will not terminate.

~~~java
// Get the current time and add 30 minutes to it
Calendar c = Calendar.getInstance();
c.add(Calendar.MINUTE, 30);

// Your class must extend TimerTask; the run method is called by the Timer
private class CacheScourTask extends java.util.TimerTask {
  public void run() {
    // cleanCache takes a StringBuilder, per the signatures above
    StringBuilder sbuff = new StringBuilder();
    // Scour the cache, allowing 100 Mbytes of space to be retained
    DiskCache.cleanCache(100 * 1000 * 1000, sbuff);
    sbuff.append("----------------------\n");
    // Optionally log a message with the results of the scour
    log.info(sbuff.toString());
  }
}

// Start up a timer that executes every 60 minutes, starting in 30 minutes
java.util.Timer timer = new Timer();
timer.scheduleAtFixedRate(new CacheScourTask(), c.getTime(), (long) 1000 * 60 * 60);

// Make sure you cancel the timer before your application exits, or else
// the process will not terminate
timer.cancel();
~~~

#### Writing temporary files using DiskCache2

-In a number of places, the _ucar.nc2.util.DiskCache2_ class is used to control caching. This does not use static methods, so can be configured for each individual use.
+In a number of places, the `ucar.nc2.util.DiskCache2` class is used to control caching. This does not use static methods, so it can be configured for each individual use.

The default constructor mimics DiskCache, using `${user_home}/.unidata/cache/` as the root directory:

`DiskCache2 dc2 = new DiskCache2();`

You can change the root directory by calling

-_dc2.setRootDirectory(rootDirectory)._
+`dc2.setRootDirectory(rootDirectory)`.

You can tell the class to scour itself in a background timer by using the constructor:

`DiskCache2 dc2 = new DiskCache2(rootDirectory, false, 24 * 60, 60);`

~~~java
/**
 * Create a cache on disk.
 * @param root the root directory of the cache. Must be writeable.
@@ -85,78 +86,78 @@ You can tell the class to scour itself in a background timer by using the constr
 * @param persistMinutes a file is deleted if its last modified time is greater than persistMinutes
 * @param scourEveryMinutes how often to run the scour process. If <= 0, don't scour.
 */
public DiskCache2(String root, boolean relativeToHome, int persistMinutes, int scourEveryMinutes);
~~~

You can change the cache policy from the default `CachePathPolicy.OneDirectory` by calling (e.g.):

~~~java
dc2.setCachePathPolicy(CachePathPolicy.NestedTruncate, null);

/**
 * Set the cache path policy
 * @param cachePathPolicy one of:
 *   OneDirectory (default): replace "/" with "-", so all files are in one directory.
 *   NestedDirectory: cache files are in nested directories under the root.
 *   NestedTruncate: eliminate leading directories
 *
 * @param cachePathPolicyParam for NestedTruncate, eliminate this string
 */
public void setCachePathPolicy(CachePathPolicy cachePathPolicy, String cachePathPolicyParam);
~~~

You can ensure that the cache is always used with:

`dc2.setCacheAlwaysUsed(true);`

Otherwise, the cache will try to write the temporary file in the same directory as the data file, and only use the cache if that directory is not writeable.

### GRIB Indexing and Caching

In 4.3 and above, for each GRIB file the CDM writes a _grib index file_ using the filename plus suffix `.gbx9`. So a file named `filename.grib1` will have an index file `filename.grib1.gbx9` created for it the first time that it's read. Usually a _cdm index file_ is also created, using the filename plus suffix `.ncx`. So a file named `filename.grib1` will have an index file `filename.grib1.ncx` created for it the first time. When a GRIB file is only part of a collection of GRIB files, then the ncx file may be created only for the collection.

The location of these index files is controlled by a caching strategy. The default strategy is to try to place the index files in the same directory as the data file. If that directory is not writeable, then the default strategy is to write the index files in the default caching directory. In a client application using the CDM, that default will be

`${user_home}/.unidata/cache/`.

On the TDS it will be

`${tomcat_home}/content/thredds/cache/cdm`

Clients of the CDM can change the GRIB caching behavior by configuring a DiskCache2 and calling:

`ucar.nc2.grib.GribCollection.setDiskCache2(DiskCache2 dc);`

### Object Caching

#### NetcdfFileCache

NetcdfFile objects are cached in memory for performance. When acquired, the object is locked so another thread cannot use it. When closed, the lock is removed. When the cache is full, older objects are removed from the cache, and all resources released.

Note that typically a `java.io.RandomAccessFile` object, holding an OS file handle, is open while it's in the cache. You must make sure that your cache size is not so large that you run out of file handles due to NetcdfFile object caching. Most aggregations do not hold more than one file handle open, no matter how many files are in the aggregation.
The exception to that is a Union aggregation, which holds each of the files in the union open for the duration of the NetcdfFile object. Holding a file handle open also creates a read lock on some operating systems, which will prevent the file from being opened in write mode.

To enable caching, you must first call

~~~java
NetcdfDataset.initNetcdfFileCache(int minElementsInMemory, int maxElementsInMemory, int period);
~~~

where `minElementsInMemory` is the number of objects to keep in the cache when cleaning up, `maxElementsInMemory` triggers a cleanup if the cache size goes over it, and `period` specifies the time in seconds to do periodic cleanups.

After enabling, you can disable with:

~~~java
NetcdfDataset.disableNetcdfFileCache();
~~~

However, you can't re-enable after disabling.

Setting `minElementsInMemory` to zero will remove all files not currently in use every `period` seconds.

Normally the cleanup is done in a background thread so as not to interfere with your application, and the maximum number of elements is approximate. When resources such as file handles must be carefully managed, you can set a hard limit with this call:

~~~java
NetcdfDataset.initNetcdfFileCache(int minElementsInMemory, int maxElementsInMemory, int hardLimit, int period);
~~~

so that as soon as the number of NetcdfFile objects exceeds `hardLimit`, a cleanup is done immediately in the calling thread.
\ No newline at end of file
diff --git a/docs/src/site/pages/netcdfJava/gribfeaturecollections.md b/docs/src/site/pages/netcdfJava/gribfeaturecollections.md
index e6c15dcbee..5a64d7eb0c 100644
--- a/docs/src/site/pages/netcdfJava/gribfeaturecollections.md
+++ b/docs/src/site/pages/netcdfJava/gribfeaturecollections.md
@@ -1,6 +1,6 @@
 ---
 title: GRIB Feature Collections
-last_updated: 2019-11-05
+last_updated: 2025-08-15
 sidebar: netcdfJavaTutorial_sidebar
 permalink: grib_feature_collections_ref.html
 toc: false
 ---

New indexing scheme allows fast access and scalability to very large datasets. Multiple horizontal domains are supported and placed into separate groups. Interval time coordinates are fully supported.

-### Version 4.5
+## Version 4.5

The GRIB Collections framework has been rewritten in CDM version 4.5, in order to handle large collections efficiently. Version 4.5 requires Java 7. Some of the new capabilities in version 4.5 are:

GRIB Collections now keep track of both the reference time and valid time. The collection is partitioned by reference time.
A collection with a single reference time will have a single partition with a single time coordinate.
A collection with multiple reference times will have partitions for each reference time, plus a PartitionCollection that represents the entire collection. Very large collections should be partitioned by directory and/or file, creating a tree of partitions.
A PartitionCollection has two datasets (kept in separate groups), the TwoD and the Best dataset.
-Also see: -## Feature Collection overview +## Example 1 (timePartition="none") -GRIB specific configuration -GRIB Collection FAQs -CDM GRIB Collection Processing - -### Example 1 (timePartition="none"): -~~~ -1) -2) -3) GRIB-2 +{% highlight_with_annotations xml %} +{% raw %}{% annotation 1 %}{% endraw %} + {% raw %}{% annotation 2 %}{% endraw %} + GRIB-2{% raw %}{% annotation 3 %}{% endraw %} all Grid - + -4) + {% raw %}{% annotation 4 %}{% annotation 5 %}{% annotation 6 %}{% endraw %} -7) -8) -9) + {% raw %}{% annotation 7 %}{% endraw %} + {% raw %}{% annotation 8 %}{% endraw %} + {% raw %}{% annotation 9 %}{% endraw %} -~~~ +{% endhighlight_with_annotations %} + +* {% annotation 1 %} A featureCollection must have a name, a featureType and a path (do not set an ID attribute). Note that the featureType attribute must now equal GRIB1 or GRIB2, not plain GRIB. +* {% annotation 2 %} A featureCollection is an InvDataset, so it can contain any elements an InvDataset can contain. It must have or inherit a default service. +* {% annotation 3 %} The collection must consist of either GRIB-1 or GRIB-2 files (not both). You no longer should set the dataFormat element to indicate which, as it is specified in the featureType, and will be added automatically. +* {% annotation 4 %} The collection name should be short but descriptive, it must be unique across all collections on your TDS, and should not change. +* {% annotation 5 %} The collection specification defines the collection of files that are in this dataset. +* {% annotation 6 %} The partitionType is none. +* {% annotation 7 %} This update element tells the TDS to use the existing indices, and to read them only when an external trigger is sent. This is the default behavior as of 4.5.4. +* {% annotation 8 %} This tdm element tells the TDM to test every 15 minutes if the collection has changed, and to rewrite the indices and send a trigger to the TDS when it has changed. +* {% annotation 9 %} GRIB specific configuration. -A featureCollection must have a name, a featureType and a path (do not set an ID attribute). Note that the featureType attribute must now equal GRIB1 or GRIB2, not plain GRIB. -A featureCollection is an InvDataset, so it can contain any elements an InvDataset can contain. It must have or inherit a default service. -The collection must consist of either GRIB-1 or GRIB-2 files (not both). You no longer should set the dataFormat element to indicate which, as it is specified in the featureType, and will be added automatically. -The collection name should be short but descriptive, it must be unique across all collections on your TDS, and should not change. -The collection specification defines the collection of files that are in this dataset. -The partitionType is none. -This update element tells the TDS to use the existing indices, and to read them only when an external trigger is sent. This is the default behavior as of 4.5.4. -This tdm element tells the TDM to test every 15 minutes if the collection has changed, and to rewrite the indices and and send a trigger to the TDS when it has changed. -GRIB specific configuration. -Resulting Datasets: -The above example generates a TwoD and Best dataset for the entire collection, a reference to the latest datset, as well as one dataset for each reference time in the collection, which become nested datasets in the catalog. 
These datasets are named by their index files, in the form `<collection name>.ncx3`, e.g. GFS-Puerto_Rico-20141110-000000.ncx3

The simplified catalog is:

~~~xml
 
 
   VirtualServices
   GRID
   GRIB-2
 
 
   Two time dimensions: reference and forecast; full access to all GRIB records
 
   Single time dimension: for each forecast time, use GRIB record with the smallest offset from reference time
 
 
   latest
 
 
 
 
~~~

The catalogRefs are links to virtual datasets, formed from the collection of records for the specified reference time, and independent of which file stores them.

-### Example 2 (timePartition="directory"):
+## Example 2 (timePartition="directory")

Now suppose that we modify the above example and use timePartition="directory":

{% highlight_with_annotations xml %}
   all
   Grid
 
 
 
 
 
 
   GRIB-2
 
   {% raw %}{% annotation 1 %}{% annotation 2 %}{% endraw %}
   {% raw %}{% annotation 3 %}{% endraw %}
{% endhighlight_with_annotations %}

-The collection is divided into partitions. In this case, each file becomes a separate partition. In order to use this, each file must contain GRIB records from a single runtime.
-The starting time of the partition must be encoded into the filename. One must define a date extractor in the collection specification, or by using a dateFormatMark, as in this example.
-In this example, the collection is readied when the server starts up. Manual triggers for updating are enabled.
+* {% annotation 1 %} The collection is divided into partitions. In this case, each file becomes a separate partition. In order to use this, each file must contain GRIB records from a single runtime.
+* {% annotation 2 %} The starting time of the partition must be encoded into the filename. One must define a date extractor in the collection specification, or by using a dateFormatMark, as in this example.
+* {% annotation 3 %} In this example, the collection is readied when the server starts up. Manual triggers for updating are enabled.

### Resulting Datasets

A time partition generates one collection dataset, one dataset for each partition, and one dataset for each individual file in the collection:

~~~xml
 
 
 
 
 
 ...
 
~~~

De-referencing the catalogRefs, and simplifying:

{% highlight_with_annotations xml %}
 
   {% raw %}{% annotation 1 %}{% endraw %}
   {% raw %}{% annotation 2 %}{% endraw %}
     {% raw %}{% annotation 3 %}{% endraw %}
   ...
 
   {% raw %}{% annotation 4 %}{% endraw %}
 
   ...
 
   ...
+ -~~~ +{% endhighlight_with_annotations %} + +* {% annotation 1 %} The overall collection dataset +* {% annotation 2 %} The first partition collection, with a partitionName = `name_startingTime` +* {% annotation 3 %} The files in the first partition +* {% annotation 4 %} The second partition collection, etc + +So the datasets that are generated from a `Time Partition` with `name`, `path`, and `partitionName`: -The overall collection dataset -The first partition collection, with a partitionName = name_startingTime -The files in the first partition -The second partition collection, etc -So the datasets that are generated from a Time Partition with name, path, and partitionName: +| dataset type | catalogRef | name | path | +|:-----------------|:-------------------------------------|:--------------|:------------------------------| +| collection | path/collection/catalog.xml | name | path/name/collection | +| partitions | path/partitionName/catalog.xml | partitionName | path/partitionName/collection | +| individual files | path/partitionName/files/catalog.xml | filename | path/files/filename | -dataset catalogRef name path -|: -collection path/collection/catalog.xml name path/name/collection -partitions path/partitionName/catalog.xml partitionName path/partitionName/collection -individual files path/partitionName/files/catalog.xml filename path/files/filename -Example 3 (Multiple Groups) : -When a Grib Collection contains multiple horizontal domains (i.e. distinct Grid Definition Sections (GDS)), each domain gets placed into a separate group. As a rule, one can't tell if there are separate domains without reading the files. If you open this collection through the CDM (eg using ToolsUI) you would see a dataset that contains groups. The TDS, however, separates groups into different datasets, so that each dataset has only a single (unnamed, aka root) group. +## Example 3 (Multiple Groups) + +When a Grib Collection contains multiple horizontal domains (i.e. distinct Grid Definition Sections (GDS)), each domain gets placed into a separate group. As a rule, one can't tell if there are separate domains without reading the files. If you open this collection through the CDM (e.g. using ToolsUI) you would see a dataset that contains groups. The TDS, however, separates groups into different datasets, so that each dataset has only a single (unnamed, aka root) group. ~~~ - - - GRIB-1 - all - - -1) - - - - - - - - - - - - - - - - - - - - + ~~~ -This dataset has many different groups, and we are using a element to name them (see below for details). +{% highlight_with_annotations xml %} + + + all + + + {% raw %}{% annotation 1 %}{% endraw %} + + + + + + + + + + + + + + + + + + + +{% endhighlight_with_annotations %} + +* {% annotation 1 %} This dataset has many different groups, and we are using a element to name them. + +### Resulting Datasets: -Resulting Datasets: For each group, this generates one collection dataset, and one dataset for each individual file in the group: -~~~ + +~~~xml @@ -234,100 +239,119 @@ For each group, this generates one collection dataset, and one dataset for each ... -Note that the groups are sorted by name, and that there is no overall collection for the dataset. Simplifying: +~~~ + +Note that the groups are sorted by name, and that there is no overall collection for the dataset. +Simplifying: + +{% highlight_with_annotations xml %} -1) -2) + {% raw %}{% annotation 1 %}{% endraw %} + {% raw %}{% annotation 2 %}{% endraw %} - ... + ... 
{% raw %}{% annotation 3 %}{% endraw %}
   ...
 
   ...
 
{% endhighlight_with_annotations %}

* {% annotation 1 %} The first group collection dataset
* {% annotation 2 %} The files in the first group
* {% annotation 3 %} The second group collection dataset, etc

So the datasets that are generated from a `Grib Collection` with `groupName` and `path`:

| dataset          | catalogRef                       | name      | path                      |
|:-----------------|:---------------------------------|:----------|:--------------------------|
| group collection |                                  | groupName | path/groupName/collection |
| individual files | path/groupName/files/catalog.xml | filename  | path/files/filename       |

## Example 4 (Time Partition with Multiple Groups)

Here is a time partitioned dataset with multiple groups:

{% highlight_with_annotations xml %}
 
 
   GRIB-2
 
   {% raw %}{% annotation 1 %}{% annotation 2 %}{% endraw %}
 
 
   {% raw %}{% annotation 3 %}{% endraw %}
   {% raw %}{% annotation 4 %}{% endraw %}
 
   ...
   {% raw %}{% annotation 5 %}{% endraw %}
 
 
{% endhighlight_with_annotations %}

* {% annotation 1 %} Partition the files by which directory they are in (the files must be time partitioned by the directories)
* {% annotation 2 %} One still needs a date extractor from the filename, even when using a directory partition.
* {% annotation 3 %} Minor errors in GRIB coding can create spurious differences in the GDS. Here we correct one such problem (see below for details).
* {% annotation 4 %} Group renaming as in example 2
* {% annotation 5 %} Exclude GRIB records that have a time coordinate interval of (0,0) (see below for details).

### Resulting Datasets

A time partition with multiple groups generates an overall collection dataset for each group, a collection dataset for each group in each partition, and a dataset for each individual file:

{% highlight_with_annotations xml %}
 
   {% raw %}{% annotation 1 %}{% endraw %}
   {% raw %}{% annotation 4 %}{% endraw %}
   {% raw %}{% annotation 8 %}{% endraw %}
   ...

{% endhighlight_with_annotations %}

De-referencing the catalogRefs, and simplifying:

{% highlight_with_annotations xml %}
 
   {% raw %}{% annotation 1 %}{% endraw %}
   {% raw %}{% annotation 2 %}{% endraw %}
   {% raw %}{% annotation 3 %}{% endraw %}
   ...
   {% raw %}{% annotation 4 %}{% endraw %}
   {% raw %}{% annotation 5 %}{% endraw %}
   {% raw %}{% annotation 6 %}{% endraw %}
 
   {% raw %}{% annotation 7 %}{% endraw %}
 
   ...
   {% raw %}{% annotation 8 %}{% endraw %}
   ...
- + {% raw %}{% annotation 8 %}{% endraw %} + ... + -Container for the overall collection datasets -The overall collection for the first group -The overall collection for the second group, etc -Container for the first partition -The collection dataset for the first group of the first partition -The individual files for the first group of the first partition, etc -The collection dataset for the second group of the first partition, etc. -Container for the second partition, etc -So the datasets that are generated from a Time Partition with name, path, groupName, and partitionName: - -dataset catalogRef name path -overall collection for group path/groupName/collection/catalog.xml groupName path/name/groupName -collection for partition and group path/partitionName/catalog.xml groupName path/partitionName/groupName -individual files path/partitionName/groupName/files/catalog.xml partitionName/filename path/files/filename +{% endhighlight_with_annotations %} + +* {% annotation 1 %} Container for the overall collection datasets +* {% annotation 2 %} The overall collection for the first group +* {% annotation 3 %} The overall collection for the second group, etc +* {% annotation 4 %} Container for the first partition +* {% annotation 5 %} The collection dataset for the first group of the first partition +* {% annotation 6 %} The individual files for the first group of the first partition, etc +* {% annotation 7 %} The collection dataset for the second group of the first partition, etc. +* {% annotation 8 %} Container for the second partition, etc + +So the datasets that are generated from a `Time Partition` with `name`, `path`, `groupName`, and `partitionName`: + +| dataset | catalogRef | name | path | +|:-----------------------------------|:-----------------------------------------------|:-----------------------|:-----------------------------| +| overall collection for group | path/groupName/collection/catalog.xml | groupName | path/name/groupName | +| collection for partition and group | path/partitionName/catalog.xml | groupName | path/partitionName/groupName | +| individual files | path/partitionName/groupName/files/catalog.xml | partitionName/filename | path/files/filename | diff --git a/docs/src/site/pages/netcdfJava_tutorial/runtime/runtimeloading.md b/docs/src/site/pages/netcdfJava_tutorial/runtime/runtimeloading.md index 98374f269a..9eea052ce1 100644 --- a/docs/src/site/pages/netcdfJava_tutorial/runtime/runtimeloading.md +++ b/docs/src/site/pages/netcdfJava_tutorial/runtime/runtimeloading.md @@ -1,6 +1,6 @@ --- title: Runtime loading -last_updated: 2025-08-12 +last_updated: 2025-08-15 sidebar: netcdfJavaTutorial_sidebar permalink: runtime_loading.html toc: false @@ -107,34 +107,35 @@ Instead of calling the above routines in your code, you can pass the CDM library Note that your application must call `ucar.nc2.util.xml.RuntimeConfigParser.read()`. The configuration file looks like this: -~~~xml - - -1) -2) -3) -4) -5) C:/grib/tables/ons288.xml -6) C:/grib/tables/ncepLookup.txt -7) -8) -9) - /usr/local/lib - netcdf - false - - -~~~ -1. Loads an `IOServiceProvider` with the given class name -2. Loads a `CoordSysBuilderIF` with the given class name, which looks for the given `Convention` attribute value. -3. Loads a `CoordTransformFactory` with the given class name, which looks for the given `transformName` in the dataset. The type must be vertical or projection. -4. Loads a `FeatureDatasetFactory` with the given class name which open `FeatureDatasets` of the given `featureType`. -5. 
Load a [GRIB-1 parameter table](grib_tables.html) (as of version 4.3)
6. Load a [GRIB-1 parameter table lookup](grib_tables.html#standard-table-mapping) (as of version 4.3)
7. Load a [BUFR table lookup](bufr_tables.html) file.
8. Turn [strict GRIB1 table handling](grib_tables.html#strict) off.
9. Configure how the [NetCDF-4 C library](netcdf4_c_library.html) is discovered and used.

{% highlight_with_annotations xml %}
 
 
   {% raw %}{% annotation 1 %}{% endraw %}
   {% raw %}{% annotation 2 %}{% endraw %}
   {% raw %}{% annotation 3 %}{% endraw %}
   {% raw %}{% annotation 4 %}{% endraw %}
   C:/grib/tables/ons288.xml{% raw %}{% annotation 5 %}{% endraw %}
   C:/grib/tables/ncepLookup.txt{% raw %}{% annotation 6 %}{% endraw %}
   {% raw %}{% annotation 7 %}{% endraw %}
   {% raw %}{% annotation 8 %}{% endraw %}
   {% raw %}{% annotation 9 %}{% endraw %}
     /usr/local/lib
     netcdf
     false
 
 
{% endhighlight_with_annotations %}

* {% annotation 1 %} Loads an `IOServiceProvider` with the given class name
* {% annotation 2 %} Loads a `CoordSysBuilderIF` with the given class name, which looks for the given `Convention` attribute value.
* {% annotation 3 %} Loads a `CoordTransformFactory` with the given class name, which looks for the given `transformName` in the dataset. The type must be vertical or projection.
* {% annotation 4 %} Loads a `FeatureDatasetFactory` with the given class name which opens `FeatureDatasets` of the given `featureType`.
* {% annotation 5 %} Load a [GRIB-1 parameter table](grib_tables.html) (as of version 4.3)
* {% annotation 6 %} Load a [GRIB-1 parameter table lookup](grib_tables.html#standard-table-mapping) (as of version 4.3)
* {% annotation 7 %} Load a [BUFR table lookup](bufr_tables.html) file.
* {% annotation 8 %} Turn [strict GRIB1 table handling](grib_tables.html#strict) off.
* {% annotation 9 %} Configure how the [NetCDF-4 C library](netcdf4_c_library.html) is discovered and used.
  * `libraryPath`: The directory in which the native library is installed.
  * `libraryName`: The name of the native library. This will be used to locate the proper `.DLL`, `.SO`, or `.DYLIB` file within the `libraryPath` directory.
  * `useForReading`: By default, the native library is only used for writing NetCDF-4 files; a pure-Java layer is responsible for reading them.
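For reference, here is a minimal sketch of how an application might load such a configuration file at startup, assuming the `read(InputStream, StringBuilder)` overload of `ucar.nc2.util.xml.RuntimeConfigParser` mentioned above. The file name `runtimeConfig.xml` is hypothetical:

~~~java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import ucar.nc2.util.xml.RuntimeConfigParser;

public class RuntimeConfigExample {
  public static void main(String[] args) throws IOException {
    // Collects messages about any entries the parser could not process
    StringBuilder errlog = new StringBuilder();

    // "runtimeConfig.xml" is a hypothetical path to a file like the one above
    try (InputStream is = new FileInputStream("runtimeConfig.xml")) {
      RuntimeConfigParser.read(is, errlog);
    }

    if (errlog.length() > 0) {
      System.out.println("RuntimeConfigParser: " + errlog);
    }
  }
}
~~~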