From 00f8354f1d149f5c1df5cc2f222bf6e2b902fac2 Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Tue, 31 Mar 2026 19:57:36 +0100 Subject: [PATCH 01/10] Add attribution metadat policy for BIG_DATA --- source/Development/jules_docs.rst | 6 +-- source/Reviewers/howtocommit.rst | 78 +++++++++++++++++++++---------- 2 files changed, 56 insertions(+), 28 deletions(-) diff --git a/source/Development/jules_docs.rst b/source/Development/jules_docs.rst index 21186dba..34e529bb 100644 --- a/source/Development/jules_docs.rst +++ b/source/Development/jules_docs.rst @@ -27,10 +27,10 @@ reStructuredText Extension for Fortran Namelists The JULES User Guide uses a custom extension to reStructuredText to allow a more natural expression of Fortran namelists -(see `doc/sphinxext/sphinx_nml_domain.py`_ if you are interested in -the implementation). +(see `doc/sphinxext/sphinx_nml_domain.py`_ if you are interested in the +implementation). -.. doc/sphinxext/sphinx_nml_domain.py: https://github.com/MetOffice/jules/blob/main/doc/sphinxext/sphinx_nml_domain.py +.. _doc/sphinxext/sphinx_nml_domain.py: https://github.com/MetOffice/jules/blob/main/doc/sphinxext/sphinx_nml_domain.py Documenting namelists --------------------- diff --git a/source/Reviewers/howtocommit.rst b/source/Reviewers/howtocommit.rst index d0a5e0a6..a1a6e973 100644 --- a/source/Reviewers/howtocommit.rst +++ b/source/Reviewers/howtocommit.rst @@ -22,7 +22,7 @@ of these steps outlined below. * `Repository Status`_ is used to coordinate ``main`` commits for all projects. - * This operates on a first-come-first-served queing system. + * This operates on a first-come-first-served queuing system. * To join the queue use the ``Add Item`` button. * Do not move yourself up the queue unless agreed with others. @@ -223,7 +223,7 @@ To update the test suite for an upgrade macro, please run: Do not push the changes at this stage. 3. 
Test (if no KGO) --------------------- +------------------- The amount of testing to be done at this stage depends on the complexity of the PR, and what has already been done. A minimum level is required for even @@ -324,19 +324,28 @@ are no clashes with what else has gone onto ``main``. .. code-block:: shell git pull - git switch - cd /user_guide/doc - conda activate jules-user-guide - make html - firefox build/html/index.html + git switch # optional + cd doc - To build and check the LaTeX PDF: + # Create and activate virtual environment + python3.12 -m venv .venv + .venv/bin/pip install . + source .venv/bin/activate + + # Generate html documentation in ./build/html + make clean html + + # At the Met Office you can also run the following to + # deploy the documents directly into ~/public_html/jules// + make clean deploy + + To generate PDF documentation, ensure you have a LaTeX distribution + installed and run the following command. The pdf will be generated as + ``./build/latex/JULES_User_Guide.pdf``: .. code-block:: shell make latexpdf - evince build/latex/JULES_User_Guide.pdf - 4. KGO & Supporting Data (if required) @@ -348,6 +357,25 @@ for all affected tests before you commit to the ``main``. Supporting data is stored in the filesystems of our machines and changes to use will require the reviewer to update those files (BIG DATA). +.. important:: **Attribution Metadata Policy** + + If the change requires a new or updated BIG DATA file then you will need + to work with the developer to ensure that data in BIG_DATA_DIR must include + clear attribution and licence metadata. Where possible, this should follow + existing UM ``ANCILDIR`` conventions, with ``.attribution`` and ``.license`` + files or equivalent NetCDF **global attributes** (at least, ``references``, + ``license``, ``source``, and ``history``). Attribution must reflect the + original data source and be provided by the data creators before deployment, + share, or distribution. 
+ + It is treated as an **Information Asset / licensing requirement**, not just + a best practice. + + Please refer to the + `Prerequisites section of the ANCILDIR-Deploy document + `__. + + *NB: These instructions are Met Office specific, other sites may manage their KGO differently* @@ -392,8 +420,8 @@ KGO differently* #. The script will ask you to enter some details regarding the PR. - * Platforms: enter each platform which has a kgo change, lower case - and space seperated, e.g. `azspice ex1a` + * Platforms: enter each platform which has a KGO change, lower case + and space separated, e.g. `azspice ex1a` * If running on the EX's it will ask for the host you ran on - this can be found from Cylc Review. * Path to your local clone - the script will check this exists and @@ -419,16 +447,16 @@ KGO differently* * This script will login as the relevant admin user as needed * After running for a platform, the newly created variables.cylc and shell script will be moved to Azspice - $UMDIR/kgo_update_files/. + ``$UMDIR/kgo_update_files/``. * Having run on each requested platform the new variables.cylc files will be copied into your clone - rose-stem/site/meto/variables_.cylc. + ``rose-stem/site/meto/variables_.cylc``. .. dropdown:: Updating KGO manually (rarely needed!) * Create a new directory for the new KGO. The naming convention is vnXX.X_tNNNN, where NNNN is the PR number. The location of - the KGO for the nightly is $UMDIR/standard_jobs. + the KGO for the nightly is ``$UMDIR/standard_jobs``. * Copy the new KGO from your rose-stem run into the directory vnXX.X_tNNNN created above. Note that you need to provide a complete set of files, not just ones which have changed answers. @@ -437,8 +465,8 @@ KGO differently* the previous version (i.e. move the old file to the new KGO directory and replace it with a sym-link to the updated version) But do not do this if the old version was a major release (vnX.X), - this is to allow intermediate kgo installs to be deleted later. 
- * Remember to RSync and update the bitcomparison table(see above). + this is to allow intermediate KGO installs to be deleted later. + * Remember to RSync and update the bit comparison table(see above). .. tab-item:: JULES @@ -496,10 +524,10 @@ KGO differently* #. Verify the checksums updated properly by retriggering the failed checksums. First retrigger ``export-source``, and then when complete ``export-source_ex1a`` if new checksums are present there - (there is no need to retigger azspice). You may need to change the + (there is no need to retrigger azspice). You may need to change the maximum window extent of the gui in order to see the succeeded tasks. Now you can retrigger the failed checksums - these should - now pass if the kgo was updated in the clone correctly. + now pass if the KGO was updated in the clone correctly. .. important:: @@ -511,14 +539,14 @@ KGO differently* .. tip:: Between running any required testing and installing the KGO check that the - failing rose-ana tasks match those in the developers trac.log. If any have - failed for other reasons (e.g. timeout) then these should be re-triggered - before attempting to install the KGO files. + failing rose-ana tasks match those in the developers ``trac.log``. If any + have failed for other reasons (e.g. timeout) then these should be + re-triggered before attempting to install the KGO files. 4.1 Managing BIG DATA -^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^ -Static input data, such as initialisations and ancilliaries, are required by +Static input data, such as initialisations and ancillaries, are required by many tests. .. tab-set:: @@ -601,7 +629,7 @@ If the requirement is to update existing files, then further care is required. 
--------- Once testing has passed on the local Met Office machines then ensure all -changes for macros and kgos have been committed to the local copy of the +changes for macros and KGOs have been committed to the local copy of the branch and then push the changes back to the remote branch. .. tip:: From f3061cc27eba55fe223c750a7568b6d035418f92 Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 2 Apr 2026 17:59:50 +0100 Subject: [PATCH 02/10] Include process on adding test data --- source/Development/developing_change.rst | 1 + source/Development/testdata.rst | 109 +++++++++++++++++++++++ source/Reviewers/howtocommit.rst | 19 ---- 3 files changed, 110 insertions(+), 19 deletions(-) create mode 100644 source/Development/testdata.rst diff --git a/source/Development/developing_change.rst b/source/Development/developing_change.rst index 01326522..556cda27 100644 --- a/source/Development/developing_change.rst +++ b/source/Development/developing_change.rst @@ -54,6 +54,7 @@ carefully: kgo diagnostics rose_stem + testdata testing .. important:: diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst new file mode 100644 index 00000000..b54b002b --- /dev/null +++ b/source/Development/testdata.rst @@ -0,0 +1,109 @@ +.. ----------------------------------------------------------------------------- + (c) Crown copyright Met Office. All rights reserved. + The file LICENCE, distributed with this code, contains details of the terms + under which the code may be used. + ----------------------------------------------------------------------------- + +.. _testdata: + +Adding Test Data +================ + +.. note:: + + This page is a placeholder for information about test data. It is not yet + complete and will be updated in due course. + + *The instructions here are Met Office specific, other sites may manage their + test data differently.* + +.. 
important:: **Attribution Metadata Policy** + + If the change requires a new or updated file in ``LFRIC_DATA_DIR`` then you + will need to work with the Information Asset Owner (IAO) to ensure that data + in ``LFRIC_DATA_DIR`` must include clear attribution and licence metadata. + Where possible, this should follow existing UM ``ANCILDIR`` conventions (`see + below `_), with ``.attribution`` and ``.license`` + files or equivalent NetCDF **global attributes** (at least, ``references``, + ``license``, ``source``, and ``history``). Attribution must reflect the + original data source and be provided by the data creators before deployment, + share, or distribution. + + It is treated as an **Information Asset / licensing requirement**, not just + a best practice. + + +For UM related datasets, please Email the `MIAO team `_ +to discuss the best way to share the data. + +.. _prerequisites-section: + +Prerequisites +------------- + +Before adding test data, you should have a good understanding of the change you +are making and the tests you will be adding. You should also have a good +understanding of the codebase and the testing framework you will be using. + +Licenses +~~~~~~~~ + +All files will require a licence and a record of where they have come from, both +for legal and auditing purposes. In your request please mention how the files +was generated/produced and where as well as what licence it has, and what the +conditions of the licence are. + +Before files can be deployed we must get IAO approval, we cannot do this without +knowing the licence of the files to be deployed. + +Metadata +~~~~~~~~ + +Any file requirements should be recorded in or alongside the files being +deployed. + +Note that if a source file has a licence that imposes requirements on derived +works, then an ancillary file (or an intermediate file used to generate an +ancillary) does count as a derived work for the purpose of recording metadata. 
+ +In cases where a file has been generated from multiple sources, it should be +made clear where each licence/attribution/acknowledgement has come from. + +NetCDF Files +^^^^^^^^^^^^ + +NetCDF files should have the relevant metadata included in the file itself. +The metadata should include the following information: + +* If there is a licence, it should be in a ``license`` global attribute as per + `ESIP Attribute Convention for Data Discovery `_. + +* If there is a paper attribution requirement, the relevant paper(s) should be + cited in the ``references`` global attribute as per + `CF conventions `_. + +* If there is an organisation attribution requirement, it should be in the + ``institution`` global attribute (again, as per CF). + +* If there is any other attribution requirement (e.g. for an individual), it + should be in the ``acknowledgement`` global attribute (again, as per ACDD). + +* If there are restrictions on usage (e.g. "research only"), these should be in + a ``restrictions`` global attribute. + +Other Files +^^^^^^^^^^^ + +* Licence should be in an accompanying plain text file with the same name as the + data file, but with a ``.license`` suffix. + +* Attribution should be in an accompanying plain text file with the same name as + the data file, but with a ``.attribution`` suffix. + +* Restrictions on usage (e.g. "research only") should be in an accompanying + plain text file with the same name as the data file, but with a + ``.restrictions`` suffix. + +If you have questions about the process or concerns about the provenance of the +data you want to include, please engage with the IAO as early as possible to +prevent delays to your change later on. diff --git a/source/Reviewers/howtocommit.rst b/source/Reviewers/howtocommit.rst index a1a6e973..659c3f10 100644 --- a/source/Reviewers/howtocommit.rst +++ b/source/Reviewers/howtocommit.rst @@ -357,25 +357,6 @@ for all affected tests before you commit to the ``main``. 
Supporting data is stored in the filesystems of our machines and changes to use will require the reviewer to update those files (BIG DATA). -.. important:: **Attribution Metadata Policy** - - If the change requires a new or updated BIG DATA file then you will need - to work with the developer to ensure that data in BIG_DATA_DIR must include - clear attribution and licence metadata. Where possible, this should follow - existing UM ``ANCILDIR`` conventions, with ``.attribution`` and ``.license`` - files or equivalent NetCDF **global attributes** (at least, ``references``, - ``license``, ``source``, and ``history``). Attribution must reflect the - original data source and be provided by the data creators before deployment, - share, or distribution. - - It is treated as an **Information Asset / licensing requirement**, not just - a best practice. - - Please refer to the - `Prerequisites section of the ANCILDIR-Deploy document - `__. - - *NB: These instructions are Met Office specific, other sites may manage their KGO differently* From ed9a99a1e1e0e8486245b62d81eed1e06fa0edf5 Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 14:43:21 +0100 Subject: [PATCH 03/10] Update source/Development/testdata.rst Apply CR suggestion Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com> --- source/Development/testdata.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index b54b002b..14f13d18 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -21,7 +21,7 @@ Adding Test Data If the change requires a new or updated file in ``LFRIC_DATA_DIR`` then you will need to work with the Information Asset Owner (IAO) to ensure that data - in ``LFRIC_DATA_DIR`` must include clear attribution and licence metadata. + in ``LFRIC_DATA_DIR`` includes clear attribution and licence metadata. 
Where possible, this should follow existing UM ``ANCILDIR`` conventions (`see below `_), with ``.attribution`` and ``.license`` files or equivalent NetCDF **global attributes** (at least, ``references``, From 812d7d8b9deddf4f5023878a88ce32ebbdb4d8af Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 14:44:08 +0100 Subject: [PATCH 04/10] Update source/Development/testdata.rst Apply CR suggestion Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com> --- source/Development/testdata.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index 14f13d18..832e6717 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -48,7 +48,7 @@ understanding of the codebase and the testing framework you will be using. Licenses ~~~~~~~~ -All files will require a licence and a record of where they have come from, both +All files require a licence and a record of where they have come from, both for legal and auditing purposes. In your request please mention how the files was generated/produced and where as well as what licence it has, and what the conditions of the licence are. 
From 3c29b5aba9cf9cde9f39a9ba87484a2acc671808 Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 14:44:45 +0100 Subject: [PATCH 05/10] Update source/Development/testdata.rst Apply CR suggestion Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com> --- source/Development/testdata.rst | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index 832e6717..12347a01 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -49,9 +49,8 @@ Licenses ~~~~~~~~ All files require a licence and a record of where they have come from, both -for legal and auditing purposes. In your request please mention how the files -was generated/produced and where as well as what licence it has, and what the -conditions of the licence are. +for legal and auditing purposes. In your request please describe where and how the +data was generated, and the terms and conditions of its licence. Before files can be deployed we must get IAO approval, we cannot do this without knowing the licence of the files to be deployed. From 15c63617b3b810ec9ee0d3fab6b68d838d7f2f2d Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 14:45:23 +0100 Subject: [PATCH 06/10] Update source/Development/testdata.rst Apply CR suggestion Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com> --- source/Development/testdata.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index 12347a01..39404cc5 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -52,8 +52,8 @@ All files require a licence and a record of where they have come from, both for legal and auditing purposes. 
In your request please describe where and how the data was generated, and the terms and conditions of its licence. -Before files can be deployed we must get IAO approval, we cannot do this without -knowing the licence of the files to be deployed. +Before any files can be deployed, they must be approved by an IAO and this cannot be done +without information about the licensing terms. Metadata ~~~~~~~~ From c824c7b0b97e4206deec64531512d88dde3d3b01 Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 14:45:39 +0100 Subject: [PATCH 07/10] Update source/Development/testdata.rst Apply CR suggestion Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com> --- source/Development/testdata.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index 39404cc5..21573723 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -58,7 +58,7 @@ without information about the licensing terms. Metadata ~~~~~~~~ -Any file requirements should be recorded in or alongside the files being +All file requirements should be recorded in or alongside the files being deployed. 
Note that if a source file has a licence that imposes requirements on derived From a4d2e66c04464da1ebc59282a261065b0035d58b Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 14:46:27 +0100 Subject: [PATCH 08/10] Update source/Development/testdata.rst Apply CR suggestion Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com> --- source/Development/testdata.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index 21573723..c82a3ab1 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -61,9 +61,9 @@ Metadata All file requirements should be recorded in or alongside the files being deployed. -Note that if a source file has a licence that imposes requirements on derived -works, then an ancillary file (or an intermediate file used to generate an -ancillary) does count as a derived work for the purpose of recording metadata. +If a source file has a licence that imposes requirements on derived +works, then any ancillary file (or an intermediate file used to generate an +ancillary) counts as a derived work for the purposes of recording metadata. In cases where a file has been generated from multiple sources, it should be made clear where each licence/attribution/acknowledgement has come from. From 77a3638f2d028d0781c584c9200ebb9f998a6564 Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 14:57:52 +0100 Subject: [PATCH 09/10] Update source/Development/testdata.rst Apply CR suggestion. 
Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com> --- source/Development/testdata.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index c82a3ab1..089436dc 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -74,7 +74,7 @@ NetCDF Files NetCDF files should have the relevant metadata included in the file itself. The metadata should include the following information: -* If there is a licence, it should be in a ``license`` global attribute as per +* The licence should be in a ``license`` global attribute as per `ESIP Attribute Convention for Data Discovery `_. * If there is a paper attribution requirement, the relevant paper(s) should be From dcbcfe8db11cacb5a832ef7e2d5abadee2af84e2 Mon Sep 17 00:00:00 2001 From: Yaswant Pradhan <2984440+yaswant@users.noreply.github.com> Date: Thu, 30 Apr 2026 15:02:29 +0100 Subject: [PATCH 10/10] Apply CR suggestion --- source/Development/testdata.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/source/Development/testdata.rst b/source/Development/testdata.rst index 089436dc..8d5fd577 100644 --- a/source/Development/testdata.rst +++ b/source/Development/testdata.rst @@ -65,8 +65,9 @@ If a source file has a licence that imposes requirements on derived works, then any ancillary file (or an intermediate file used to generate an ancillary) counts as a derived work for the purposes of recording metadata. -In cases where a file has been generated from multiple sources, it should be -made clear where each licence/attribution/acknowledgement has come from. +In cases where a file has been generated from multiple sources, the licences +must be compatible with each other and it should be made clear where each +licence/attribution/acknowledgement has come from. NetCDF Files ^^^^^^^^^^^^
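Reviewer note: the "Other Files" convention added to ``testdata.rst`` in this series (plain text sidecar files with ``.license``, ``.attribution`` and ``.restrictions`` suffixes) could be checked mechanically before deployment. The following is a minimal sketch only, not part of the patch. The helper names ``sidecar_path`` and ``missing_sidecars`` are hypothetical, and two points are assumptions rather than stated policy: that the suffix is appended to the full file name (``data.bin`` gives ``data.bin.license``), and that ``.restrictions`` is optional while the other two are always required.

```python
from pathlib import Path

# Suffixes named in the "Other Files" guidance. Treating these two as always
# required, and ".restrictions" as optional, is an assumption of this sketch.
REQUIRED_SUFFIXES = (".license", ".attribution")


def sidecar_path(data_file: Path, suffix: str) -> Path:
    """Return the sidecar path, assuming the suffix is appended to the full name."""
    return data_file.with_name(data_file.name + suffix)


def missing_sidecars(data_file: Path, suffixes=REQUIRED_SUFFIXES) -> list[str]:
    """List the suffixes whose sidecar file does not exist next to data_file."""
    return [s for s in suffixes if not sidecar_path(data_file, s).is_file()]
```

An empty list from ``missing_sidecars`` would mean both sidecar files are present. It says nothing about whether their contents are adequate, which still needs IAO review.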