Add attribution metadata policy for BIG_DATA #610
Open
Yaswant Pradhan (yaswant) wants to merge 13 commits into MetOffice:main from yaswant:big-data-policy
Commits
00f8354  Add attribution metadat policy for BIG_DATA (yaswant)
0fe8431  Merge branch 'main' into big-data-policy (yaswant)
f3061cc  Include process on adding test data (yaswant)
2ecfbf4  Merge branch 'main' into big-data-policy (yaswant)
ed9a99a  Update source/Development/testdata.rst (yaswant)
812d7d8  Update source/Development/testdata.rst (yaswant)
3c29b5a  Update source/Development/testdata.rst (yaswant)
15c6361  Update source/Development/testdata.rst (yaswant)
c824c7b  Update source/Development/testdata.rst (yaswant)
a4d2e66  Update source/Development/testdata.rst (yaswant)
77a3638  Update source/Development/testdata.rst (yaswant)
dcbcfe8  Apply CR suggestion (yaswant)
00bc7c1  Merge branch 'main' into big-data-policy (yaswant)
@@ -54,6 +54,7 @@ carefully:
   kgo
   diagnostics
   rose_stem
   testdata
   testing

.. important::

@@ -0,0 +1,109 @@
.. -----------------------------------------------------------------------------
   (c) Crown copyright Met Office. All rights reserved.
   The file LICENCE, distributed with this code, contains details of the terms
   under which the code may be used.
   -----------------------------------------------------------------------------

.. _testdata:

Adding Test Data
================

.. note::

   This page is a placeholder for information about test data. It is not yet
   complete and will be updated in due course.

*The instructions here are Met Office specific; other sites may manage their
test data differently.*

.. important:: **Attribution Metadata Policy**

   If the change requires a new or updated file in ``LFRIC_DATA_DIR`` then you
   will need to work with the Information Asset Owner (IAO) to ensure that data
   in ``LFRIC_DATA_DIR`` includes clear attribution and licence metadata.
   Where possible, this should follow existing UM ``ANCILDIR`` conventions (`see
   below <prerequisites-section_>`_), with ``.attribution`` and ``.license``
   files or equivalent NetCDF **global attributes** (at least ``references``,
   ``license``, ``source``, and ``history``). Attribution must reflect the
   original data source and be provided by the data creators before deployment,
   sharing, or distribution.

   This is treated as an **Information Asset / licensing requirement**, not just
   a best practice.

For UM-related datasets, please email the `MIAO team <mailto:miao@metoffice.gov.uk>`_
to discuss the best way to share the data.

.. _prerequisites-section:

Prerequisites
-------------

Before adding test data, make sure you understand the change you are making,
the tests you will be adding, and the codebase and testing framework you will
be using.

Licences
~~~~~~~~

All files require a licence and a record of where they have come from, both
for legal and auditing purposes. In your request, please describe where and how
the data was generated, and the terms and conditions of its licence.

Before any files can be deployed, they must be approved by an IAO, and this
cannot be done without information about the licensing terms.

Metadata
~~~~~~~~

All file requirements should be recorded in or alongside the files being
deployed.

If a source file has a licence that imposes requirements on derived
works, then any ancillary file (or an intermediate file used to generate an
ancillary) counts as a derived work for the purposes of recording metadata.

In cases where a file has been generated from multiple sources, the licences
must be compatible with each other and it should be made clear where each
licence/attribution/acknowledgement has come from.

NetCDF Files
^^^^^^^^^^^^

NetCDF files should have the relevant metadata included in the file itself.
The metadata should include the following information:

* The licence should be in a ``license`` global attribute as per the
  `ESIP Attribute Convention for Data Discovery <https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3#Recommended>`_
  (ACDD).

* If there is a paper attribution requirement, the relevant paper(s) should be
  cited in the ``references`` global attribute as per
  `CF conventions <https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#description-of-file-contents/>`_.

* If there is an organisation attribution requirement, it should be in the
  ``institution`` global attribute (again, as per CF).

* If there is any other attribution requirement (e.g. for an individual), it
  should be in the ``acknowledgement`` global attribute (again, as per ACDD).

* If there are restrictions on usage (e.g. "research only"), these should be in
  a ``restrictions`` global attribute.
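
The attribute list above could be checked before a request is sent to the IAO. The following is a minimal sketch, not part of the documented process: the function name ``missing_attributes`` and the split into "always" and "conditional" attributes are assumptions made for illustration. It works on a plain mapping of global attributes so that it does not depend on any particular NetCDF library.

```python
from __future__ import annotations

from typing import Mapping

# Assumption for this sketch: "license" is always expected, while the other
# attributes are only needed when the source licence imposes that requirement.
ALWAYS_REQUIRED = ("license",)
CONDITIONAL = ("references", "institution", "acknowledgement", "restrictions")


def missing_attributes(global_attrs: Mapping[str, object],
                       also_required: tuple[str, ...] = ()) -> list[str]:
    """Return the expected global attributes that are absent or blank."""
    unknown = set(also_required) - set(CONDITIONAL)
    if unknown:
        raise ValueError(f"not a recognised attribute: {sorted(unknown)}")
    wanted = ALWAYS_REQUIRED + tuple(also_required)
    return [name for name in wanted
            if not str(global_attrs.get(name, "")).strip()]


# Hypothetical attribute values, for illustration only.
# If the netCDF4 package is in use, a mapping like this could be built with
# {name: ds.getncattr(name) for name in ds.ncattrs()} on an open Dataset `ds`.
attrs = {"license": "CC-BY-4.0", "references": " "}
print(missing_attributes(attrs, also_required=("references", "institution")))
```

Treating a whitespace-only value as missing is a design choice of this sketch; an attribute that exists but says nothing would not satisfy an attribution requirement.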

Other Files
^^^^^^^^^^^

* The licence should be in an accompanying plain text file with the same name
  as the data file, but with a ``.license`` suffix.

* Attribution should be in an accompanying plain text file with the same name as
  the data file, but with a ``.attribution`` suffix.

* Restrictions on usage (e.g. "research only") should be in an accompanying
  plain text file with the same name as the data file, but with a
  ``.restrictions`` suffix.
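
A quick check for the accompanying files might look like the sketch below. It assumes the suffix is appended to the full data file name (so ``orography.dat`` pairs with ``orography.dat.license``); the text above does not say whether the suffix is appended or replaces the existing extension, so confirm the convention with the IAO. The helper names and the example file name are hypothetical.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Assumption for this sketch: licence and attribution files are always
# expected; a .restrictions file is only needed when usage is restricted.
REQUIRED_SUFFIXES = (".license", ".attribution")


def sidecar_path(data_file: Path, suffix: str) -> Path:
    """Return the sidecar path, appending `suffix` to the full file name."""
    return data_file.with_name(data_file.name + suffix)


def missing_sidecars(data_file: Path) -> list[str]:
    """Return the required suffixes for which no sidecar file exists."""
    return [suffix for suffix in REQUIRED_SUFFIXES
            if not sidecar_path(data_file, suffix).is_file()]


# Usage example in a throwaway directory with a hypothetical data file.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "orography.dat"
    data.write_bytes(b"")
    sidecar_path(data, ".license").write_text("Example licence text\n")
    print(missing_sidecars(data))  # only the attribution file is missing
```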

If you have questions about the process or concerns about the provenance of the
data you want to include, please engage with the IAO as early as possible to
prevent delays to your change later on.
Better not to imply that this is optional:
For LFRic, we prefer to include the metadata and license as NetCDF global attributes rather than storing them separately. This approach does not currently apply to UM ANCILDIR.
For non-NetCDF LFRic files, we follow the UM ANCILDIR convention.