-
Notifications
You must be signed in to change notification settings - Fork 44
Add preliminary CMIP7 support #2935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Allow use of branding_suffix in other projects than CMIP7 to select the right variable from the CMOR table
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2935 +/- ##
==========================================
- Coverage 95.59% 95.59% -0.01%
==========================================
Files 266 266
Lines 15573 15593 +20
==========================================
+ Hits 14887 14906 +19
- Misses 686 687 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
use official drs update docstrings
| "rcm_version", | ||
| "driver", | ||
| "domain", | ||
| "activity", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The activity is uniquely determined by the exp facet, so this only adds noise and no new information. To keep it readable, I would propose to remove this.
schlunma
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Bouwe, looks great! Just got a couple of minor questions/comments.
I tested this with the fake CMIP7 data, everything worked as expected 🚀
doc/quickstart/configure.rst
Outdated
| It can be useful to automatically add extra key-value pairs to variables or | ||
| datasets without explicitly specifying them in the recipe. | ||
| It can be useful to automatically add extra key-value pairs or | ||
| :ref:`facets <facets>` to variables or datasets without explicitly specifying |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Key-value pairs" and "facets" mean the same thing here, right? In that case it would probably be easier to just call them "facets" here. A more detailed description (including the mention of "key-value pairs") is given in the link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in 5e13989
doc/quickstart/configure.rst
Outdated
| It can be useful to automatically add extra key-value pairs or | ||
| :ref:`facets <facets>` to variables or datasets without explicitly specifying | ||
| them in the recipe. | ||
| These key-value pairs can be used for :ref:`finding data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| These key-value pairs can be used for :ref:`finding data | |
| These facets can be used for :ref:`finding data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in 5e13989
doc/recipe/overview.rst
Outdated
|
|
||
| The ``datasets`` section includes dictionaries that, via key-value pairs or | ||
| "facets", define standardized data specifications: | ||
| :ref:`facets <facets>`, define standardized data specifications: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also just call them "facets" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 5e13989
| This is always the case for CMIP7, where the branded name of the variable is | ||
| used, which is composed of the ``short_name`` followed | ||
| by an underscore and the ``branding_suffix``. For example, the facets | ||
| ``project: CMIP7, mip: atmos, short_name: tas, branding_suffix: tavg-h2m-hxy-u`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to briefly describe the different elements of the branding suffix or at least provide a link where this is described?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a link in 5e13989
| To make it easy to compare and combine data from different climate models, | ||
| reanalysis datasets, and observational datasets, ESMValCore uses the standardized | ||
| variables from the | ||
| `CMOR tables <https://github.com/ESMValGroup/ESMValCore/tree/main/esmvalcore/cmor/tables>`_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Link to the "official" CMOR tables here? A link to our copy is provided below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The official CMOR tables aren't kept in one place, so I can't link them in the same way. Would you prefer it if I removed the link? Or some other solution?
doc/reference/facets.rst
Outdated
| +------------------+-----------------------+ | ||
| | ESMValCore facet | ESGF facet | | ||
| +==================+=======================+ | ||
| | activity | activity_id | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to use monospace font for all the facets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 5e13989
| def get_variable( | ||
| self, | ||
| table: str, # noqa: ARG002 | ||
| table_name: str, # noqa: ARG002 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking, this is also a backwards-incompatible change, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I changed it so the arguments to this method match with the method of the same name on the parent class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this to the pull request description
| CMIP7: | ||
| data: | ||
| local: | ||
| type: "esmvalcore.io.local.LocalDataSource" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| type: "esmvalcore.io.local.LocalDataSource" | |
| type: esmvalcore.io.local.LocalDataSource |
I don't think it's necessary to use quotation marks here. Applies to all appearances of this, also in other config files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed in 5e13989
|
|
||
| start_date, end_date = _parse_period(timerange) | ||
| start, end = _get_start_end_date(filename) | ||
| start, end = filename.facets["timerange"].split("/") # type: ignore[union-attr] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the timerange facet point to the actual time range of a single file? As far as I am aware, this points to the requested time range given in the recipe (but maybe that changed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
esmvalcore/config-developer.yml
Outdated
| cmor_default_table_prefix: "CMIP7_" | ||
| cmor_path: "cmip7" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| cmor_default_table_prefix: "CMIP7_" | |
| cmor_path: "cmip7" |
I guess those are technically not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, the cmor_default_table_prefix appears to be unessary. cmor_path defaults to a lower case version of cmor_type:
ESMValCore/esmvalcore/cmor/table.py
Lines 201 to 202 in f0b9561
| default_path = os.path.join(install_dir, "tables", cmor_type.lower()) | |
| table_path = project.get("cmor_path", default_path) |
so that is needed because not specifying that would load the CMIP6 CMOR tables instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed cmor_default_prefix in 5e13989
|
Ah, one more thing: I think this actually properly closes #333, right? |
|
I see Manu has picked up my slack and is reviewing - many thanks, Manu! And I can have last looksee when Bouwe and Manu are happy (phew!) 😁 |
|
Thank you for reviewing @schlunma! I've addressed all your comments except #2935 (comment) because I wasn't sure what to do there. |
Description
Makes a start with #2874. Also closes #333. Support for downloading from ESGF has not been implemented yet as no CMIP7 data has been published to ESGF yet. The first CMIP7 data is expected sometime in the next few months.
Works with examples/recipe_python.yml and
tasandareacelladata created with Simple_recmorise_cmip6-cmip7.ipynb after some updates to that notebook: Simple_recmorise_cmip6-cmip7.ipynb. Data generated with the notebook is available on Levante in/work/bd0854/b381141/simulated_cmip7_data.Example
datasetsrecipe entry:A copy of
examples/recipe_python.ymlwith the fake CMIP7 dataset generated by the notebook is available under Details.Details
activityandinstituteare automatically added from theCMIP6_CV.jsonfile for CMIP6, but for CMIP7 this file does not appear to be available yet. I found ESGF/esgf-vocab#150 (comment) describing some initial work to generate it from the controlled vocabulary.Link to documentation:
Backward incompatible change
Most users will not be affected by these changes, which were introduced to keep the some function signatures easy to read.
deriveas a positional argument to theesmvalcore.cmor.table.CMIP6Info.get_variablemethod. Please usederive=Trueorderive=Falseinstead ofTrueorFalserespectively.frequencyandcheck_levelare now keyword only arguments for the functionsesmvalcore.cmor.check.cmor_check_metadata,esmvalcore.cmor.check.cmor_check_data, andesmvalcore.cmor.check.cmor_check.tableto the methodesmvalcore.cmor.table.CustomInfo.get_variablehas been renamed totable_nameso the signature of this method matches with the same method on the parent classesmvalcore.cmor.table.InfoBase.get_variable.Before you get started
Checklist
It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.
Changes are backward compatibleTo help with the number pull requests: