-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add placeholder CMIP7 MIP tables, and CMIP7_CV.json (#762) #778
Conversation
Will the variable tables in this pull request eventually have the branded variable format? Will those changes be reflected in mip-cmor-tables? |
I'll be happy to edit the python code that creates these tables to be consistent with the table structure described in #762 if you point me to where it is. |
Aside from needing
|
@mauzey1 great, I'll pull some placeholder entries across so we're starting to build out a file for CMIP7/CMOR3.10 testing - expect an updated commit to drop later today |
regarding #778 (comment), do we really expect the structure of the coordinates, formulas, and CMIP_CV.json file to be different from CMIP6? I think only if we need to associate an approx_interval with each frequency, but not if that's hard-wired within CMOR. Everything else should be the same as CMIP6, right? There will certainly be changes in content, but we have to wait for that. |
@taylor13 I think you're right, the contents of The current version of this PR now includes the changes I believe are captured in #762 - might have some final tweaks, but these are almost ready to start working with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of changes needed as indicated. A couple may deserve some discussion.
@taylor13 ok this is where I am up to - this is getting very close. Feel free to nit on what I have in the file changed in this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts;
- I'm still a little uncomfortable with the removal of an allowed set of frequencies for each variable. I'm not saying we shouldn't do this, but it does mean that any variable can be published at any frequency. It does give modelling groups more freedom to publish data, but it could lead to inconsistencies in the output of different groups. This is an item that I think warrants discussion, if only briefly, by the WIP.
- I'd be a little cautious about completing the max and min entries (fine for testing purposes) as there may (a) be a resolution dependence in certain variables and (b) be a frequency dependence (range of hourly data >> that of monthly means)
Adding some extra nits, following @taylor13 replies
|
Another section that is needed in the CV is the
|
Hi Matt,
I agree that users will be able to write a wider variety of unrequested datasets than in CIMP6. In CMIP6, any variable found in any cmor table could be written even if that variable was not requested for a given experiment or a particular portion of an experiment. And I'm sure lots of unrequested data was written. Now a user will, in addition, be able to write data at unrequested frequencies. If we are concerned that unrequested data (either from unrequested experiments or unrequested frequencies) will clutter up the archive, then we could write a bit of software whereby a user (or the publisher) could interrogate the data request and check whether a variable is in the data request (given an experiment, a time-slice, a frequency, and a region). Then the that variable would be skipped. I'm not sure who might step up and volunteer to do that,
Definitely. Given our current time-line, I agree that at most we should impose limits only on variables that are constrained physically by a max or min value (e.g., area fractions and other fractions, positive definite quantities like certain vertical fluxes and measures of quantity like mass which have a lower limit of 0, etc.). Even then, some modelers will complain because in some models you can get things like slightly negative specific humidity (but values so small they don't affect anything). If someone has the time and energy, they could also think of safe limits for ok_max_mean_abs and ok_min_mean_abs which would trap units problems in some variables. For example, for land_area_fraction, if you set those limits at 100 and 2, you would trap someone trying to report this as a fraction and not a percentage, as requested. (This test would raise a false error in a regional model place over the ocean (with no land) where the true mean land fraction is 0, independent of unit, which is less the ok_min_mean_abs values). |
Additional tweaks - I believe these are all the outstanding ones - if not please markup any additional changes in the tweak
add
|
Just replying to @matthew-mizielinski's comment above. The role of CMOR is to produce CF and project-compliant data (of which there are now quite a few projects other than CMIP), irrespective of whether this data has been requested by a project that is using the software. For e.g. data that was not included in the CMIP6 Data Request/DR was routinely produced for specific experiment(s), and made available either locally, or through the local node ESGF publication in some cases to those users. The role of the ESGF publisher is to validate that any data that is being submitted to a project complies with project needs (global attributes, known variables, known frequencies, known activities/MIPs, known experiments required to populate the index matching search facets), and this could include a check whether certain data was requested by the CMIP7 project. While most CMIP contributors use CMOR all don't, which further suggests that the publisher (rather than CMOR) is the right place to do such project-compliance checks. Ultimately, only data that is of broad interest (most likely requested in the CMIP7 DR) and produced by many groups will get replicated, and so I am less worried about your concern. I've just finalized the "test" file formats, so will merge this PR, and we can continue this dialogue about compliance and how to enforce it elsewhere. This might be a good discussion to elevate to a WIP meeting agenda, rather squirreled away in a CMOR PR chatter |
Ref #762
@mauzey1 @taylor13 @matthew-mizielinski @wolfiex ping - first pass at placeholder template files