-
Notifications
You must be signed in to change notification settings - Fork 52
Iteration of sea-ice observational time series appears incorrect #69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@milenaveneziani, this bug was probably introduced by fe893b6. I'll put it on my list to look into in a couple of weeks when I have time. If it's urgent, take a look at what that commit did and see if you can figure out how I might have got the length of the repeated cycle wrong. |
@milenaveneziani, I would prefer we have a bug-fix PR and then an enhancement PR. This helps keep things clean and easier to review. Is there an advantage to having a single PR that I'm missing? |
no, no worries, I am fine with 2 PR's. I was just being lazy :) |
Sorry, this has been on my list for the last week but I haven't got to it yet. I can try to fix it tonight. I suspect the problem might be that shiftT designed to actually be slightly longer than the period of the cycle:
|
ok, I will try that. One other solution that I was thinking to try is to not shift the dsshift Time at all, but simply equal it to ds.Time. I tried it and the two Times are obviously the same, but I am getting a different error at plotting time, which I was about to investigate. |
If you're working on it, I'll leave it to you. If you don't come up with a fix, let me know and I'll give it another go. |
OK, I have not been very successful, but at least now I see where the problem arises. |
Also, the branch I am using to work on this is fixReplicateCycle, in case it helps. |
@milenaveneziani, I'm using a beta0 run (20161117.beta0.A_WCYCL1850.ne30_oEC.edison) with:
and I'm not seeing this problem. When i print the dates, I see:
First, it's weird that I see ds_toreplicte starting a year earlier than you do, but maybe your To debug further, we would need to track through mpas_xarray to see what time format it thinks it's getting and why it's doing the time conversion incorrectly. Could you send me a gist (or whatever link is convenient) containing your config file? That would let me debug with exactly the same run as you. |
@xylar, first of all, here is my config file: and PS: how do you tell the script to print times in days rather than nanoseconds? |
You're right about 9 extra days, that's odd. Maybe leap years? I thought
the default was to ignore them but maybe not.
To print days, just cast to float and divide by 24.*3600.*1e9.
…On Jan 21, 2017 11:39 PM, "Milena Veneziani" ***@***.***> wrote:
@xylar <https://github.com/xylar>, first of all, here is my config file:
https://gist.github.com/milenaveneziani/7504699434c352a99b63bfacd4d1b3cb
As you can see, it now uses year_offset=1849, but indeed I was using 1850
before, so that is explained.
I have also tried the same years as yours (41 and 50) and I too get both
times ending in Z:
ds.Time.min = 1890-01-14T04:58:13.242529000Z
ds_toreplicate.Time.min = 1850-01-15T00:00:00.000000000Z
ds.Time.min - ds_toreplicate.Time.min = 1262235493242529000 nanoseconds
So it seems that we get different formats depending on what year we use,
which is really strange.. (before I was using 200 and 203).
One final comment is that, even though the formats are the same for your
choice of years, the difference is still 40 years and 9 days, which is
puzzling.
Is there something strange happening with the xarray dates?
and PS: how do you tell the script to print times in days rather than
nanoseconds?
thanks.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#69 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AD_EeGgRzi0b0nxnrS5Ve-0vUQ6FPMqaks5rUokZgaJpZM4Lc1KV>
.
|
ah, that's a good point about leap years! Model years do not have leap years. Maybe python/xarray default calendar is gregorian? There should be a way to define the calendar, right? and thanks about the tip to print days (I actually meant 'years'): I was missing the part to convert to float so I was just getting the integer year value. |
@milenaveneziani, looking into this further, it does appear that xarray currently only supports the standard Gregorian calendar (with leap years): An alternative would be to convert to Anyway, this doesn't seem like it'll be a super simple fix... |
Reading a bit more, I don't think forcing xarray to read times as I'll give this a try tomorrow. |
@xylar, can you please clarify your understanding that xarray only supports Gregorian (leap) calendars? I took a quick look at pydata/xarray#1084, but based on my quick read of http://xarray.pydata.org/en/stable/io.html?highlight=gregorian#time-units it appears that other calendars should be supported. @shoyer, is it true xarray only supports Gregorian (leap) calendars currently? If this is the case we need to edit the docs for xarray so that this is clearer. It appears this is the case because we are currently locked into Any recommendation or insights you can provide @shoyer on non-Gregorian calendar options in xarray would be really helpful. |
I have not used non-Gregorian calendars personally, but I'm pretty sure they should work at least in a minimal fashion (e.g., for loading data, but not for dedicated datetime routines). Certainly @jhamman put in a lot of work to ensure that we fall back to loading dates at netcdftime for non-standard calendars. The result should be an array with dtype=object filled with netcdftime.datetime instances, rather than dtype=datetime64[ns]. |
@pwolfram, I could be wrong but I think a lot of things we do with xarray won't work correctly with In any case, our times are being read presently as if they are Gregorian even though they are, in fact, on a 365-day calendar, presumably because MPAS output is not CF compliant and no calendar is specified. |
Someone recently added arithmetic support to |
@shoyer, great to know. I just started looking into @milenaveneziani and @pwolfram, it seems like we should make a new issue for looking into switching to |
@xylar, @pwolfram: I am looking through all of our routines where we use/process times, and I don't see a place when we actually make use of numpy.datetime. We use datetime on its own a lot, and panda.to_timedelta, and I think that's it. |
@milenaveneziani, here is one place we handle Times from xarray datasets assuming they might be You are right that we often convert to In the present case ( |
@xylar, since we are basically just extending a yearly climatology could we try a different approach for |
@pwolfram: I worry that, if we do not clearly specify the calendar we use in the model, we may run into other similar issues in the future. |
@pwolfram, I agree that we could probably come up with an alternative way of handling Part of the reason for the problem is that I think maybe this could be handled here: I'll try this out on a branch and let you know how it goes. |
My latest findings from this evening's research:
So I will keep working toward a test branch to see if this will be a workable solution for us. It seems very promising. But I think, at least for a little while, it may be a big pain because we won't be able to get the library versions we need with conda. |
@xylar: I guess I was looking at something different, ie the num2date and date2num functions within netCDF4 (http://unidata.github.io/netcdf4-python/#section7). |
@milenaveneziani, for purposes of solving this specific issue, I agree that So I think switching over to |
Based on taking another look at this thread @milenaveneziani and @xylar, I think the best solution is to use cc @shoyer for additional comment if needed. |
I don't know if this helps much, but it's pretty straightforward to inject attributes (like
|
@shoyer, that's very helpful, indeed! @pwolfram, I think this means that all our existing calls to |
Agreed @xylar, we would just need to set this up for xarray to decode. I still think it may be useful to add some testing code into xarray to make sure that the functionality we are leveraging remains unbroken for future versions. @xylar and @milenaveneziani, if you can point me to an example I can use to reproduce the error in this issue I can probably make the needed changes for the fix. |
@pwolfram, you can use any run that is long enough to show the shift clearly. The beta0 simulation is more than 250 years long, so you could choose that and timeseries_y1=200, timeseries_yr2=205. |
@pwolfram, I'm working on this in a branch, so far without luck. Here are my findings today:
I'll keep working on this, but it's not looking terribly easy so far. |
@pwolfram, are you interested in fixing |
Looking at the |
@xylar, I'm assuming were you able to get |
Note, this obviously implies we should be testing some dataset math following our custom |
@pwolfram asked:
I will comment more in #82. The basic answer is that I was not able to get the expected types from |
Yes, I agree. I will make this a priority for inclusion in #82. |
Let me jump in here and point out a few things...
In [28]: import xarray as xr
In [29]: import netCDF4 as nc4
In [30]: ds = xr.tutorial.load_dataset('rasm') # this dataset uses a noleap calenar
In [31]: ds.time.dtype
Out[31]: dtype('<M8[ns]')
In [32]: ds = xr.tutorial.load_dataset('rasm', decode_cf=False)
In [33]: ds.time.dtype
Out[33]: dtype('float64')
In [34]: ds['time'] = nc4.num2date(ds.time, ds.time.attrs['units'], ds.time.attrs['calendar'])
In [35]: ds.time.dtype
Out[35]: dtype('O')
In [36]: ds.time
Out[36]:
<xarray.DataArray 'time' (time: 36)>
array([netcdftime._netcdftime.DatetimeNoLeap(1980, 9, 16, 12, 0, 0, 0, 0, 259),
netcdftime._netcdftime.DatetimeNoLeap(1980, 10, 17, 0, 0, 0, 0, 3, 290),
... By doing this, you'll loose some of the nice parts of using the numpy datetimes but your math will work. Hopefully, some of this is useful to you all. We keep seeing issues related to calendars pop up and we're working to fix those. Of course, if anyone is keen to help work on the larger problems, there's plenty to do! |
Thanks @jhamman! This is very helpful. I now understand why what I tried didn't work as I expected. Your suggested solution is essentially what I have implemented. We have the additional complication that we sometimes (typically, actually) have a date as a string, not as a float64. It is easy for us to parse out years, months, days, hours, minutes, seconds. But it doesn't seem like NetCDFDatetimes = {
'standard': datetime.datetime,
'gregorian': datetime.datetime,
'proleptic_gregorian': netcdftime.DatetimeProlepticGregorian,
'noleap': netcdftime.DatetimeNoLeap,
'julian': netcdftime.DatetimeJulian,
'all_leap': netcdftime.DatetimeAllLeap,
'365_day': netcdftime.DatetimeNoLeap,
'366_day': netcdftime.DatetimeAllLeap,
'360_day': netcdftime.Datetime360Day}
NetCDFDatetime = NetCDFDatetimes[calendar]
...
date = NetCDFDatetime(year=year, month=month, day=day, hour=hour, minute=minute, second=second) Is there a cleverer way to do this? The only option I've come up with is making a date = datetime.datetime(year=year, month=month, day=day, hour=hour, minute=minute, second=second)
nc4.num2date(nc4.date2num(date, units, calendar)) My main beef with that approach is the unnecessary rounding and extra math that has to happen, and also that it involves cc @pwolfram |
Issue addressed by #111. Problem will completely go away when we adopt the newer MPAS-Seaice timeSeriesStats am. |
When plotting sea-ice aggregate area and volume time series, we also plot the climatological cycle of observational values. This cycle is repeated to cover the full extent of the model time series.
Something must be slightly off when setting the repeated cycle for the observational data, as the model and obs time series appear shifted by quite a bit after many cycles. For example, here is a plot of SH sea-ice area after about 200 years of one of the ACME beta0 simulations:
The text was updated successfully, but these errors were encountered: