TDS not properly updating GRIB collections with directory time partitions #857
Comments
Some notes:
in Then, thread locked, state updated. I think
|
Looks like this one didn't make the cut for v4.6.11. Any hope of a fix in the next release? |
Hi @pmspire - I've been digging around with this quite a bit, and while I have not found a single cause, I have managed to make things work on my end (only through config changes). I had to do two things:
<featureCollection featureType="GRIB2"
                   harvest="true"
                   name="Test Paul Setup"
                   path="grib/tps">
  <collection name="tps"
              spec="/Users/sarms/dev/unidata/content/data/grib_collection_46/gfs_0p25/**/.*\.f[0-9][0-9][0-9]$"
              dateFormatMark="yyyyMMdd'/'HH#/gfs\.#"
              timePartition="directory"
              olderThan="15 seconds" />
  <update startup="never" trigger="allow" />
  <tdm rewrite="test" rescan="10 * * * * ? *" />
</featureCollection>
I think I had said to remove the update element, and while the values for startup and trigger should be the defaults, I don't think they are being picked up correctly for directory partitions (still investigating this). So, to be safe, go ahead and include
My directory structure looks like this:
20170527/00:
gfs.t00z.pgrb2.0p25.f000
gfs.t00z.pgrb2.0p25.f003
20170527/06:
gfs.t06z.pgrb2.0p25.f000
gfs.t06z.pgrb2.0p25.f003
20170527/12:
gfs.t12z.pgrb2.0p25.f000
20170528/00:
gfs.t00z.pgrb2.0p25.f000
gfs.t00z.pgrb2.0p25.f003
Originally, I used a spec of:
spec="/Users/sarms/dev/unidata/content/data/grib_collection_46/gfs_0p25/**/.*pgrb2.*"
which is what you had in the example you sent me via eSupport (or close to it), but |
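For anyone comparing the two spec patterns, here is a rough way to check what the regex portion of each one accepts. It is only a sketch in Python, not the TDS's own Java matching, and it assumes the regex is applied to the bare file name. The `.gbx9` entry is a hypothetical index file sitting next to the data, added purely for illustration.

```python
import re

# Regex portions of the two spec values quoted above (the part after "**/").
NARROW = re.compile(r".*\.f[0-9][0-9][0-9]$")
BROAD = re.compile(r".*pgrb2.*")

# File names from the directory listing above, plus one hypothetical
# index file to show how the patterns differ on non-data files.
names = [
    "gfs.t00z.pgrb2.0p25.f000",
    "gfs.t00z.pgrb2.0p25.f003",
    "gfs.t00z.pgrb2.0p25.f000.gbx9",  # assumption: not taken from the thread
]


def matched(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [name for name in candidates if pattern.fullmatch(name)]


narrow_hits = matched(NARROW, names)
broad_hits = matched(BROAD, names)

print("narrow:", narrow_hits)
print("broad: ", broad_hits)
```

In this sketch the narrower pattern accepts only names ending in a three-digit forecast hour, while the broader one accepts anything containing `pgrb2`, including the hypothetical index file.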
Hi Sean. Thank you for following up on this. I can confirm that, after implementing the configuration changes you suggested, our TDS is properly updating after receiving a TDM trigger. This is a big help to us, and we're grateful for your help. |
I apologize that I took so long debugging when it was just a config issue 👎 I'm glad it is up and running for you now! |
Since none of us figured out before now that it was a matter of configuration, I suppose you might be able to make a case that there ought to be better error reporting somewhere, since all the signs we got were that things should be working. But I'll leave that to you. Thanks again. |
Sean, I'm sorry, I think I spoke too soon. I'm still having trouble. It had been a while since I looked at this, and so had forgotten the actual bad behavior: The first time I request the catalog (I'm using the web interface), I do see all the indexed files. However, when more files become available, and TDM indexes them and sends a trigger to TDS, a new catalog request does not show any change. I still have to reload Tomcat to see the newly indexed files. I'd forgotten that that was the failure mode. Now, I'm not quite sure I'm doing the right thing with
Does it really work spanning path elements? Also, the bit about "The number of characters before the # is skipped" seems problematic: In your example, For example, here's a block from my
(I used the regex form This is meant to match files like
But the need to restart Tomcat persists. Can you see a problem with the above vs your recommended changes? Apologies if I missed something. Can you verify that, for you, TDS actually serves an updated catalog if you have already requested one but new data have subsequently become available and been indexed? |
I'll take a stab at interpreting your Also, just to refresh my memory and be sure, I ssh'ed to the system running Tomcat and queried the URL I think my last question above is the crucial one: If you can have some files indexed, then request the catalog, then have some more files indexed and actually receive an updated catalog, then there's still hope in a configuration-based fix and I just have to figure out what I'm doing wrong. |
A colleague pointed me to better docs on I'm still testing but haven't yet been able to overcome the TDS update problem. |
I just did an experiment using
I TimeCoverage: Then I
But, querying TDS again, the TimeCoverage did not change. When I restarted Tomcat, though: TimeCoverage: which is correct. It seems to me that this isn't related to the directory-partitioning scheme. FWIW, the entry in
It seems like TDS is rejecting the update, or there's something wrong with the update. I tried a number of manual tests using curl, using the various update values in the "update element" section here:
with the corresponding log entries in
That last one looked promising but, alas, the reported TimeCoverage shows no change. But after restarting Tomcat, I see the correctly updated TimeCoverage. |
Hi @pmspire - it looks like it does the right thing, but only sometimes. If I add a new date/time directory with a new file, it shows up when I refresh the catalog. If I add a new file to an existing directory, it does not show up unless I restart the TDS. I've checked out the index files, and everything is being added to those index files correctly, so all of the information about the new files is there. However, it does not look like the TDS recognizes the updated directory index file, and new files added to an existing directory are not exposed. We're getting close! Thanks for working with me on this! |
Thanks -- I'm ready to do any testing that would be helpful! |
So the issue was that the grib collection index files were being cached when users visited a catalog, but that cache was not getting updated when the index files changed, so old collection objects stored in the cache were being used to construct the catalogs. I've made a PR that clears out the cache when an update has been detected, and this fixes the problem. I'll let you know when there is an artifact ready to test on your end. Although the fix is pretty straightforward, finding this issue was a beast. |
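The failure mode described above can be illustrated with a toy example. This is a minimal sketch in Python, not the actual TDS code or the change in the PR: `CollectionCache`, `load_collection`, and `index_on_disk` are hypothetical names, and the "index" is just a list of file suffixes.

```python
class CollectionCache:
    """Toy cache of collection objects keyed by collection name (hypothetical)."""

    def __init__(self, loader):
        self._loader = loader
        self._entries = {}

    def get(self, name):
        # Load once, then keep serving the cached object.
        if name not in self._entries:
            self._entries[name] = self._loader(name)
        return self._entries[name]

    def invalidate(self, name):
        # Drop the cached object so the next get() reloads from the index.
        self._entries.pop(name, None)


# Stand-in for the index files on disk.
index_on_disk = {"tps": ["f000"]}


def load_collection(name):
    # Snapshot of the index at load time.
    return list(index_on_disk[name])


cache = CollectionCache(load_collection)

first = cache.get("tps")             # first catalog request populates the cache
index_on_disk["tps"].append("f003")  # new file gets indexed
stale = cache.get("tps")             # without invalidation, the old object is served

cache.invalidate("tps")              # an update was detected: clear the entry
fresh = cache.get("tps")             # next request sees the new file

print(stale, fresh)
```

The restart-fixes-it symptom corresponds to the cache being empty after startup; clearing the entry when an update is detected gets the same effect without a restart.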
Nice work, thank you! |
Hi Sean -- I saw that the PR was merged. I would be happy to test an artifact when one is available, as you suggested. I'm still subscribed to this ticket, so if you post something here, I can start testing then. Thanks again. |
Reported in e-support ticket VHE-971307.
For a grib collection that updates using the TDM, the TDS will not update the catalog listing for directory time partitions. The TDM creates new indices for new files, and successfully sends the TDS an update trigger. The TDS receives the trigger and goes through the motions of updating the collection, but the catalog does not reflect the new data. If the TDS is restarted, the catalog reflects the new additions. Therefore, I believe the problem is isolated to the collection update code in the TDS and not the TDM.
Likely the same issue as reported in #246