-
-
Notifications
You must be signed in to change notification settings - Fork 329
Permit use of codec registry with N5Store #577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ping @jakirkham @funkey, is there anything more that needs to be done here? |
Hi @d-v-b. I brought this up on the community call Wednesday. I'll try to summarize the discussion, but @perlman and @jakirkham should feel to correct me. Thoughts in order from least work but also least value rising, to highest:
~J. |
Why should the n5 compressor logic restrict the user's options to guard against surprises, while the zarr compressor logic does not? My lossy jpeg compressor (https://github.com/d-v-b/zarr-jpeg) that inspired this PR seems just as likely to cause "surprises" whether I use the zarr or n5 format, but the vanilla zarr backend doesn't complain about it, while the n5 backend errors. So I don't understand why there should be an asymmetry in the restrictiveness in the compressor API between these two options. |
Hi Davis. Sounds like I may just be clueless. I agree with you that the behavior of the N5Store should be as close to the base zarr implementations as possible (and if we need to add API or fix an implementation, that we do that in sync) My assumption was that zarr-python would throw an exception on running into a codec that it didn't know. (A quick test of modifying a All the being said, don't get me wrong: I find "This repo is inspired by the neuroglancer "precomputed" format, which uses jpeg encoding to compress chunks of imaging data." very exciting! (link) |
Yes. @jakirkham can shed more light on the details, but I think zarr-python is looking for compressors in the Line 182 in 847e78a
The codec lookup occurs outside of n5.py, so the only thing my PR is doing (as far as I can tell) is enabling the full codec registry for n5 chunks, rather than a hand-picked list of codecs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha.
Behavior without your PR:
RuntimeError: Unknown compressor with id xblosc
Behavior with your PR:
ValueError: codec not available: 'xblosc'
So I agree this synchronizes zarr.n5.N5Store
behavior with other stores for unknown codecs (:+1:) and I'm assuming then that installing zarr_jpeg
will allow files with that codec to be read (untested).
Thanks! And I realize in retrospect that the title of the this PR was a source of confusion -- I should not have used the term "arbitrary compressors", but instead I should have referenced the codec registry. Sorry for being unclear. |
We are interested in using lossy jpeg compression for 2D / 3D uint8 data. A java implementation of this exists for n5: https://github.com/saalfeldlab/n5-jpeg, but zarr cannot decode these chunks because
jpeg
is not in the list of hard-coded compressors inzarr/n5.py
. Because this jpeg compression is specialized to the point of being a hack, it doesn't make sense to add it to the list of hard-coded compressors. Instead, this PR enables reading/writing anN5Store
with an arbitrary compressor.TODO: