compress/flate: deflatefast does not deliver consistent results after reset #34121
\cc @klauspost as this issue probably affects your fork as well.
I have a change to the existing tests that will expose the bug, but I'm working on making it flexible rather than tied to the exact current implementation. I'm also getting a fix ready to suggest.
Thanks for the ping! I am about to replace it with code that shifts the table down instead of clearing it, which should make resets transparent. Let me see if I can cook up a PR with similar functionality.
Make the reset point retain the current table so compression is unaffected by resets. Fixes golang#34121
PR added.
Change https://golang.org/cl/193605 mentions this issue:
My concern with shifting the table down is that it makes compression slower. If Reset cleared the table entirely, compression would remain as fast as it is now; Reset doesn't need to be super fast, in my opinion. However, I have no objection to this style of fix if it's what you prefer.
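The trade-off being discussed (clearing the table on Reset versus shifting its stored offsets down) can be sketched roughly as follows. This is an illustrative model, not the actual deflatefast implementation: the function names, the flat `[]int32` offset table, and the shift amount are assumptions made for the example.

```go
package main

import "fmt"

// shiftAmount stands in for the per-Reset advance of cur (1<<15 in the
// issue's description); illustrative only.
const shiftAmount = 1 << 15

// clearTable models the simple fix: every Reset wipes all match history,
// so post-Reset behavior never depends on prior runs, but useful match
// state is lost and the per-Reset cost is a full table wipe.
func clearTable(table []int32) {
	for i := range table {
		table[i] = 0
	}
}

// shiftTable models the alternative fix: slide every stored offset down
// by the amount cur advanced, so surviving entries stay valid relative
// to the new cur and matches remain usable across a Reset.
func shiftTable(table []int32, shift int32) {
	for i, v := range table {
		if v <= shift {
			// Entry now points before the retained window; drop it.
			table[i] = 0
		} else {
			table[i] = v - shift
		}
	}
}

func main() {
	a := []int32{0, 40000, 70000}
	b := append([]int32(nil), a...)

	clearTable(a)
	shiftTable(b, shiftAmount)

	fmt.Println(a) // [0 0 0]       — all history gone
	fmt.Println(b) // [0 7232 37232] — offsets preserved, rebased
}
```

The slowdown concern is visible here: shifting touches every table entry with a branch and a subtraction on each Reset, whereas clearing is a plain memset-style wipe.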
Received a report of this problem from a Go 1.15 beta 1 tester. |
Not completely fixed. There is another. |
Change https://golang.org/cl/243140 mentions this issue:
in compress/flate/deflatefast.go:
Omitting here the reasons why deflatefast does what it does, and focusing just on the one bad effect:
deflatefast.go maintains a counter "cur" in the object. When cur advances past 1<<30, resetAll is called, which discards the current match table. Discarding the match table changes the behavior of further compression at that point, of course.
Normal reset of the object calls reset, which advances cur by 1<<15 (reset is also one of the places that checks whether cur is now past 1<<30, but that is not important here).
If the same object is repeatedly used for compression runs with a reset in between, then two objects with different histories, both freshly reset, will have different values for cur. This in turn means a different remaining run distance before the match table is discarded, and therefore potentially different compression behavior even for identical input.
About 32K (1<<30 / 1<<15) resets are sufficient to produce the problem, which is not absurd for a high-volume server.
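The counter mechanics described above can be modeled in a few lines. This is a minimal sketch of just the cur bookkeeping, not real compression; the constant names (`bufferReset`, `resetSkip`) and the `tableValid` flag are illustrative assumptions (the real deflatefast.go derives its threshold from math.MaxInt32).

```go
package main

import "fmt"

const (
	bufferReset = 1 << 30 // threshold past which the match table is discarded (assumed value)
	resetSkip   = 1 << 15 // amount reset advances cur by, per the issue
)

// enc models only the counter logic from the issue, not real compression.
type enc struct {
	cur        int32
	tableValid bool // stands in for the contents of the match table
}

// reset mirrors the described behavior: advance cur, and if it passes
// the threshold, discard the table (resetAll), changing subsequent
// compression behavior at that point.
func (e *enc) reset() {
	e.cur += resetSkip
	if e.cur >= bufferReset {
		e.cur = 0
		e.tableValid = false
	}
}

func main() {
	fresh := &enc{tableValid: true}
	reused := &enc{tableValid: true}
	for i := 0; i < 1000; i++ { // simulate 1000 prior compression runs
		reused.reset()
	}
	// Both encoders are "reset", but cur differs, so the reused one
	// hits the table discard sooner than the fresh one — the source
	// of the inconsistent results.
	fmt.Println(bufferReset-fresh.cur, bufferReset-reused.cur)
	// → 1073741824 1040973824
}
```

Running the reused encoder through 32768 resets drives cur to exactly 1<<30 and wipes the table, which is why roughly 32K resets suffice to expose the divergence.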