image/gif: gif decoding fails `with too much image data` #16146

odeke-em · 2016-06-22T07:15:39Z

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?
go version devel +1f44643 Wed Jun 22 00:12:55 2016 +0000 darwin/amd64
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN="/Users/emmanuelodeke/go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/emmanuelodeke/go"
GORACE=""
GOROOT="/Users/emmanuelodeke/go/src/go.googlesource.com/go"
GOTOOLDIR="/Users/emmanuelodeke/go/src/go.googlesource.com/go/pkg/tool/darwin_amd64"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/kn/s30g1srd4h50bh6sd322tppm0000gn/T/go-build038095302=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
What did you do?
Run program at https://play.golang.org/p/NqqvAkiIao with source of http://ualberta.ca/~odeke/tx.gif

$ go run NqqvAkiIao.go --source http://ualberta.ca/~odeke/tx.gif

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

What did you expect to see?
Successful decoding of a gif with no errors
What did you see instead?
gif: too much image data

/cc @montanaflynn who reported this issue first.

Code inlined

/*
  After running this gif
      https://67.media.tumblr.com//b02659b27081314a412215f4eb31dacf/tumblr_nu2x4whJgy1udf6d3o1_400.gif
  through ffmpeg, Go fails to decode the gif with error:
      `gif: too much image data`

  Command:
  $ ffmpeg -i https://67.media.tumblr.com//b02659b27081314a412215f4eb31dacf/tumblr_nu2x4whJgy1udf6d3o1_400.gif outf.gif
  $ go run main.go --source outf.gif
  - Repro URL
  http://ualberta.ca/~odeke/tx.gif

  Interestingly:
  - The outfile is 1.0MB.
  - The original file is 1.4MB.
*/
package main

import (
    "flag"
    "image/gif"
    "log"
    "net/http"
    "path/filepath"
    "strings"
)

func localAndNetBasedClient() *http.Client {
    transport := new(http.Transport)
    transport.RegisterProtocol("file", http.NewFileTransport(http.Dir("/")))
    client := new(http.Client)
    client.Transport = transport
    return client
}

var httpPrefixes = []string{"http", "https"}

func ensureFitForFetch(source string) string {
    for _, prefix := range httpPrefixes {
        if strings.HasPrefix(source, prefix) {
            return source
        }
    }

    absSource, err := filepath.Abs(source)
    if err == nil {
        source = absSource
    }
    return strings.Join([]string{"file://", source}, "/")
}

func main() {
    var source string
    flag.StringVar(&source, "source",
        "https://67.media.tumblr.com//b02659b27081314a412215f4eb31dacf/tumblr_nu2x4whJgy1udf6d3o1_400.gif",
        "the URI of the gif you want to ensure properly decodes in Go")
    flag.Parse()

    client := localAndNetBasedClient()
    source = ensureFitForFetch(source)
    res, err := client.Get(source)
    if err != nil {
        log.Fatal(err)
    }

    defer res.Body.Close()

    if _, err := gif.DecodeAll(res.Body); err != nil {
        log.Fatal(err)
    }
}

The text was updated successfully, but these errors were encountered:

ianlancetaylor · 2016-06-22T13:35:47Z

CC @nigeltao

This happens with every Go release since 1.1, so it is not a new problem.

With 1.0.3 I get gif: extra data after image.

vsiv · 2017-02-14T10:03:19Z

Any updates on this, we process thousands of images daily, and run into this everyday, albeit on a very small number of them.

Thanks.

nigeltao · 2017-02-15T00:36:35Z

No update, but it would be helpful if you could attach a small (in terms of file size) GIF that exhibits this problem. The original post linked to something weighing 1 MB but that's still kind of unwieldy to debug. The smallest 'bad' GIF you've got would be great.

vsiv · 2017-02-15T07:23:07Z

Most of our images come from giphy.com where, a typical GIF is > 1MB, so I havent been able to find one thats smaller - will post if I find one.

There is one that's quite wonky - its smaller but fails with a different message -
http://a3.fanbread.com/uploads/listicle/featured_image/65227/extra_large_cropped_wino.gif
fails with gif: frame bounds larger than image bounds

montanaflynn · 2017-02-15T19:51:29Z

@nigeltao here is one that's under 500kb and less than 15 frames: https://j.gifs.com/Z6KoKJ.gif

It seems much more prominent on larger gifs though.

horgh · 2017-02-20T00:33:19Z

I've debugged this. I have an explanation and a recommendation.

The problem gifs in question (tx.gif from @odeke-em and Z6KoKJ.gif from @montanaflynn) indeed have extra/too much data. In both cases, the extra data is a data sub-block in the image data with size 1 containing 0x00. In both cases, there is an LZW End of Information Code, then this extra sub-block, and then the Block Terminator.

In tx.gif this happens in the second image in the gif. In Z6KoKJ.gif it happens in the 11th image in the gif.

In well formed gifs, the last sub-block would not be present. It does not need to be there since we completed decompression.

Currently the decoder fully reads/decompresses the image's data, and then uses the block reader to check whether there is more. It sees there is more data (in this case, there is a block of 1 byte found) and raises this error.

If we don't decompress the data and simply look at the image data as a series of data sub-blocks, it is not extra. That is, the data sub-blocks are well formed. It is extra only when we take into account reaching the LZW End of Information code.

The question is what we should be doing. Other decoders (GIFLIB, browsers) apparently accept this. As far as the specification goes, it does not say one way or the other. In Appendix F it says "the code that terminates the LZW compressed data must appear before Block Terminator". It does not state what to do if there is data after the LZW data ends, nor that the code must be the last piece of data. Section 22 states that "[t]he sequence of indices is encoded using the LZW Algorithm [...]" so I think this data should not be treated as an index. But it is not useful and presumably is technically invalid.

I looked at what the most recent version of GIFLIB (5.1.4) does. After it fully decompresses an image, it silently reads and discards any extra data such as this. It stops when it reads the Block Terminator. You can see this in its DGifGetLine() function (the do while loop). I think doing this would work for Go's decoder too. Once we read and decompress the image, any extra data in its sub-blocks can be discarded. There should not be any, but if there are, ignore them.

As for where these problem gifs are coming from, given the transformation using ffmpeg produced one with the defect, its gif encoder may have a bug!

Note I didn't analyze @vsiv's extra_large_cropped_wino.gif. It appears corrupted to me in Firefox and is showing a different error, so possibly it is either truly corrupted or a different issue.

I wrote a patch with a possible solution. I realize this will need to be submitted differently, but I thought I would include it and get feedback here before doing so. https://github.com/horgh/go/commit/ab144b8f336c0b99eec9b416a7577a305190dbdc

Thanks for any feedback!

cespare · 2017-02-20T00:43:32Z

Hi @horgh, nice debugging work. I suggest you submit a CL. We cannot review code without a signed CLA, so a Gerrit CL is the only format for discussing patches.

bradfitz · 2017-02-20T01:17:57Z

@horgh, great debugging.

@nigeltao, thoughts?

vsiv · 2017-02-20T04:49:09Z

@horgh - thanks for the debugging!
All - In an ideal world, if chrome/safari can handle a gif without user errors, golang accomodates it as well - but i understand this can be a slippery slope, because as u pointed out, some of these gif encoders have bugs in them, and many being used in existence are older versions of those libs, so even though the lib fixes the issue, new gif's appear on the web with the buggy encoding. This is similar to forgiving html authoring errors, that browsers accomodate.

Another option is to work out an external gif decoder lib here on github for concerning users to use if needed and leave the stricter lib as part of the golang distribution. thanks.

gopherbot · 2017-02-20T05:40:35Z

CL https://golang.org/cl/37258 mentions this issue.

horgh · 2017-02-20T05:41:19Z

Thanks all! I've uploaded the possible patch to Gerrit now too: https://go-review.googlesource.com/37258 (oh I see @gopherbot beat me!)

nightlyone · 2017-02-20T22:23:18Z

Could it be limited somehow, how much stray data is allowed? Otherwise it feels like a new attack vector. What do other gif libraries do to limit it?

horgh · 2017-02-21T00:33:12Z

Good point @nightlyone.

Other gif libraries:

GIFLIB has no limit on how much extra data it reads (see the function I referenced in my first comment)
ffmpeg's decoder also has no limit as best I can tell. See its gif_read_image() call to ff_lzw_decode_tail() which states "read the garbage data until end marker is found". The while loop in the latter function reads and skips bytes until it reaches a block terminator.

That is not to say we need to use that same approach here. In the two samples mentioned in this issue, only a single byte is extra. We could be conservative and allow as little as 1 byte or some arbitrary small limit.

nigeltao · 2017-02-23T09:01:34Z

@horgh great debugging indeed.

I'd like to limit the slack to at most one extra byte. We can continue the conversation on the code review.

gopherbot · 2017-10-05T00:38:43Z

Change https://golang.org/cl/68350 mentions this issue: image/gif: make blockReader a ByteReader, harden tests

golang.org/cl/37258 was committed to fix issue #16146. This patch seemed intent to allow at most one dangling byte. But, as implemented, many more bytes may actually slip through. This is because the LZW layer creates a bufio.Reader which will itself consume data beyond the end of the LZW stream, and this isn't accounted for anywhere. This change means to avoid the allocation of the bufio.Reader by making blockReader implement io.ByteReader. Further, it adds a close() method which detects extra data in the block sequence. To avoid any regressions with poorly encoded GIFs which may have worked accidentally, there are no restrictions on how many extra bytes may exist in the final full sub-block that contained LZW data. If the end of the LZW stream happened to align with the end of a sub-block, at most one more sub-block with a length of 1 byte may exist before the block terminator. This change aims to be at least as performant as the prior implementation. But the primary gain is avoiding the allocation of a bufio.Reader per frame: name old time/op new time/op delta Decode-8 276µs ± 0% 275µs ± 2% ~ (p=0.690 n=5+5) name old speed new speed delta Decode-8 55.9MB/s ± 0% 56.3MB/s ± 2% ~ (p=0.690 n=5+5) name old alloc/op new alloc/op delta Decode-8 49.2kB ± 0% 44.8kB ± 0% -9.10% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Decode-8 269 ± 0% 267 ± 0% -0.74% (p=0.008 n=5+5) Change-Id: Iec4f9b895561ad52266313fbc73ec82c070c3349 Reviewed-on: https://go-review.googlesource.com/68350 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Nigel Tao <nigeltao@golang.org>

ianlancetaylor added this to the Go1.8 milestone Jun 22, 2016

ianlancetaylor changed the title ~~image/gif: gif decoding fails with too much data~~ image/gif: gif decoding fails with too much image data Jun 22, 2016

nigeltao self-assigned this Jun 23, 2016

quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 7, 2016

rsc modified the milestones: Go1.9, Go1.8 Oct 26, 2016

gopherbot closed this as completed in 4775b7f Mar 2, 2017

vsiv mentioned this issue Mar 31, 2017

image/gif: decoding untrusted (very large) images can cause huge memory allocations #5050

Open

golang locked and limited conversation to collaborators Oct 5, 2018

gopherbot added the FrozenDueToAge label Oct 5, 2018

rsc unassigned nigeltao Jun 23, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

image/gif: gif decoding fails `with too much image data` #16146

image/gif: gif decoding fails `with too much image data` #16146

odeke-em commented Jun 22, 2016

ianlancetaylor commented Jun 22, 2016

vsiv commented Feb 14, 2017

nigeltao commented Feb 15, 2017

vsiv commented Feb 15, 2017 •

edited

Loading

montanaflynn commented Feb 15, 2017

horgh commented Feb 20, 2017

cespare commented Feb 20, 2017

bradfitz commented Feb 20, 2017

vsiv commented Feb 20, 2017

gopherbot commented Feb 20, 2017

horgh commented Feb 20, 2017

nightlyone commented Feb 20, 2017

horgh commented Feb 21, 2017

nigeltao commented Feb 23, 2017

gopherbot commented Oct 5, 2017

image/gif: gif decoding fails with too much image data #16146

image/gif: gif decoding fails with too much image data #16146

Comments

odeke-em commented Jun 22, 2016

ianlancetaylor commented Jun 22, 2016

vsiv commented Feb 14, 2017

nigeltao commented Feb 15, 2017

vsiv commented Feb 15, 2017 • edited Loading

montanaflynn commented Feb 15, 2017

horgh commented Feb 20, 2017

cespare commented Feb 20, 2017

bradfitz commented Feb 20, 2017

vsiv commented Feb 20, 2017

gopherbot commented Feb 20, 2017

horgh commented Feb 20, 2017

nightlyone commented Feb 20, 2017

horgh commented Feb 21, 2017

nigeltao commented Feb 23, 2017

gopherbot commented Oct 5, 2017

image/gif: gif decoding fails `with too much image data` #16146

image/gif: gif decoding fails `with too much image data` #16146

vsiv commented Feb 15, 2017 •

edited

Loading