aiohttp does not skip response body when a HEAD request response has a body when using C extensions #10322

Open
jonathon-love opened this issue Jan 13, 2025 · 20 comments

@jonathon-love

Describe the bug

aiohttp fails on a Dropbox download that other software (e.g. curl) handles without difficulty.

To Reproduce

import aiohttp
import asyncio

async def download_file(url):
    async with aiohttp.ClientSession() as session:
        async with session.head(url, allow_redirects=True) as resp:
            pass

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))

Expected behavior

200 OK

Logs/tracebacks

Traceback (most recent call last):
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client_proto.py", line 263, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "aiohttp/_http_parser.pyx", line 558, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 1059, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/streams.py", line 671, in read
    await self._waiter
aiohttp.http_exceptions.HttpProcessingError: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "<stdin>", line 3, in download_file
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client.py", line 730, in _request
    await resp.start(conn)
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 1061, in start
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 400, message="Invalid character in chunk size:\n\n  b'\\x1f\\x8b\\x08'\n    ^", url='https://uc7f6f4bf01d26ef6c48853d3c4c.dl.dropboxusercontent.com/cd/0/get/CiHGZDofY40BymZ9BQFqKYP2T_NQroYSU7UaqEPQtQDkAYFB_GCSpSAvxX-j1cFVIVHx0qSE5QFz-cJ5cNrnweRXgVIyXn04tg3k_asEek_G5zxbLThKClANeLOqWVCZKB5o3tm7u0TEiWx3aiwqB3-A/file?dl=1'
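
A side note on the rejected bytes: `b'\x1f\x8b\x08'` matches the first three bytes of a gzip stream (magic number `1f 8b`, compression method `08`). That would fit a parser that expected a hexadecimal chunk-size line and was handed compressed body data instead. This is only an observation about the bytes, not proof of what the server sent. A quick check using the standard library:

```python
import gzip

# Bytes quoted in the "Invalid character in chunk size" error above.
rejected = b"\x1f\x8b\x08"

# Any gzip stream starts with the magic number 1f 8b and method 08 (deflate).
sample = gzip.compress(b"any payload")
starts_like_gzip = sample[:3] == rejected


def is_chunk_size_byte(byte: int) -> bool:
    """Return True if the byte is a hex digit, as a chunk-size line must start with one."""
    return chr(byte) in "0123456789abcdefABCDEF"


print(starts_like_gzip)
print(is_chunk_size_byte(rejected[0]))
```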


Python Version

$ python --version

Python 3.12.3

aiohttp Version

$ python -m pip show aiohttp

Version: 3.11.11

multidict Version

$ python -m pip show multidict

Version: 6.1.0

propcache Version

$ python -m pip show propcache

0.2.1

yarl Version

$ python -m pip show yarl

1.18.3

OS

linux, macOS

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@webknjaz
Member

Is there a redirect involved? Might be due to recoding... Note that the 400 in the traceback is raised by aiohttp's own parser (BadHttpMessage), so it does not necessarily mean the server responded with HTTP 400.

@jonathon-love
Author

yes redirects in play

% curl -I -L -v "https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1"
* Host www.dropbox.com:443 was resolved.
* IPv6: (none)
* IPv4: 162.125.83.18
*   Trying 162.125.83.18:443...
* Connected to www.dropbox.com (162.125.83.18) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=*.dropbox.com
*  start date: Nov 12 00:00:00 2024 GMT
*  expire date: Dec  8 23:59:59 2025 GMT
*  subjectAltName: host "www.dropbox.com" matched cert's "*.dropbox.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: www.dropbox.com]
* [HTTP/2] [1] [:path: /scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1 HTTP/2
> Host: www.dropbox.com
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 302 
HTTP/2 302 
< content-security-policy: script-src 'unsafe-eval' 'inline-speculation-rules' https://www.dropbox.com/static/api/ https://www.dropbox.com/pithos/* https://www.dropbox.com/page_success/ https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ https://accounts.google.com/gsi/client https://canny.io/sdk.js https://www.paypal.com/sdk/js https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ 'unsafe-inline' ; font-src https://* data: ; worker-src https://www.dropbox.com/static/serviceworker/ https://www.dropbox.com/encrypted_folder_download/service_worker.js https://www.dropbox.com/service_worker.js blob: ; report-uri https://www.dropbox.com/csp_log?policy_name=metaserver-whitelist ; child-src https://www.dropbox.com/static/serviceworker/ blob: ; frame-ancestors 'self' https://*.dropbox.com ; form-action https://docs.google.com/document/fsip/ https://docs.google.com/spreadsheets/fsip/ https://docs.google.com/presentation/fsip/ https://docs.sandbox.google.com/document/fsip/ https://docs.sandbox.google.com/spreadsheets/fsip/ https://docs.sandbox.google.com/presentation/fsip/ https://*.purple.officeapps.live-int.com https://officeapps-df.live.com https://*.officeapps-df.live.com https://officeapps.live.com https://*.officeapps.live.com https://paper.dropbox.com/cloud-docs/edit 'self' https://www.dropbox.com/ https://dl-web.dropbox.com/ https://photos.dropbox.com/ https://paper.dropbox.com/ https://showcase.dropbox.com/ https://www.hellofax.com/ https://app.hellofax.com/ https://www.hellosign.com/ https://app.hellosign.com/ https://docsend.com/ https://www.docsend.com/ https://help.dropbox.com/ https://navi.dropbox.jp/ https://a.sprig.com/ https://selfguidedlearning.dropboxbusiness.com/ https://instructorledlearning.dropboxbusiness.com/ https://sales.dropboxbusiness.com/ https://accounts.google.com/ https://api.login.yahoo.com/ https://login.yahoo.com/ https://experience.dropbox.com/ https://pal-test.adyen.com 
https://2e83413d8036243b-Dropbox-pal-live.adyenpayments.com/ https://onedrive.live.com/picker ; frame-src https://* carousel: dbapi-6: dbapi-7: dbapi-8: dropbox-client: itms-apps: itms-appss: ; style-src https://* 'unsafe-inline' 'unsafe-eval' ; default-src https://www.dropbox.com/playlist/ https://www.dropbox.com/v/s/playlist/ https://*.dropboxusercontent.com/p/hls_master_playlist/ https://*.dropboxusercontent.com/p/hls_playlist/ ; img-src https://* data: blob: ; connect-src https://* ws://127.0.0.1:*/ws blob: wss://dsimports.dropbox.com/ ; object-src 'self' https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ ; base-uri 'self' ; media-src https://* blob:
content-security-policy: script-src 'unsafe-eval' 'inline-speculation-rules' https://www.dropbox.com/static/api/ https://www.dropbox.com/pithos/* https://www.dropbox.com/page_success/ https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ https://accounts.google.com/gsi/client https://canny.io/sdk.js https://www.paypal.com/sdk/js https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ 'unsafe-inline' ; font-src https://* data: ; worker-src https://www.dropbox.com/static/serviceworker/ https://www.dropbox.com/encrypted_folder_download/service_worker.js https://www.dropbox.com/service_worker.js blob: ; report-uri https://www.dropbox.com/csp_log?policy_name=metaserver-whitelist ; child-src https://www.dropbox.com/static/serviceworker/ blob: ; frame-ancestors 'self' https://*.dropbox.com ; form-action https://docs.google.com/document/fsip/ https://docs.google.com/spreadsheets/fsip/ https://docs.google.com/presentation/fsip/ https://docs.sandbox.google.com/document/fsip/ https://docs.sandbox.google.com/spreadsheets/fsip/ https://docs.sandbox.google.com/presentation/fsip/ https://*.purple.officeapps.live-int.com https://officeapps-df.live.com https://*.officeapps-df.live.com https://officeapps.live.com https://*.officeapps.live.com https://paper.dropbox.com/cloud-docs/edit 'self' https://www.dropbox.com/ https://dl-web.dropbox.com/ https://photos.dropbox.com/ https://paper.dropbox.com/ https://showcase.dropbox.com/ https://www.hellofax.com/ https://app.hellofax.com/ https://www.hellosign.com/ https://app.hellosign.com/ https://docsend.com/ https://www.docsend.com/ https://help.dropbox.com/ https://navi.dropbox.jp/ https://a.sprig.com/ https://selfguidedlearning.dropboxbusiness.com/ https://instructorledlearning.dropboxbusiness.com/ https://sales.dropboxbusiness.com/ https://accounts.google.com/ https://api.login.yahoo.com/ https://login.yahoo.com/ https://experience.dropbox.com/ https://pal-test.adyen.com 
https://2e83413d8036243b-Dropbox-pal-live.adyenpayments.com/ https://onedrive.live.com/picker ; frame-src https://* carousel: dbapi-6: dbapi-7: dbapi-8: dropbox-client: itms-apps: itms-appss: ; style-src https://* 'unsafe-inline' 'unsafe-eval' ; default-src https://www.dropbox.com/playlist/ https://www.dropbox.com/v/s/playlist/ https://*.dropboxusercontent.com/p/hls_master_playlist/ https://*.dropboxusercontent.com/p/hls_playlist/ ; img-src https://* data: blob: ; connect-src https://* ws://127.0.0.1:*/ws blob: wss://dsimports.dropbox.com/ ; object-src 'self' https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ ; base-uri 'self' ; media-src https://* blob:
< content-type: text/html; charset=utf-8
content-type: text/html; charset=utf-8
< location: https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1#
location: https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1#
< pragma: no-cache
pragma: no-cache
< referrer-policy: strict-origin-when-cross-origin
referrer-policy: strict-origin-when-cross-origin
< set-cookie: gvc=ODUyODM5Mjc5ODY1MTk1ODYyNDEwOTI5ODU4NjU1NTIyOTU4NA==; Path=/; Expires=Sat, 12 Jan 2030 10:37:17 GMT; HttpOnly; Secure; SameSite=None
set-cookie: gvc=ODUyODM5Mjc5ODY1MTk1ODYyNDEwOTI5ODU4NjU1NTIyOTU4NA==; Path=/; Expires=Sat, 12 Jan 2030 10:37:17 GMT; HttpOnly; Secure; SameSite=None
< set-cookie: t=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Domain=dropbox.com; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=None
set-cookie: t=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Domain=dropbox.com; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=None
< set-cookie: __Host-js_csrf=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; Secure; SameSite=None
set-cookie: __Host-js_csrf=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; Secure; SameSite=None
< set-cookie: __Host-ss=4WDPeTzG94; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=Strict
set-cookie: __Host-ss=4WDPeTzG94; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=Strict
< set-cookie: locale=en; Path=/; Domain=dropbox.com; Expires=Sat, 12 Jan 2030 10:37:17 GMT
set-cookie: locale=en; Path=/; Domain=dropbox.com; Expires=Sat, 12 Jan 2030 10:37:17 GMT
< x-content-type-options: nosniff
x-content-type-options: nosniff
< x-permitted-cross-domain-policies: none
x-permitted-cross-domain-policies: none
< x-robots-tag: noindex, nofollow, noimageindex
x-robots-tag: noindex, nofollow, noimageindex
< x-xss-protection: 1; mode=block
x-xss-protection: 1; mode=block
< content-length: 17
content-length: 17
< date: Mon, 13 Jan 2025 10:37:18 GMT
date: Mon, 13 Jan 2025 10:37:18 GMT
< strict-transport-security: max-age=31536000; includeSubDomains
strict-transport-security: max-age=31536000; includeSubDomains
< server: envoy
server: envoy
< cache-control: no-cache, no-store
cache-control: no-cache, no-store
< x-dropbox-response-origin: far_remote
x-dropbox-response-origin: far_remote
< x-dropbox-request-id: bd026e63f07b461c861a0141d529217a
x-dropbox-request-id: bd026e63f07b461c861a0141d529217a
< 

* Ignoring the response-body
* Connection #0 to host www.dropbox.com left intact
* Issue another request to this URL: 'https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1'
* Host uceb23753f12718e072510c60961.dl.dropboxusercontent.com:443 was resolved.
* IPv6: (none)
* IPv4: 162.125.83.15
*   Trying 162.125.83.15:443...
* Connected to uceb23753f12718e072510c60961.dl.dropboxusercontent.com (162.125.83.15) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=*.dl.dropboxusercontent.com
*  start date: Mar 25 00:00:00 2024 GMT
*  expire date: Mar 11 23:59:59 2025 GMT
*  subjectAltName: host "uceb23753f12718e072510c60961.dl.dropboxusercontent.com" matched cert's "*.dl.dropboxusercontent.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: uceb23753f12718e072510c60961.dl.dropboxusercontent.com]
* [HTTP/2] [1] [:path: /cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1 HTTP/2
> Host: uceb23753f12718e072510c60961.dl.dropboxusercontent.com
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 302 
HTTP/2 302 
< content-type: application/json
content-type: application/json
< cache-control: no-cache
cache-control: no-cache
< content-security-policy: sandbox
content-security-policy: sandbox
< etag: 1736727306872417d
etag: 1736727306872417d
< location: /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1
location: /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1
< referrer-policy: no-referrer
referrer-policy: no-referrer
< set-cookie:  uc_session=bCkHnAr1AtATJck2rXB9JJ1dFnejeUFMPoOG3zyTqNjJTCmmuVgsU2CnJorvk8lG; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
set-cookie:  uc_session=bCkHnAr1AtATJck2rXB9JJ1dFnejeUFMPoOG3zyTqNjJTCmmuVgsU2CnJorvk8lG; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
< vary: Origin, Accept-Encoding
vary: Origin, Accept-Encoding
< x-robots-tag: noindex, nofollow, noimageindex
x-robots-tag: noindex, nofollow, noimageindex
< date: Mon, 13 Jan 2025 10:37:18 GMT
date: Mon, 13 Jan 2025 10:37:18 GMT
< server: envoy
server: envoy
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-dropbox-response-origin: far_remote
x-dropbox-response-origin: far_remote
< x-dropbox-request-id: df1e5c7fd8874d38bd1ab5b931da2536
x-dropbox-request-id: df1e5c7fd8874d38bd1ab5b931da2536
< 

* Ignoring the response-body
* Connection #1 to host uceb23753f12718e072510c60961.dl.dropboxusercontent.com left intact
* Issue another request to this URL: 'https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1'
* Found bundle for host: 0x600002705d40 [can multiplex]
* Re-using existing connection with host uceb23753f12718e072510c60961.dl.dropboxusercontent.com
* [HTTP/2] [3] OPENED stream for https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1
* [HTTP/2] [3] [:method: HEAD]
* [HTTP/2] [3] [:scheme: https]
* [HTTP/2] [3] [:authority: uceb23753f12718e072510c60961.dl.dropboxusercontent.com]
* [HTTP/2] [3] [:path: /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1]
* [HTTP/2] [3] [user-agent: curl/8.7.1]
* [HTTP/2] [3] [accept: */*]
> HEAD /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1 HTTP/2
> Host: uceb23753f12718e072510c60961.dl.dropboxusercontent.com
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 200 
HTTP/2 200 
< content-type: application/json
content-type: application/json
< accept-ranges: bytes
accept-ranges: bytes
< cache-control: max-age=60
cache-control: max-age=60
< content-disposition: attachment; filename=unspecified
content-disposition: attachment; filename=unspecified
< content-security-policy: sandbox
content-security-policy: sandbox
< content-security-policy: report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-usercontent ; sandbox allow-forms allow-scripts allow-top-navigation allow-popups
content-security-policy: report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-usercontent ; sandbox allow-forms allow-scripts allow-top-navigation allow-popups
< content-security-policy: form-action 'none' ; report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-noscript ; script-src 'none'
content-security-policy: form-action 'none' ; report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-noscript ; script-src 'none'
< etag: 1736727306872417d
etag: 1736727306872417d
< pragma: public
pragma: public
< referrer-policy: no-referrer
referrer-policy: no-referrer
< set-cookie:  uc_session=6XxzAhBTSoFY5pgkaN63ZY50rYJq2ZY8a0dsXRwxXNAX2q0CH2z5cMIEvtd0Ek08; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
set-cookie:  uc_session=6XxzAhBTSoFY5pgkaN63ZY50rYJq2ZY8a0dsXRwxXNAX2q0CH2z5cMIEvtd0Ek08; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
< vary: Origin, Accept-Encoding
vary: Origin, Accept-Encoding
< x-content-security-policy: sandbox
x-content-security-policy: sandbox
< x-content-type-options: nosniff
x-content-type-options: nosniff
< x-robots-tag: noindex, nofollow, noimageindex
x-robots-tag: noindex, nofollow, noimageindex
< x-server-response-time: 187
x-server-response-time: 187
< x-webkit-csp: sandbox
x-webkit-csp: sandbox
< date: Mon, 13 Jan 2025 10:37:18 GMT
date: Mon, 13 Jan 2025 10:37:18 GMT
< server: envoy
server: envoy
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< content-length: 2146
content-length: 2146
< x-dropbox-response-origin: far_remote
x-dropbox-response-origin: far_remote
< x-dropbox-request-id: da6721bade384c63b8a156a1f059b964
x-dropbox-request-id: da6721bade384c63b8a156a1f059b964
< 

@webknjaz
Member

Try passing requote_redirect_url=False when initializing the client session: https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp.ClientSession.requote_redirect_url.

You may also disable the redirects and walk them manually, verifying that the Location header value is quoted appropriately.
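
For anyone who wants to try the manual approach, here is a minimal sketch of walking the redirects one hop at a time. The helper names (`resolve_location`, `walk_redirects`, `main`) are made up for illustration, and `session` is assumed to behave like `aiohttp.ClientSession`. Relative `Location` values, like the `/cd/0/inline2/...` one in the curl trace above, are resolved against the current URL.

```python
import asyncio
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def resolve_location(current_url: str, location: str) -> str:
    """Resolve a possibly relative Location header against the current URL."""
    return urljoin(current_url, location)


async def walk_redirects(session, url: str, max_hops: int = 10) -> list[tuple[int, str]]:
    """Send HEAD requests hop by hop and record (status, url) for each hop.

    `session` is assumed to expose `head()` like aiohttp.ClientSession.
    """
    hops = []
    for _ in range(max_hops):
        async with session.head(url, allow_redirects=False) as resp:
            hops.append((resp.status, url))
            location = resp.headers.get("Location")
            if resp.status not in REDIRECT_STATUSES or location is None:
                return hops
            url = resolve_location(url, location)
    raise RuntimeError(f"more than {max_hops} redirects")


async def main(url: str) -> None:
    # Imported here so the helpers above stay importable without aiohttp.
    import aiohttp

    async with aiohttp.ClientSession() as session:
        for status, hop in await walk_redirects(session, url):
            print(status, hop)
```

Calling `asyncio.run(main(url))` with the URL from the report prints each hop before following it, which should show which request in the chain triggers the parser error.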

@webknjaz
Member

@jonathon-love also note that curl goes for HTTP/2 which aiohttp won't upgrade. That's another difference.

@jonathon-love
Author

thanks for your help.

adding requote_redirect_url=False doesn't help:

async def download_file(url):
    async with aiohttp.ClientSession(requote_redirect_url=False) as session:
        async with session.head(url, allow_redirects=True) as resp:
            pass

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))

and the addition of --http1.1 to curl continues to work, i.e.

curl -I -L -v --http1.1 "https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1"

one thing worth noting is that this is a .head() call, rather than a .get().

        async with session.head(url, allow_redirects=True) as resp:
            pass

if i change this to a .get() call it all works as expected.

with thanks

@Dreamsorcerer
Member

Is the issue present with AIOHTTP_NO_EXTENSIONS=1?

I'll try and reproduce in a couple of days.

@jonathon-love
Author

it works when AIOHTTP_NO_EXTENSIONS=1 is set!

@Dreamsorcerer
Member

I wonder if there's something we're missing to tell the C parser that it is a HEAD response. I suspect it's trying to parse a full response.
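
For context, HTTP/1.1 says a response to a HEAD request ends at the blank line after the headers, whatever `Content-Length` or `Transfer-Encoding` it carries. The sketch below illustrates that framing rule in simplified form. It is not aiohttp's or llhttp's code, the function names are invented for illustration, and header names are assumed to be lowercased already.

```python
from typing import Mapping


def response_has_body(request_method: str, status: int) -> bool:
    """Return False for responses that never carry a body."""
    if request_method.upper() == "HEAD":
        return False
    if 100 <= status < 200 or status in (204, 304):
        return False
    return True


def body_framing(request_method: str, status: int, headers: Mapping[str, str]) -> str:
    """Return 'none', 'chunked', 'length:N' or 'until-close' (simplified rules)."""
    if not response_has_body(request_method, status):
        # Framing headers are ignored entirely for HEAD, 1xx, 204 and 304.
        return "none"
    encodings = headers.get("transfer-encoding", "").lower().split(",")
    if encodings[-1].strip() == "chunked":
        return "chunked"
    if "content-length" in headers:
        return f"length:{int(headers['content-length'])}"
    return "until-close"


print(body_framing("HEAD", 200, {"transfer-encoding": "chunked"}))
print(body_framing("GET", 200, {"transfer-encoding": "chunked"}))
```

A parser that is not told the request method would take the second path for a HEAD response with `Transfer-Encoding: chunked` and then try to read a chunk size from whatever bytes follow.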

@bdraco
Member

bdraco commented Mar 17, 2025

I tried to reproduce this but I'm not seeing a failure

bdraco@MacBook-Pro-37 aiohttp % python -m pip show aiohttp
Name: aiohttp
Version: 3.11.11
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache-2.0
Location: /opt/homebrew/lib/python3.13/site-packages
Requires: aiohappyeyeballs, aiosignal, attrs, frozenlist, multidict, propcache, yarl
Required-by: aioharmony, aiohttp-asyncmdnsresolver, aioresponses, aioshelly, govee-api-laggat, nexia, pytest-aiohttp, python-kasa, snitun
bdraco@MacBook-Pro-37 aiohttp % python -m pip show yarl
Name: yarl
Version: 1.18.3
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /opt/homebrew/lib/python3.13/site-packages
Requires: idna, multidict, propcache
Required-by: aiohttp, aioshelly, onvif-zeep-async
bdraco@MacBook-Pro-37 aiohttp % cat down.py 
import aiohttp
import asyncio

async def download_file(url):
    async with aiohttp.ClientSession() as session:
        async with session.head(url, allow_redirects=True) as resp:
            print(resp.status)

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))
bdraco@MacBook-Pro-37 aiohttp % python3 down.py           
200
bdraco@MacBook-Pro-37 aiohttp % 

@jonathon-love
Author

still reproducible for me.

(fred-py3.13) c3113592@CCW210-9M0XYPP aiohttp % python -m pip show aiohttp   
Name: aiohttp
Version: 3.11.14
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache-2.0
Location: /Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages
Requires: aiohappyeyeballs, aiosignal, attrs, frozenlist, multidict, propcache, yarl
Required-by: 
(fred-py3.13) c3113592@CCW210-9M0XYPP aiohttp % python3 down.py           
Traceback (most recent call last):
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client_proto.py", line 264, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
                               ~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "aiohttp/_http_parser.pyx", line 558, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1059, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/streams.py", line 672, in read
    await self._waiter
aiohttp.http_exceptions.HttpProcessingError: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/c3113592/Downloads/aiohttp/down.py", line 10, in <module>
    asyncio.run(download_file(url))
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 720, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/c3113592/Downloads/aiohttp/down.py", line 6, in download_file
    async with session.head(url, allow_redirects=True) as resp:
               ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client.py", line 730, in _request
    await resp.start(conn)
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1061, in start
    raise ClientResponseError(
    ...<5 lines>...
    ) from exc
aiohttp.client_exceptions.ClientResponseError: 400, message="Invalid character in chunk size:\n\n  b'\\x1f\\x8b\\x08'\n    ^", url='https://ucbabf37dbb083a5658d25841818.dl.dropboxusercontent.com/cd/0/get/CmADiigmRvYWgWUOajNrs0T3vE5W8M9cuQ2Rd6sYEUx29W76vPs1lb1SKq4LRo3fkCIQYv3HgzR-64cmX-1SSZfCe7fI6NXlRTZBY4KIwmbTC5_nOV4-OgHzq4R8nv6l1KayTE1rPe5LqBaMAeR5A7RR/file?dl=1'

it works if i set AIOHTTP_NO_EXTENSIONS=1 (or AIOHTTP_NO_EXTENSIONS=0 for that matter)

jonathon
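
The `AIOHTTP_NO_EXTENSIONS=0` observation makes sense if aiohttp only checks whether the variable holds a non-empty string, which is how I understand its check to work (treat that as an assumption). Any non-empty value, including `"0"`, is truthy in Python. A small sketch of that behaviour, with `extensions_disabled` being a made-up helper name:

```python
import os
from typing import Mapping


def extensions_disabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Mimic a truthiness check on AIOHTTP_NO_EXTENSIONS (assumed behaviour)."""
    return bool(environ.get("AIOHTTP_NO_EXTENSIONS"))


# "0" is a non-empty string, so it counts as set.
print(extensions_disabled({"AIOHTTP_NO_EXTENSIONS": "1"}))
print(extensions_disabled({"AIOHTTP_NO_EXTENSIONS": "0"}))
print(extensions_disabled({}))
```

Under that assumption, only unsetting the variable (or setting it to an empty string) brings the C extensions back.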

@bdraco
Member

bdraco commented Mar 17, 2025

Let me try running it in another directory just in case something is leaking across in my dev setup
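
One way to check for that kind of leak is to ask Python which file it would import for a package from the current working directory. The sketch below uses only the standard library. `module_origin` is a hypothetical helper, and the idea that a local source checkout was shadowing the installed package is a guess, not something confirmed in this thread.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file Python would load for a top-level module, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run this from the directory where the script is executed, e.g. with "aiohttp".
print(module_origin("json"))
```

If the reported path points into a source checkout rather than `site-packages`, the script is not exercising the installed build.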

@bdraco
Member

bdraco commented Mar 17, 2025

I can reproduce it when I do it in another directory

bdraco@MacBook-Pro-37 NEW_DIR % python3 -m pip show aiohttp
Name: aiohttp
Version: 3.11.14
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache-2.0
Location: /opt/homebrew/lib/python3.13/site-packages
Requires: aiohappyeyeballs, aiosignal, attrs, frozenlist, multidict, propcache, yarl
Required-by: aioharmony, aiohttp-asyncmdnsresolver, aioresponses, aioshelly, govee-api-laggat, nexia, pytest-aiohttp, python-kasa, snitun
bdraco@MacBook-Pro-37 NEW_DIR % cat issue_10322.py 
import aiohttp
import asyncio

async def download_file(url):
    async with aiohttp.ClientSession() as session:
        async with session.head(url, allow_redirects=True) as resp:
            print(resp.status)

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))
bdraco@MacBook-Pro-37 NEW_DIR % python3 issue_10322.py 
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client_proto.py", line 264, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
                               ~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "aiohttp/_http_parser.pyx", line 558, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1059, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/streams.py", line 672, in read
    await self._waiter
aiohttp.http_exceptions.HttpProcessingError: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/bdraco/NEW_DIR/issue_10322.py", line 11, in <module>
    asyncio.run(download_file(url))
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/bdraco/NEW_DIR/issue_10322.py", line 6, in download_file
    async with session.head(url, allow_redirects=True) as resp:
               ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client.py", line 730, in _request
    await resp.start(conn)
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1061, in start
    raise ClientResponseError(
    ...<5 lines>...
    ) from exc
aiohttp.client_exceptions.ClientResponseError: 400, message="Invalid character in chunk size:\n\n  b'\\x1f\\x8b\\x08'\n    ^", url='https://ucaffda43d36f7f2a3e4ec1034fb.dl.dropboxusercontent.com/cd/0/get/CmCv4cA8lE1F9e5DnpD8_diG0sDToNANIf8jHKHrxvsfsdDIKgXJURqhhK3B6mUNc3IhWlovgsPQJauoHaBPAg9u7oGpMUomsSloSGS50gj7VfF_AGiUb1nd7j95qFAmh4e29T_mh9ZgtO-BGgfRO6sJ/file?dl=1'
bdraco@MacBook-Pro-37 NEW_DIR % 

@bdraco
Member

bdraco commented Mar 17, 2025

Here is the transaction

['->',
 b'HEAD /scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8ud'
 b'mlxsosn&st=2owyrsrs&dl=1 HTTP/1.1\r\nHost: www.dropbox.com\r\nAccept: */'
 b'*\r\nAccept-Encoding: gzip, deflate, br\r\nUser-Agent: Python/3.13 aiohttp/3'
 b'.11.15.dev0\r\n\r\n']
['<-',
 b'HTTP/1.1 302 Found\r\nContent-Security-Policy: frame-src https://* carouse'
 b'l: dbapi-6: dbapi-7: dbapi-8: dropbox-client: itms-apps: itms-appss: blob: ;'
 b" base-uri 'self' ; default-src https://www.dropbox.com/playlist/ https://www"
 b'.dropbox.com/v/s/playlist/ https://*.dropboxusercontent.com/p/hls_master_pla'
 b'ylist/ https://*.dropboxusercontent.com/p/hls_playlist/ ; connect-src https:'
 b"//* ws://127.0.0.1:*/ws blob: wss://dsimports.dropbox.com/ ; script-src 'uns"
 b"afe-eval' 'inline-speculation-rules' https://www.dropbox.com/static/api/ htt"
 b'ps://www.dropbox.com/pithos/ https://cfl.dropboxstatic.com/static/ https://w'
 b'ww.dropboxstatic.com/static/ https://accounts.google.com/gsi/client https://'
 b'canny.io/sdk.js https://www.paypal.com/sdk/js https://www.google.com/recaptc'
 b"ha/ https://www.gstatic.com/recaptcha/ 'unsafe-inline' ; child-src https://w"
 b'ww.dropbox.com/static/serviceworker/ blob: ; report-uri https://www.dropbox.'
 b"com/csp_log?policy_name=metaserver-whitelist ; object-src 'self' https://cfl"
 b'.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ ; worker-sr'
 b'c https://www.dropbox.com/static/serviceworker/ https://www.dropbox.com/encr'
 b'ypted_folder_download/service_worker.js https://www.dropbox.com/service_work'
 b'er.js blob: ; media-src https://* blob: ; form-action https://docs.google.co'
 b'm/document/fsip/ https://docs.google.com/spreadsheets/fsip/ https://docs.goo'
 b'gle.com/presentation/fsip/ https://docs.sandbox.google.com/document/fsip/ ht'
 b'tps://docs.sandbox.google.com/spreadsheets/fsip/ https://docs.sandbox.google'
 b'.com/presentation/fsip/ https://*.purple.officeapps.live-int.com https://off'
 b'iceapps-df.live.com https://*.officeapps-df.live.com https://officeapps.live'
 b'.com https://*.officeapps.live.com https://paper.dropbox.com/cloud-docs/edit'
 b" 'self' https://www.dropbox.com/ https://dl-web.dropbox.com/ https://photos."
 b'dropbox.com/ https://paper.dropbox.com/ https://showcase.dropbox.com/ https:'
 b'//www.hellofax.com/ https://app.hellofax.com/ https://www.hellosign.com/ htt'
 b'ps://app.hellosign.com/ https://docsend.com/ https://www.docsend.com/ https:'
 b'//help.dropbox.com/ https://navi.dropbox.jp/ https://a.sprig.com/ https://se'
 b'lfguidedlearning.dropboxbusiness.com/ https://instructorledlearning.dropboxb'
 b'usiness.com/ https://sales.dropboxbusiness.com/ https://accounts.google.com/'
 b' https://api.login.yahoo.com/ https://login.yahoo.com/ https://experience.dr'
 b'opbox.com/ https://pal-test.adyen.com https://2e83413d8036243b-Dropbox-pal-l'
 b"ive.adyenpayments.com/ https://onedrive.live.com/picker ; frame-ancestors 's"
 b"elf' https://*.dropbox.com ; style-src https://* 'unsafe-inline' 'unsafe-eva"
 b"l' ; font-src https://* data: ; img-src https://* data: blob:\r\nContent-T"
 b'ype: text/html; charset=utf-8\r\nLocation: https://uc97cdcb1e623b4ad263e12'
 b'14499.dl.dropboxusercontent.com/cd/0/get/CmDw-aWRq7E0XBjxPV96OzeiNRFgdjbRxji'
 b'rg7Y7i8Je3lRcaI4-JoPGCg0Fz4PYZ3kKHlHe4me7l0s24_HSfWMfwqHztXRsZIPSJafmlGC8Elc'
 b'mCpmI1bQIb7HetVsbHkkZpKiQlrUFLVfi08Cpi9VT/file?dl=1#\r\nPragma: no-cache\r\n'
 b'Referrer-Policy: strict-origin-when-cross-origin\r\nSet-Cookie: gvc=NzE2NT'
 b'MxMjg2MjYzNDk0OTY5MDY0NTA3NTIwNTU5MzczMzkyMTU=; Path=/; Expires=Sat, 16 Mar '
 b'2030 03:26:05 GMT; HttpOnly; Secure; SameSite=None\r\nSet-Cookie: t=siVqvo'
 b'_-BTNavohLU2HJr4L5; Path=/; Domain=dropbox.com; Expires=Tue, 17 Mar 2026 03:'
 b'26:05 GMT; HttpOnly; Secure; SameSite=None\r\nSet-Cookie: __Host-js_csrf=s'
 b'iVqvo_-BTNavohLU2HJr4L5; Path=/; Expires=Tue, 17 Mar 2026 03:26:05 GMT; Secu'
 b're; SameSite=None\r\nSet-Cookie: __Host-ss=GH9qslTt60; Path=/; Expires=Tue'
 b', 17 Mar 2026 03:26:05 GMT; HttpOnly; Secure; SameSite=Strict\r\nSet-Cooki'
 b'e: locale=en; Path=/; Domain=dropbox.com; Expires=Sat, 16 Mar 2030 03:26:05 '
 b'GMT\r\nX-Content-Type-Options: nosniff\r\nX-Permitted-Cross-Domain-Policies:'
 b' none\r\nX-Robots-Tag: noindex, nofollow, noimageindex\r\nX-Xss-Protection: '
 b'1; mode=block\r\nContent-Length: 17\r\nDate: Mon, 17 Mar 2025 03:26:06 G'
 b'MT\r\nStrict-Transport-Security: max-age=31536000; includeSubDomains\r\nServ'
 b'er: envoy\r\nCache-Control: no-cache, no-store\r\nX-Dropbox-Response-Origin:'
 b' far_remote\r\nX-Dropbox-Request-Id: d5979a6a499749ba9f0cf759344ca4a0\r'
 b'\n\r\n']
['->',
 b'HEAD /cd/0/get/CmDw-aWRq7E0XBjxPV96OzeiNRFgdjbRxjirg7Y7i8Je3lRcaI4-JoPGCg0Fz'
 b'4PYZ3kKHlHe4me7l0s24_HSfWMfwqHztXRsZIPSJafmlGC8ElcmCpmI1bQIb7HetVsbHkkZpKiQl'
 b'rUFLVfi08Cpi9VT/file?dl=1 HTTP/1.1\r\nHost: uc97cdcb1e623b4ad263e1214499.d'
 b'l.dropboxusercontent.com\r\nAccept: */*\r\nAccept-Encoding: gzip, deflate, b'
 b'r\r\nUser-Agent: Python/3.13 aiohttp/3.11.15.dev0\r\n\r\n']
['<-',
 b'HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nAccept-Ranges: byte'
 b's\r\nCache-Control: max-age=60\r\nContent-Disposition: attachment; filename='
 b'"Tooth Growth.omv"; filename*=UTF-8\'\'Tooth%20Growth.omv\r\nContent-Securit'
 b'y-Policy: sandbox\r\nPragma: public\r\nReferrer-Policy: no-referrer\r\nVar'
 b'y: Origin, Accept-Encoding\r\nX-Content-Security-Policy: sandbox\r\nX-Conten'
 b't-Type-Options: nosniff\r\nX-Robots-Tag: noindex, nofollow, noimageindex\r\n'
 b'X-Server-Response-Time: 384\r\nX-Webkit-Csp: sandbox\r\nDate: Mon, 17 Mar 20'
 b'25 03:26:06 GMT\r\nServer: envoy\r\nStrict-Transport-Security: max-age=31536'
 b'000; includeSubDomains; preload\r\nContent-Encoding: gzip\r\nX-Dropbox-Respo'
 b'nse-Origin: far_remote\r\nX-Dropbox-Request-Id: 26c0a8d34e4b478c8d8dd39bc1'
 b'f429dc\r\nTransfer-Encoding: chunked\r\n\r\n\x1f\x8b\x08\x00\x00\x00'
 b'\x00\x00\x00\x03\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00']

@bdraco
Member

bdraco commented Mar 17, 2025

So Dropbox is sending a response body in reply to a HEAD request.

curl hint: * Ignoring the response-body
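To make the stray bytes concrete, here is a small sketch that splits an abbreviated copy of the second response at the first empty line and inspects what follows. Most header fields are omitted for brevity; the trailing bytes are copied from the capture. It suggests the tail is a bare (empty) gzip stream rather than chunked framing:

```python
import gzip

# Abbreviated copy of the captured 200 response (headers trimmed,
# trailing bytes verbatim from the transaction dump).
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Encoding: gzip\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03"
    b"\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00"
)

# Everything after the first empty line is what a HEAD response must not have.
head, _, tail = raw.partition(b"\r\n\r\n")

is_gzip = tail.startswith(b"\x1f\x8b")  # gzip magic number
decoded = gzip.decompress(tail)         # payload carried by the gzip stream

print(len(tail), is_gzip, decoded)
```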

@bdraco bdraco changed the title aiohttp barfs on dropbox download with "Invalid character in chunk size" aiohttp does not skip response body when a HEAD request response has a body Mar 17, 2025
@bdraco bdraco changed the title aiohttp does not skip response body when a HEAD request response has a body aiohttp does not skip response body when a HEAD request response has a body when using C extensions Mar 17, 2025
@jonathon-love
Author

love your work!

@bdraco
Member

bdraco commented Mar 17, 2025

https://datatracker.ietf.org/doc/html/rfc9112#section-6.3-2.1 is pretty clear that it's not allowed to have a body:

Any response to a HEAD request and any response with a 1xx (Informational), 204 (No Content), or 304 (Not Modified) status code is always terminated by the first empty line after the header fields, regardless of the header fields present in the message, and thus cannot contain a message body or trailer section.
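As a rough sketch, the quoted rule can be written as a predicate. The function name `response_may_have_body` is made up for illustration, and it covers only the cases named in the quoted paragraph, not the rest of the RFC's body-length rules:

```python
def response_may_have_body(request_method: str, status: int) -> bool:
    """Return False when the quoted rule says the response ends at the
    first empty line after the header fields."""
    if request_method.upper() == "HEAD":
        return False  # any response to a HEAD request
    if 100 <= status < 200:
        return False  # 1xx (Informational)
    if status in (204, 304):
        return False  # 204 (No Content), 304 (Not Modified)
    return True


# The Dropbox response in this issue: a 200 reply to a HEAD request.
print(response_may_have_body("HEAD", 200))
print(response_may_have_body("GET", 200))
```

Note that the header fields are not an input at all: per the quoted text, the rule applies "regardless of the header fields present in the message".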

aiohttp uses llhttp for the C parser: https://github.com/nodejs/llhttp

It seems like curl is more forgiving.

I think there could be an argument made that llhttp should be as forgiving as curl and discard the unexpected body. I don't think there are any security or request smuggling implications to doing that (someone else needs to validate this statement). However, that's something the llhttp maintainers would need to decide. I'd suggest continuing at https://github.com/nodejs/llhttp/issues?q=sort%3Aupdated-desc+is%3Aissue+is%3Aopen, and if they decide to be forgiving about this violation, we can update the version of llhttp we bundle. I should note that if they decide to implement it in their lenient mode, we probably can't use it because of the security implications of turning that on by default.

@Dreamsorcerer
Member

I don't think there are any security or request smuggling implications to doing that

I'm not so sure. If one parser is reading the next request (as it should do), and another parser is reading a body, that's a request smuggling attack.

There would literally be no way to distinguish between a body which looks like a request and a new request. Therefore I think that the bytes following the headers must always be treated as a new request, which is why there is a parsing error. At best, we could silence the parsing error in lax mode and just close the connection. The Python parser should probably behave the same way (I assume it doesn't, given that your first attempt didn't error).

As curl sends and receives single requests, I assume that it doesn't use keep-alive connections, in which case it's not really a security issue for curl.
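A toy illustration of that disagreement, not a model of how llhttp or curl actually work: the hypothetical helper below returns whatever is left on the connection after a HEAD response, either ending the message at the empty line (strict) or discarding a Content-Length body (forgiving). When the extra bytes look like a message of their own, the two readings diverge:

```python
def leftover_after_head_response(stream: bytes, discard_body: bool) -> bytes:
    """Return the bytes that remain for the *next* message on the connection."""
    head, _, rest = stream.partition(b"\r\n\r\n")
    if not discard_body:
        # Strict reading: the response ended at the empty line.
        return rest
    # Forgiving reading: skip as many bytes as Content-Length announces.
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return rest[length:]


# A "body" that is itself shaped like a complete HTTP message.
injected = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
stream = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(injected) + injected

strict_view = leftover_after_head_response(stream, discard_body=False)
forgiving_view = leftover_after_head_response(stream, discard_body=True)

print(strict_view == injected)  # strict parser sees a second message
print(forgiving_view == b"")    # forgiving parser sees nothing left
```

Two hops on the same path that each pick a different reading would no longer agree on where one message ends and the next begins.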

@Dreamsorcerer
Member

Actually, the error says "Invalid character in chunk size". Could this actually be the opposite issue? That llhttp is trying to read the body when it's not supposed to. If llhttp was processing it as a new request (as it should be), then the error message should have been about an invalid response line.
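For context on why that message points at body parsing: a chunk-size line must be hexadecimal digits, and the first bytes on the wire here are the gzip magic number. A simplified sketch of a chunk-size check (the function `parse_chunk_size` is hypothetical and far less thorough than a real parser) fails on the very first byte:

```python
HEXDIGITS = b"0123456789abcdefABCDEF"


def parse_chunk_size(data: bytes) -> int:
    """Parse the size field of a chunk header such as b'1a\\r\\n...'."""
    line, _, _ = data.partition(b"\r\n")
    size_field = line.split(b";", 1)[0]  # drop any chunk extensions
    if not size_field or any(byte not in HEXDIGITS for byte in size_field):
        raise ValueError(f"Invalid character in chunk size: {size_field[:3]!r}")
    return int(size_field, 16)


print(parse_chunk_size(b"1a\r\n"))  # a well-formed chunk header

try:
    # First bytes after the headers in the captured response.
    parse_chunk_size(b"\x1f\x8b\x08\x00\x00\x00")
except ValueError as exc:
    print(exc)
```

A parser that had instead moved on to expect a new response would be validating a status line at this point, not a chunk size.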

@Dreamsorcerer
Member

I'm wondering if there's a mistake around here:

if (
    ULLONG_MAX > self._cparser.content_length > 0 or chunked or
    self._cparser.method == cparser.HTTP_CONNECT or
    (self._cparser.status_code >= 199 and
     self._cparser.content_length == 0 and
     self._read_until_eof)
):
    payload = StreamReader(
        self._protocol, timer=self._timer, loop=self._loop,
        limit=self._limit)
else:
    payload = EMPTY_PAYLOAD

self._payload = payload
if encoding is not None and self._auto_decompress:
    self._payload = DeflateBuffer(payload, encoding)

if not self._response_with_body:
    payload = EMPTY_PAYLOAD

It looks like it'd assign a StreamReader to self._payload, then add an EMPTY_PAYLOAD to the messages. I wonder if it should be set to an empty payload at the start.

@Dreamsorcerer
Member

Dreamsorcerer commented Mar 17, 2025

Created a test in #10587. Probably not going to look at it just yet (my guess at fixing it was wrong), but the C parser is behaving incorrectly and trying to parse the body.

If that is fixed, it might help out with this issue, as it may allow the response to be received and only error on the next request (which I assume is what happens with the Python parser).

Though the overall fix is obviously needed from Dropbox, who are sending invalid HTTP responses.
