Skip to content

gh-129005: Align FileIO.readall allocation #129424

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from

Conversation

cmaloney
Copy link
Contributor

@cmaloney cmaloney commented Jan 29, 2025

Both _io and _pyio now use a pre-allocated buffer of length bufsize, fill it using a os.readinto / _Py_read, and have matching "expand buffer" logic.

On my machine (Linux, Debug build) this takes: ./python -m test -M8g -uall test_largefile -m test_large_read -v from ~3.7 seconds to ~3.3 seconds

_pyio still uses 2x the memory, there are two remaining copies

  1. the bytes(result) currently copies. I'd like to either just rely on "duck typing" / bytearray is close enough to bytes, or would need to do something similar to C++ "move" semantics where the bytes could take ownership of the data buffer from the bytearray without copying... Not sure what is most Pythonic
  2. _pyio.BufferedIO._read_unlocked in the read-all case where no buffer has been allocated always does return buf[:pos] + chunk which causes another copy. Patch for that case / "if buf is length 0, just return" coming shortly.

Both now use a pre-allocated buffer of length `bufsize`, fill it using
a readinto, and have matching "expand buffer" logic.

On my machine this takes:

`./python -m test -M8g -uall test_largefile -m test_large_read -v`
from ~3.7 seconds to ~3.3 seconds
@cmaloney
Copy link
Contributor Author

cmaloney commented Jan 29, 2025

Not sure how to get the Tests / Ubuntu / build and test unbroken, they seem to be auto-cancelling fast... which isn't happening on any of my other open PRs.

@cmaloney cmaloney closed this Jan 29, 2025
@cmaloney
Copy link
Contributor Author

rebased on main and opened #129458 as a replacement that hopefully will revoid the cancelled test runs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant