Skip to content

proposal: add io.LimitedWriter to make more convenient to implement zero copy #46465

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jim3ma opened this issue May 31, 2021 · 21 comments
Closed

Comments

@jim3ma
Copy link

jim3ma commented May 31, 2021

Currently, io.LimitedReader make it easy in x.ReadFrom. Sometimes, we need to make code changes in y.WriteTo, because type y holds the underlaying connections we can not get in x.ReadFrom. So we need the io.LimitedWriter and adding io.LimitedWriter to match io.LimitedReader seems plausible

Use case: add http.persistConn.WriteTo to make zero copy possible

http.persistConn holds the conn, we can not get it in other packages.
Implement http.persistConn.WriteTo to call conn.WriteTo is a good way (conn.WriteTo is the next use case).

Code changes [WIP]:
jim3ma@5faae18#diff-f2f92ffe0abe8dd3c833d435c2d859d54380e8e4160af8becab6945395563cfeR1939

Use case: add net.TCPConn.WriteTo func to enable splice socket data to file

When transfer data from tcp socket to file, splice will help us with low time cost in linux.

We get 10% at least cpu time reduce, zero memory copy from kernel to user space.

image

Implement PR: #46149

About why add code changes with io.LimitedWriter:

First, I want to use io.LimitedReader to implement zero copy, but can not make changes to os.File due to cycle import.
net imports os, we can not detect r.(*net.TCPConn). The only way we can do is implement zero copy on net.TCPConn, but we need a type like io.LimitedReader to holds the count of writing bytes.

Code changes:
jim3ma@c0c9d9a#diff-0adad05e90d38731c53f2b006db41d2cdb4772b55f83bdd98486a9e2f95d1db4R54

@gopherbot gopherbot added this to the Proposal milestone May 31, 2021
@ianlancetaylor
Copy link
Contributor

I'm not sure this needs to be a proposal.

@jim3ma jim3ma changed the title proposal: net: add tcp WriteTo func to enable splice socket data to file proposal: add io.LimitedWriter to make more convenient to implement zero copy Jun 1, 2021
@jim3ma
Copy link
Author

jim3ma commented Jun 1, 2021

@ianlancetaylor I have update this issue.

Because we need add a new type io.LimiterWriter, I think proposal is appropriate.

@ianlancetaylor
Copy link
Contributor

So we need the io.LimitedWriter and adding io.LimitedWriter to match io.LimitedReader seems plausible.

I'm sorry, I do not understand this explanation.

Why do we need io.LimitedWriter? What would happen if we didn't have it?

@jim3ma
Copy link
Author

jim3ma commented Jun 2, 2021

We need the io.LimitedWriter to get the writing limited size when call WriteTo, like need the io.LimitedReader to get the reading limited size when call ReadFrom.

@ianlancetaylor
Copy link
Contributor

@jim3ma For ReadFrom you have the causality backward. ReadFrom does not use LimitedReader. Instead, if the caller happens to be using LimitedReader, ReadFrom recognizes and uses that fact.

Since LimitedWriter does not currently exist, no code will be using LimitedWriter. So the reason that ReadFrom supports LimitedReader does not exist for WriteTo. Why do we need to add it?

@jim3ma
Copy link
Author

jim3ma commented Jun 3, 2021

Example: jim3ma@5faae18#diff-f2f92ffe0abe8dd3c833d435c2d859d54380e8e4160af8becab6945395563cfeR2818

Without WriteTo, all data from bodyEOFSignal is transferring by Read. Example, when w is os.File, net.TCPConn data is copied to user space first, then to page cache, no zero copy.

With WriteTo and w is os.File, we can use the underlaying bodyEOFSignal.src with zero copy. In this case, bodyEOFSignal.src is a io.LimitedReader, and io.LimitedReader.R is a buffer which holds a net.TCPConn, zero copy is implemented in net.TCPConn.WriteTo.

We need to limit the writing bytes wrote to os.File by calling:

buffer.WriteTo(io.LimitWriter(os.File, n))

So io.LimitedWriter is an appropriate solution.

@jim3ma
Copy link
Author

jim3ma commented Jun 3, 2021

Summary:
When zero copy is implemented in ReadFrom, ReadFrom recognizes io.LimitedReader and uses that fact.
When zero copy is implemented in WriteTo, WriteTo recognizes io.LimitedWriter and uses that fact.

Why implement in x.WriteTo(y): because x holds the underlaying fd which y can not get it.

@jim3ma
Copy link
Author

jim3ma commented Jun 3, 2021

@rsc can you help us review this proposal ?
Like #45899, adding io.LimitedWriter to match io.LimitedReader seems plausible.

@svenwiltink
Copy link

Wouldn't it make more sense to add this speedup to the ReadFrom method of an *os.File. By levering the reader instead of the writer you can reuse the io.LimitedReader and don't need to add a limitedwriter.

@ianlancetaylor
Copy link
Contributor

We need to limit the writing bytes wrote to os.File by calling:

Why do you need to limit the number of bytes written to the file? Who needs to impose that limit?

If the need is for the WriteTo method to detect a LimitedReader and pass that on, then there is no need for an exported LimitedWriter API. We can use an internal API to implement that without adding a new public io.LimitedWriter API.

@svenwiltink
Copy link

Why do you need to limit the number of bytes written to the file? Who needs to impose that limit?

The timing is uncanny, but I actually have a usecase. I'm transmitting large files over a tcp connection but in a binary format containing the offset, length of data (lets call this the header) and then the data itself. For each section in the tcp stream the header is parsed. The header information is used to seek to the correct location in the target file. After that an io.LimitedReader is used in conjunction with io.Copy. The copying of the data can be offloaded to splice as no further parsing is needed until the next section of data.

@ianlancetaylor
Copy link
Contributor

Thanks. I completely understand why user code needs to use io.LimitedReader. My question is why user code would ever need to use io.LimitedWriter. We only want to add a new exported API if it is useful for user code. The need to handle io.LimitedReader with WriteTo does not require new exported API, it only requires new internal API.

@svenwiltink
Copy link

Agreed. From a users perspective I also don't see a reason to expose a new API. I think this was added by the author of this issue to make the implementation easier, but it would make more sense to make this an implementation detail of the aforementioned methods.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/324473 mentions this issue: net: initial legwork for a fast path for TCPConn to os.File writes

@svenwiltink
Copy link

@ianlancetaylor Added a very rudimentary implementation of a fast path for TCPConn to *os.File. With support for an io.LimitedReader as well. Not particularly happy with the implementation yet so all and any feedback is more than welcome. Tests and stubs for other GOOS than unix are missing as well.

Also not sure if this is the correct issue to track or if I should create a new one that supersedes this one. This being my first contribution to the go project I'm a bit lost ;)

@ianlancetaylor
Copy link
Contributor

@svenwiltink Thanks for sending the CL. In this case there is already an active CL for this work: https://golang.org/cl/319593. I believe that @jim3ma opened this issue because I questioned the need for io.LimitedWriter on that CL.

@svenwiltink
Copy link

Ah, fair enough. I'll abandon my CL then for now. I'll add my comments to the existing CL instead to not start the same conversation in two places. Thank you for your time anyways :)

@jim3ma
Copy link
Author

jim3ma commented Jun 4, 2021

Wouldn't it make more sense to add this speedup to the ReadFrom method of an *os.File. By levering the reader instead of the writer you can reuse the io.LimitedReader and don't need to add a limitedwriter.

@svenwiltink @ianlancetaylor It's easy to speedup to the ReadFrom(net.TCPConn) method of an *os.File. But, when os.File.ReadFrom(http.Response.Body), os.File could not get the underlaying connection fd. The speeding up code should implement in http.Response.Body.WriteTo. http.Response.Body.connection has no reading limit when pass to os.File, so we need the io.LimitedWriter.

@jim3ma
Copy link
Author

jim3ma commented Jun 4, 2021

We need to limit the writing bytes wrote to os.File by calling:

Why do you need to limit the number of bytes written to the file? Who needs to impose that limit?

If the need is for the WriteTo method to detect a LimitedReader and pass that on, then there is no need for an exported LimitedWriter API. We can use an internal API to implement that without adding a new public io.LimitedWriter API.

OK, an internal API is appropriate now.

@jim3ma
Copy link
Author

jim3ma commented Jun 4, 2021

@svenwiltink @ianlancetaylor
I have changed io.LimitedWriter to iointernel.LimitedWriter in https://go-review.googlesource.com/c/go/+/319593.
Can you review the updated PR?

@ianlancetaylor
Copy link
Contributor

We are in the 1.17 release freeze right now. The patch will be reviewed for 1.18. Thanks.

Closing this issue as we no longer need this new API.

@golang golang locked and limited conversation to collaborators Jun 4, 2022
@rsc rsc moved this to Declined in Proposals Aug 10, 2022
@rsc rsc added this to Proposals Aug 10, 2022
@rsc rsc removed this from Proposals Oct 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants