runtime/netpoll: heavy epoll_wait activity on AF_LOCAL sockets we're not really waiting for #17249
Comments
This should be fine per se, because we use EPOLLET.
CentOS 7 - 3.10.0-327.36.1.el7.x86_64
EPOLLET does not guarantee that one will get exactly one event in this case; see http://cmeerw.org/blog/753.html#753 (or https://github.com/bjaspan/epoll-test for a slightly different case).
Netpoller does not rely on the fact that there will be only one notification. Any finite number of notifications should be OK.
Yes, functionally everything should be correct, but the thousands of syscalls we are seeing every second could be avoided. That's more of a performance issue, for sure.
@tpotega what exactly did you want to point out with these links? You said that exactly one event is not guaranteed, so I thought you were pointing to:
Thus my reply -- getting one or two notifications is irrelevant for the netpoller. Is there a description of why/when zillions of notifications happen?
The links were meant as just an example of (maybe) unexpected EPOLLET behavior. As to why/when notifications happen - that's a longer story. TL;DR: zillions of notifications don't appear on 3.10.0-327.28.2.el7.x86_64 and up. As for the 3.10.0-327.18.2.el7 code path:
A notification will be generated whenever buffer space is available (a new event every time, even with EPOLLET). With many packets received, expect even more than zillions of notifications. See also: http://vger.kernel.org/~davem/skb_sk.html
But now, the bomb: some recent AF_LOCAL changes backported to 3.10.0-327.28.2.el7 (and up) hide the problem.
If this is fixed in newer kernels (where "newer" means after 2013), I don't think there's much we can do in Go. Though I guess 2013 is still rather recent. /cc @rsc @aclements Should we work around this kernel bug?
I don't see any way to fix this without noticeably hurting performance for the normal case, because we will have to shuffle descriptors in and out of the
I agree with @ianlancetaylor. I don't see how to handle this without having to constantly change the state of the epoll descriptor. However, I'm a bit confused about when this was actually fixed upstream. @tpotega, you mentioned these changes were backported to 3.10.0. Do you know what upstream change they were backported from? Based on the version numbers you gave, I assume this is a RedHat kernel, but I can't figure out how to get any useful log of their backporting activity.
Amusingly, it was me who found and reported the kernel bug...
😄
@aclements and I were able to reproduce this on a machine with http://kernel.ubuntu.com/git/ubuntu/ubuntu-trusty.git/tag/?h=Ubuntu-lts-4.2.0-22.27_14.04.1
We had to change the example to
This obviously isn't ideal, but it's also a small and bounded amount of work. We think that it is not worth putting a workaround into Go for this. If the slight extra CPU is causing problems for your application, we recommend upgrading to a newer kernel.
What version of Go are you using (go version)?
go version go1.7.1 linux/amd64
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GORACE=""
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build017533435=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
Description
Go's epoll-based poller registers every socket it manages (including unix domain sockets) with EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, even sockets that have not yet been used and thus have not yet had a chance to return EAGAIN from a read or write. This differs a bit from the usage scenario suggested in the epoll(7) man page.
This might result in high netpoll activity - with thousands of epoll_wait()'s constantly returning a "ready-to-write" AF_LOCAL socket (a syslog-ng socket in my example).
How to reproduce
Connect to a busy syslog-ng unixgram socket, make netpoll wait on some other network operation - a GET on localhost:12345 in this example:
Redacted strace output
Expected behavior
It seems the poller should not add the fd's we're not really waiting for with the EPOLLIN|EPOLLOUT flags.