net: vague error message from Dial("tcp", "DNS reg-name") #14296

Closed
kirillrdy opened this issue Feb 10, 2016 · 14 comments
Comments

@kirillrdy

A simple app to reproduce the error:

package main

import (
    "log"
    "net/http"
)

func pan(err error) {
    if err != nil {
        log.Fatal(err)
    }
}

func doRequest() {
    req, err := http.NewRequest("GET", "http://example.com/", nil)
    pan(err)
    _, err = client.Do(req)
    pan(err)
}

var client http.Client

func main() {
    i := 0
    for {
        log.Println(i)
        doRequest()
        i++
    }
}

This app never closes response.Body, so every request leaves an open file descriptor behind.
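For contrast with the deliberately leaky reproducer, here is a minimal sketch of a request helper that drains and closes the body, so descriptors do not pile up. The names fetchStatus and countOK and the local httptest server are assumptions made for illustration only; they are not part of the original report.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchStatus performs a GET and always closes the response body, so the
// underlying connection (and its file descriptor) can be reused or released.
func fetchStatus(client *http.Client, url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	// Draining the body lets the transport reuse the keep-alive connection.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, err
	}
	return resp.StatusCode, nil
}

// countOK issues n requests against a local test server (an assumption made
// to keep the sketch offline) and reports how many returned 200.
func countOK(n int) (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	ok := 0
	for i := 0; i < n; i++ {
		status, err := fetchStatus(srv.Client(), srv.URL)
		if err != nil {
			return ok, err
		}
		if status == http.StatusOK {
			ok++
		}
	}
	return ok, nil
}

func main() {
	ok, err := countOK(50)
	fmt.Println(ok, err)
}
```

With the body closed on every iteration, a loop like this should not hit a low ulimit -n the way the reproducer does.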

[kirillvr@yao-local ~]$ ulimit -n 10
[kirillvr@yao-local ~]$ go version
go version go1.6rc2 linux/amd64
[kirillvr@yao-local ~]$ go run too_many_files.go 
2016/02/11 10:24:30 0
2016/02/11 10:24:30 1
2016/02/11 10:24:30 2
2016/02/11 10:24:31 3
2016/02/11 10:24:32 4
2016/02/11 10:24:32 5
2016/02/11 10:24:32 Get http://example.com/: dial tcp [2606:2800:220:1:248:1893:25c8:1946]:80: connect: network is unreachable
exit status 1
[kirillvr@yao-local ~]$ ~/go1.4/bin/go version
go version go1.4.3 linux/amd64
[kirillvr@yao-local ~]$ ~/go1.4/bin/go run too_many_files.go 
2016/02/11 10:24:49 0
2016/02/11 10:24:50 1
2016/02/11 10:24:50 2
2016/02/11 10:24:50 3
2016/02/11 10:24:51 4
2016/02/11 10:24:51 5
2016/02/11 10:24:52 6
2016/02/11 10:24:52 Get http://example.com/: dial tcp: lookup example.com: too many open files
exit status 1

Go 1.4 returns the correct error message, but Go 1.5 and 1.6 show a possibly confusing one.

@bradfitz
Contributor

Do you have IPv6 connectivity?

@kirillrdy
Author

I had IPv6.
I have disabled it temporarily using:
echo 1 > /proc/sys/net/ipv6/conf/wlp4s0/disable_ipv6

So the ifconfig output is now:

wlp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.4  netmask 255.255.255.0  broadcast 192.168.1.255
        ether 34:02:86:81:bc:35  txqueuelen 1000  (Ethernet)
        RX packets 906899  bytes 996860903 (950.6 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 546477  bytes 58817881 (56.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Still the same error with go1.6:

2016/02/11 11:53:31 0
2016/02/11 11:53:32 1
2016/02/11 11:53:33 2
2016/02/11 11:53:33 3
2016/02/11 11:53:34 4
2016/02/11 11:53:35 5
2016/02/11 11:53:35 Get http://example.com/: dial tcp [2606:2800:220:1:248:1893:25c8:1946]:80: connect: network is unreachable
exit status 1

@bradfitz
Contributor

Can you post the strace -f output of your test? Don't use "go run", though. That will be too noisy. Use go build and then strace -f ./too_many_files.

@kirillrdy
Author

Attached output of strace -f for both go1.4 and go1.6
go1.4.txt
go1.6.txt

@bradfitz
Contributor

No, the error messages are accurate.

From go1.4.txt, you hit EMFILE:

[pid 16165] open("/lib64/x86_64/libnss_myhostname.so.2", O_RDONLY|O_CLOEXEC) = -1 EMFILE (Too many open files)

And on Go 1.6, you hit:

[pid 16125] connect(9, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "2606:2800:220:1:248:1893:25c8:1946", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)

If there's a bug here, it's not a regression in error messages about EMFILE.

I'd fix your IPv6 connectivity first (either fully enable or fully disable) and then debug further.

@kirillrdy
Author

Can you confirm that the error is related to reaching the limit on the number of open files?

@bradfitz
Contributor

I can not.

If the kernel is actually returning ENETUNREACH from connect instead of EMFILE, then there's nothing we can reasonably do.

Or you're having network problems.

Actually, I think you're just having network problems. Looking at your go1.6.txt, I never see an IPv6 connect succeed, but it tries. The DNS lookup finds both IPv4 and IPv6 addresses, and the IPv4 connects succeed, wasting your file descriptors (since you never close anything, as you intended), and then eventually you see the errors from the IPv6 failures. You're probably hitting a Happy Eyeballs dial, where both IPv6 and IPv4 are tried. Normally your IPv6 fails but it's okay because the IPv4 takes over; in the final case your IPv6 fails and the IPv4 can't take over because it's out of fds.
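To tell the two failure modes apart in code rather than by reading the message, one option is errors.Is against the errno. The sketch below uses hand-built errors that only mimic the shape of a failed Dial; classify, emfileErr, and unreachErr are hypothetical names. It assumes Go 1.13 or later, where *net.OpError and *os.SyscallError can be unwrapped. Lookup failures may arrive as a *net.DNSError that, depending on the Go version, carries only text, so this check may not catch the "lookup ...: too many open files" case.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// Hand-built errors shaped like the ones a failed Dial might return.
// Real errors would come from net.Dial or http.Client.Do.
var (
	emfileErr  error = &net.OpError{Op: "dial", Net: "tcp", Err: os.NewSyscallError("socket", syscall.EMFILE)}
	unreachErr error = &net.OpError{Op: "dial", Net: "tcp", Err: os.NewSyscallError("connect", syscall.ENETUNREACH)}
)

// classify reports which of the two errnos discussed in this thread,
// if either, is wrapped somewhere inside err.
func classify(err error) string {
	switch {
	case errors.Is(err, syscall.EMFILE):
		return "out of file descriptors"
	case errors.Is(err, syscall.ENETUNREACH):
		return "network unreachable"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify(emfileErr))
	fmt.Println(classify(unreachErr))
	fmt.Println(classify(errors.New("something else")))
}
```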

In any case, you seem to be having network problems.

I don't think the error message matters too much.

But maybe @mikioh has other opinions.

@mikioh
Contributor

mikioh commented Feb 11, 2016

See https://golang.org/doc/go1.5#minor_library_changes. As the release notes put it, "The net package will now Dial hostnames by trying each IP address in order until one succeeds." From Go 1.5 on, Dial and its siblings do more work, known as "connect by name," when you pass a destination "name." This also requires more file descriptors than Go 1.4 and below.

Try the following snippet and variables:

  • ulimit -n 10; export GODEBUG=netdns=go
  • ulimit -n 10; export GODEBUG=netdns=cgo

package main

import (
        "fmt"
        "net"
)

func main() {
        for i := 0; ; i++ {
                c, err := net.Dial("tcp", "www.example.com:80")
                if err != nil {
                        fmt.Println(i, err)
                        break
                }
                fmt.Println(i, c.RemoteAddr())
                //c.Close()
        }
}

I suppose that you may get various errors depending on the circumstances. Perhaps there's room for improvement in how error values are reported when dialing with "connect by name."

@mikioh mikioh reopened this Feb 11, 2016
@mikioh mikioh changed the title from "http.Client.Do() error message regression" to "net: vague error message from Dial("tcp", "DNS reg-name")" Feb 11, 2016
@mikioh mikioh added this to the Unplanned milestone Feb 11, 2016
@mikioh
Contributor

mikioh commented Feb 11, 2016

I'll keep this issue open for someone who has a nice solution.

@bradfitz
Contributor

@mikioh, I'm confused. What would you improve? Which error values are we not reporting?

@mikioh
Contributor

mikioh commented Feb 11, 2016

What would you improve?

That's the problem. When the connect-by-name Dial faces multiple error values, which should be preferred? Or should we return nested, multiple error values? FWIW, it looks like the current implementation prefers an error from the primary destinations over one from the backup destinations.
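As a rough illustration of the "nested, multiple error values" option, the sketch below tries each address in order and, if all fail, returns every per-address error joined together. This is not how the net package is implemented; dialFunc, dialAny, fakeDial, and the sentinel errors are hypothetical, the addresses are documentation placeholders, and errors.Join assumes Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// dialFunc is a hypothetical stand-in for a single connection attempt.
type dialFunc func(addr string) error

// Sentinel errors standing in for ENETUNREACH and EMFILE.
var (
	errUnreachable  = errors.New("network is unreachable")
	errTooManyFiles = errors.New("too many open files")
)

// dialAny tries each address in order and returns nil on the first success.
// If every attempt fails, it returns all per-address errors joined together
// instead of picking just one of them.
func dialAny(addrs []string, dial dialFunc) error {
	if len(addrs) == 0 {
		return errors.New("no addresses to dial")
	}
	var errs []error
	for _, addr := range addrs {
		err := dial(addr)
		if err == nil {
			return nil
		}
		errs = append(errs, fmt.Errorf("dial %s: %w", addr, err))
	}
	return errors.Join(errs...)
}

// fakeDial simulates the situation from this thread: the IPv6 address is
// unreachable and the IPv4 attempt runs out of descriptors.
func fakeDial(addr string) error {
	if addr == "[2001:db8::1]:80" {
		return errUnreachable
	}
	return errTooManyFiles
}

func main() {
	err := dialAny([]string{"[2001:db8::1]:80", "192.0.2.1:80"}, fakeDial)
	fmt.Println(err)
	// Both underlying causes remain discoverable by the caller.
	fmt.Println(errors.Is(err, errUnreachable), errors.Is(err, errTooManyFiles))
}
```

The trade-off is a longer message in exchange for not hiding the descriptor exhaustion behind the IPv6 failure.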

@mikioh
Contributor

mikioh commented Feb 11, 2016

@kirillrdy,

Just FYI: the following tweak has no effect on the IPv6 node capability test in the net package,

echo 1 > /proc/sys/net/ipv6/conf/wlp4s0/disable_ipv6

because the test is a call like Listen("tcp6", "[::1]:0") for simplicity. Please do

echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6

instead.
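A probe in the spirit of the Listen("tcp6", "[::1]:0") call mentioned above can be sketched as follows. It is not the net package's actual test; canListen is a hypothetical helper, and the result depends entirely on the machine it runs on.

```go
package main

import (
	"fmt"
	"net"
)

// canListen reports whether a TCP listener can be opened on addr using the
// given network ("tcp4" or "tcp6"). Port 0 asks the kernel for any free port.
func canListen(network, addr string) bool {
	ln, err := net.Listen(network, addr)
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

func main() {
	fmt.Println("IPv6 loopback usable:", canListen("tcp6", "[::1]:0"))
	fmt.Println("IPv4 loopback usable:", canListen("tcp4", "127.0.0.1:0"))
}
```

Because the probe only touches the loopback address, disabling IPv6 on a single interface such as wlp4s0 would not be expected to change its answer, which matches the advice to use the "all" setting.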

@kirillrdy
Author

@mikioh Thanks for that.
Yeah, disabling IPv6 makes it return:

dial tcp: lookup example.com: too many open files

So I guess my IPv6 was never working, but when IPv4 fails (due to too many open files), the error from the IPv6 Dial is returned.

@mikioh
Contributor

mikioh commented Jun 26, 2017

Merging this into #18183.

@mikioh mikioh closed this as completed Jun 26, 2017
@golang golang locked and limited conversation to collaborators Jun 26, 2018