runtime: bad frame pointer during panic during duffcopy #73748

Open
nsrip-dd opened this issue May 16, 2025 · 2 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@nsrip-dd
Contributor

nsrip-dd commented May 16, 2025

Go version

go version go1.23.8 linux/amd64

Output of go env in your module/workspace:

N/A

What did you do?

This is split off from #73664.

I ran the following program (Playground link):

package main

import (
	"context"
	"io"
	"runtime/trace"
)

type ints [32]int

func main() {
	trace.Start(io.Discard)
	defer trace.Stop()
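	defer func() {
		recover()
		trace.Log(context.Background(), "a", "b") // records a stack via frame pointer unwinding, mid-panic
	}()

	dereference(nil) // nil pointer: the copy inside dereference faults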
}

//go:noinline
func dereference(x *ints) ints {
	return *x // with optimizations on, this 256-byte copy is a call into runtime.duffcopy
}

Instead of calling trace.Log, we could block with the block profiler enabled. What matters is that something records a traceback using frame pointer unwinding while the panic is in progress.

What did you see happen?

Crashed during frame pointer unwinding on amd64:

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x45aaae]

goroutine 1 gp=0xc0000061c0 m=0 mp=0x535780 [running]:
runtime.throw({0x4a5e93?, 0x0?})
        /home/bits/sdk/go1.23.8/src/runtime/panic.go:1073 +0x48 fp=0xc00009c710 sp=0xc00009c6e0 pc=0x465588
runtime.sigpanic()
        /home/bits/sdk/go1.23.8/src/runtime/signal_unix.go:901 +0x3c9 fp=0xc00009c770 sp=0xc00009c710 pc=0x466849
runtime.fpTracebackPCs(...)
        /home/bits/sdk/go1.23.8/src/runtime/tracestack.go:258
runtime.traceStack(0x535360?, 0x1?, 0x1)
        /home/bits/sdk/go1.23.8/src/runtime/tracestack.go:93 +0x26e fp=0xc00009cbd8 sp=0xc00009c770 pc=0x45aaae
runtime.traceLocker.stack(...)
        /home/bits/sdk/go1.23.8/src/runtime/traceevent.go:176
runtime/trace.userLog(0x0, {0x4c1d28, 0x1}, {0x4c1130, 0x1})
        /home/bits/sdk/go1.23.8/src/runtime/traceruntime.go:699 +0x145 fp=0xc00009cc80 sp=0xc00009cbd8 pc=0x468c45
runtime/trace.Log({0x4c21e8?, 0x554260?}, {0x4c1d28, 0x1}, {0x4c1130, 0x1})
        /home/bits/sdk/go1.23.8/src/runtime/trace/annotation.go:97 +0x72 fp=0xc00009ccb8 sp=0xc00009cc80 pc=0x482872
main.main.func1()
        /home/bits/sandbox/go/panic-fp-crash/main.go:16 +0x45 fp=0xc00009ccf8 sp=0xc00009ccb8 pc=0x482d45
panic({0x48e700?, 0x52fdd0?})
        /home/bits/sdk/go1.23.8/src/runtime/panic.go:791 +0x132 fp=0xc00009cda8 sp=0xc00009ccf8 pc=0x465292
runtime.panicmem(...)
        /home/bits/sdk/go1.23.8/src/runtime/panic.go:262
runtime.sigpanic()
        /home/bits/sdk/go1.23.8/src/runtime/signal_unix.go:917 +0x359 fp=0xc00009ce08 sp=0xc00009cda8 pc=0x4667d9
runtime.duffcopy()
        /home/bits/sdk/go1.23.8/src/runtime/duff_amd64.s:347 +0x2a0 fp=0xc00009ce10 sp=0xc00009ce08 pc=0x46c940
main.dereference(_)
        /home/bits/sandbox/go/panic-fp-crash/main.go:24 +0x33 fp=0xc00009ce20 sp=0xc00009ce10 pc=0x482cf3
main.main()
        /home/bits/sandbox/go/panic-fp-crash/main.go:19 +0x72 fp=0xc00009cf50 sp=0xc00009ce20 pc=0x482c72
runtime.main()
        /home/bits/sdk/go1.23.8/src/runtime/proc.go:272 +0x28b fp=0xc00009cfe0 sp=0xc00009cf50 pc=0x43446b
runtime.goexit({})
        /home/bits/sdk/go1.23.8/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00009cfe8 sp=0xc00009cfe0 pc=0x46bfc1

On arm64, the program doesn't crash, but the stack trace for the trace.Log call is wrong:

M=3451837 P=1 G=1 Log Time=5014804548414976 Task=0 Category="a" Message="b"
Stack=
        main.main.func1 @ 0x951bf
                /home/ec2-user/sandbox/go/panic-fp-crash/main.go:16
        runtime.gopanic @ 0x76d53
                /home/ec2-user/sdk/go1.23.8/src/runtime/panic.go:791
        runtime.panicmem @ 0x7837f
                /home/ec2-user/sdk/go1.23.8/src/runtime/panic.go:262
        runtime.sigpanic @ 0x7834c
                /home/ec2-user/sdk/go1.23.8/src/runtime/signal_unix.go:917
        runtime.duffcopy @ 0x7e17f
                /home/ec2-user/sdk/go1.23.8/src/runtime/duff_arm64.s:217
        main.dereference @ 0x95147
                /home/ec2-user/sandbox/go/panic-fp-crash/main.go:24
        main.dereference @ 0x95147
                /home/ec2-user/sandbox/go/panic-fp-crash/main.go:24
        main.dereference @ 0x95147
                /home/ec2-user/sandbox/go/panic-fp-crash/main.go:24
        main.dereference @ 0x95147
                /home/ec2-user/sandbox/go/panic-fp-crash/main.go:24
        main.dereference @ 0x95147
                /home/ec2-user/sandbox/go/panic-fp-crash/main.go:24
        [ ... repeated > 100 times ... ]

Note that the program neither crashes nor produces an invalid stack trace when optimizations are disabled (-gcflags=-N).

What did you expect to see?

No crash. This also reproduces with Go 1.24.3.

I believe the issue is due to the way the frame pointer is set up around runtime.duffcopy calls. Because each call jumps into the middle of the function, the callee cannot set up a frame of its own, so the compiler builds a pseudo-frame around the call instead. On both amd64 and arm64, the compiler saves a frame pointer below the current stack frame without changing the stack pointer:

amd64 example:

  main.go:24            0x482ce4                48896c24f0              MOVQ BP, -0x10(SP)
  main.go:24            0x482ce9                488d6c24f0              LEAQ -0x10(SP), BP
  main.go:24            0x482cee                e84d9cfeff              CALL 0x46c940 // <-- this calls duffcopy
  main.go:24            0x482cf3                488b6d00                MOVQ 0(BP), BP

arm64 example:

  main.go:24            0x9513c                 a93eeffd                STP (R29, R27), -24(RSP)
  main.go:24            0x95140                 d10063fd                SUB $24, RSP, R29
  main.go:24            0x95144                 97ffa40f                CALL -23537(PC) // <-- this calls duffcopy
  main.go:24            0x95148                 d10023fd                SUB $8, RSP, R29

If the copy panics, for example because of a nil pointer dereference, the call to the panic function is injected into the goroutine on its current stack. The injected call's frame then clobbers the frame pointer saved for the duffcopy call, because that slot sits below the current stack frame. On arm64 we end up with a frame pointer loop, and on amd64 we end up with junk.
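
To make the clobbering concrete, here is a toy model of the stack, not the runtime's actual code. It assumes an amd64-like layout where a frame record holds the caller's frame pointer at [fp] and a return address at [fp+8], and it assumes the injected call pushes a resume PC just below the faulting SP, the way a CALL instruction would. All addresses, PC values, and the names unwind, pseudoFrame, and injectCall are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// stack is a toy memory: address -> 8-byte word. Only written words exist.
type stack map[uint64]uint64

// unwind follows frame records: mem[fp] is the caller's frame pointer and
// mem[fp+8] is the return address. It stops at fp == 0, at an address the toy
// memory never wrote (where a real unwinder could fault), or at maxDepth.
func unwind(mem stack, fp uint64, maxDepth int) ([]uint64, string) {
	var pcs []uint64
	for len(pcs) < maxDepth {
		if fp == 0 {
			return pcs, "ok"
		}
		next, okFP := mem[fp]
		pc, okPC := mem[fp+8]
		if !okFP || !okPC {
			return pcs, "bad frame pointer"
		}
		pcs = append(pcs, pc)
		fp = next
	}
	return pcs, "depth limit reached"
}

// Hypothetical addresses and PCs (assumptions, not taken from the issue).
const (
	callerFP    = 0x1040        // frame record of the function doing the copy
	callerSP    = 0x1010        // its stack pointer at the call site
	pseudoFP    = callerSP - 16 // record stored below SP; SP itself is not moved
	retToParent = 0xAAAA        // return address into the caller's caller
	retToCaller = 0xBBBB        // return address pushed by the CALL into the copy helper
	faultPC     = 0xCCCC        // PC inside the copy helper where the fault occurs
)

// pseudoFrame builds the stack as it looks while the copy helper is running.
func pseudoFrame() stack {
	return stack{
		callerFP:     0, // end of the chain
		callerFP + 8: retToParent,
		pseudoFP:     callerFP,    // frame pointer saved below SP
		pseudoFP + 8: retToCaller, // pushed by CALL; SP is now pseudoFP+8
	}
}

// injectCall models a call injected at the faulting SP: the resume PC is
// pushed at sp-8, then the callee's prologue pushes the current frame pointer
// below that. It returns the callee's frame pointer.
func injectCall(mem stack, sp, fp, resumePC uint64) uint64 {
	sp -= 8
	mem[sp] = resumePC // lands on pseudoFP when sp == pseudoFP+8
	sp -= 8
	mem[sp] = fp
	return sp
}

func main() {
	pcs, status := unwind(pseudoFrame(), pseudoFP, 16)
	fmt.Printf("no fault:    %#x %s\n", pcs, status)

	clobbered := pseudoFrame()
	fp := injectCall(clobbered, pseudoFP+8, pseudoFP, faultPC)
	pcs, status = unwind(clobbered, fp, 16)
	fmt.Printf("after fault: %#x %s\n", pcs, status)

	// Assumption: the overwritten slot happens to point back at its own record.
	looped := pseudoFrame()
	looped[pseudoFP] = pseudoFP
	pcs, status = unwind(looped, pseudoFP, 16)
	fmt.Printf("loop:        %d frames, %s\n", len(pcs), status)
}
```

In this model the unclobbered chain unwinds cleanly through both return addresses. After the injected call, the slot that held the saved frame pointer holds a PC, so the walk follows a non-address and loses the caller's parent. The self-referencing case repeats the same return address until the depth limit, which resembles the repeated main.dereference frames in the arm64 trace.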

https://go.dev/cl/672996 would fix this by removing the code that saves a frame pointer. Alternatively, we could set up an actual stack frame, but that could be wasteful?

cc @randall77

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 16, 2025
@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label May 16, 2025
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 16, 2025
@mknyszek mknyszek added this to the Backlog milestone May 16, 2025
@mknyszek
Contributor

CC @golang/runtime

4 participants