-
Notifications
You must be signed in to change notification settings - Fork 14
Frame pointer clobbered in some functions which have spills #38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Let's compare the generated assembly code with GCC. First of all, if the handler (
|
If I change
then the generated assembly is still quite close (and there's no error message from
|
Things seem to turn weird when the original definition of
|
Thank you for a very complete test case! Could you share what compilation arguments you are using? I tend to always use |
I'm not really using any compilation arguments except
|
Hmmm playing around with |
Ha! Using
|
Cool! Now, as to "wrong". The error message is:
It seems hard to believe that we could have actually run out of RAM; there's just not that much code! Can you see why it's trying to read from a piece of memory that's outside of the valid area? My gut tells me that it has something to do with the fact that we aren't actually initializing the chip properly. I'm pretty sure if you look at GCC's assembly, there's some initializer code that sets up the stack pointer and friends. I thought I had some code that dealt with all that, but I appear to have never pushed it. |
And for my own curiosity, could you add the optimized assembly from GCC for the handler? Always good to see how that stacks up. |
I'm pretty sure the GCC output in #38 (comment) is the optimized version. |
That's cool; then they are the same (except the r0 / r1 preference) |
@gergoerdi: I see that you've used There's a similar problem in the AVR support of |
I'm using GCC for linking:
|
I've compiled the Rust code on my machine and linked it with the same command and it looks like it indeed does contain all the necesary runtime libraries and startup code. I can also see the stack pointer being initialised in both executables 6c: cf ef ldi r28, 0xFF ; 255
6e: d8 e0 ldi r29, 0x08 ; 8
70: de bf out 0x3e, r29 ; 62
72: cd bf out 0x3d, r28 ; 61 |
When I step through and set a breakpoint on
That seems interesting because with an SP that big, you'd need 8mb of RAM, and I was only simulating an atmega328p. I will debug the GCC generated executable and see the stack pointer when Also, I believe the actual error is hit on the very last line of
This would be the very first |
I may have been getting confused, it looks like SP always have |
I've verified that it is not any code in Here is the data flow ; load SP and subtract two from it
in r28, 61
in r29, 62
sbiw r28, 2
in r0, 63
; Store SP.
; At this point and afterwards, r29:r28 and SP is 0x08ea, which is perfect
cli
out 62, r29
out 63, r0
out 61, r28
in r24, 5
std Y+2, r24
ldd r24, Y+2
std Y+1, r24
rjmp LBB1_1
LBB1_1:
ldi r24, 32
ldd r25, Y+1
eor r25, r24
out 5, r25
rjmp LBB1_2
LBB1_2:
; This corrupts the r29:r28 register. Befor this, it stores the value of SP, which should
; be used in the `adiw` instruction below to adjust the stack.
;
; Instead of this, `0` gets popped off so that r29:r28 is 0 and our stack is corrupted below when we subtract two and put it back in SP
pop r29
pop r28
pop r25
pop r24
pop r0
out 63, r0
adiw r28, 2
; Stack write begin (expanded from 'SPWRITE' node)
in r0, 63
cli
; r29 is '32', r0 is '224', r28 is '34'
; These values are mucked up because the weird asm block corrupted our registers.
out 62, r29 # this sets the high byte to `0x20`
out 63, r0
out 61, r28 # this sets the low byte to `0x22`
; Stack write end
pop r1
pop r0
reti I have a feeling this is related to the frame pointer, and how it is placed in the EDIT: Added more annotations to the asm |
Here is an LLVM IR file that can reproduce the problem. ; ModuleID = 'interrupt.cgu-0.rs'
source_filename = "interrupt.cgu-0.rs"
target datalayout = "e-p:16:16:16-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-n8"
target triple = "avr-atmel-none"
@__rustc_debug_gdb_scripts_section__ = internal unnamed_addr constant [34 x i8] c"\01gdb_load_rust_pretty_printers.py\00", section ".debug_gdb_scripts", align 1
; Function Attrs: nounwind uwtable
define void @main() unnamed_addr #0 !dbg !5 {
start:
%tmp_ret2 = alloca i8
%tmp_ret1 = alloca i8
%tmp_ret = alloca i8
%0 = load volatile i8, i8* inttoptr (i16 36 to i8*), align 1, !dbg !10
store i8 %0, i8* %tmp_ret, !dbg !10
%1 = load i8, i8* %tmp_ret, !dbg !10
br label %bb1, !dbg !10
bb1: ; preds = %start
%2 = or i8 %1, 32, !dbg !10
store volatile i8 %2, i8* inttoptr (i16 36 to i8*), align 1, !dbg !10
br label %bb2, !dbg !10
bb2: ; preds = %bb1
%3 = load volatile i8, i8* inttoptr (i16 129 to i8*), align 1, !dbg !11
store i8 %3, i8* %tmp_ret1, !dbg !11
%4 = load i8, i8* %tmp_ret1, !dbg !11
br label %bb3, !dbg !11
bb3: ; preds = %bb2
%5 = or i8 %4, 13, !dbg !11
store volatile i8 %5, i8* inttoptr (i16 129 to i8*), align 1, !dbg !11
br label %bb4, !dbg !11
bb4: ; preds = %bb3
store volatile i16 -3036, i16* inttoptr (i16 136 to i16*), align 2, !dbg !12
br label %bb5, !dbg !12
bb5: ; preds = %bb4
%6 = load volatile i8, i8* inttoptr (i16 111 to i8*), align 1, !dbg !13
store i8 %6, i8* %tmp_ret2, !dbg !13
%7 = load i8, i8* %tmp_ret2, !dbg !13
br label %bb6, !dbg !13
bb6: ; preds = %bb5
%8 = or i8 %7, 2, !dbg !13
store volatile i8 %8, i8* inttoptr (i16 111 to i8*), align 1, !dbg !13
br label %bb7, !dbg !13
bb7: ; preds = %bb6
call void asm "SEI", ""(), !dbg !14, !srcloc !15
br label %bb8, !dbg !16
bb8: ; preds = %bb8, %bb7
br label %bb8, !dbg !16
}
; Function Attrs: nounwind uwtable
define avr_signalcc void @__vector_11() unnamed_addr #0 !dbg !17 {
start:
%tmp_ret = alloca i8
%0 = load volatile i8, i8* inttoptr (i16 37 to i8*), align 1, !dbg !18
store i8 %0, i8* %tmp_ret, !dbg !18
%1 = load i8, i8* %tmp_ret, !dbg !18
br label %bb1, !dbg !18
bb1: ; preds = %start
%2 = xor i8 %1, 32, !dbg !18
store volatile i8 %2, i8* inttoptr (i16 37 to i8*), align 1, !dbg !18
br label %bb2, !dbg !18
bb2: ; preds = %bb1
ret void, !dbg !19
}
attributes #0 = { nounwind uwtable "no-frame-pointer-elim"="true" }
!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2}
!0 = !{i32 1, !"PIE Level", i32 2}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !3, producer: "clang LLVM (rustc version 1.18.0-dev (f8982f5c8 2017-04-27))", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !4)
!3 = !DIFile(filename: "interrupt", directory: "/home/dylan/projects/builds/rust-lang/avr-rust")
!4 = !{}
!5 = distinct !DISubprogram(name: "main", linkageName: "_ZN9interrupt4mainE", scope: !7, file: !6, line: 87, type: !8, isLocal: false, isDefinition: true, scopeLine: 87, flags: DIFlagPrototyped, isOptimized: false, unit: !2, templateParams: !4, variables: !4)
!6 = !DIFile(filename: "/tmp/interrupt.rs", directory: "/home/dylan/projects/builds/rust-lang/avr-rust")
!7 = !DINamespace(name: "interrupt", scope: null, file: !6, line: 1)
!8 = !DISubroutineType(types: !9)
!9 = !{null}
!10 = !DILocation(line: 89, scope: !5)
!11 = !DILocation(line: 92, scope: !5)
!12 = !DILocation(line: 95, scope: !5)
!13 = !DILocation(line: 98, scope: !5)
!14 = !DILocation(line: 101, scope: !5)
!15 = !{i32 1}
!16 = !DILocation(line: 103, scope: !5)
!17 = distinct !DISubprogram(name: "__vector_11", linkageName: "_ZN9interrupt11__vector_11E", scope: !7, file: !6, line: 108, type: !8, isLocal: false, isDefinition: true, scopeLine: 108, flags: DIFlagPrototyped, isOptimized: false, unit: !2, templateParams: !4, variables: !4)
!18 = !DILocation(line: 109, scope: !17)
!19 = !DILocation(line: 110, scope: !17)
|
This is indeed being caused by the frame pointer. There is an This is probably caused by a hack I had to remove a while back in order to finish upstreaming the backend. @shepmaster might remember this - avr-llvm/llvm#145 It looks like we removed the code which actually rewound the register allocator (avr-llvm/llvm#226) but we left a place in the code where it assumes that the FP lives in There are two places I can see that are relevant
I will investigate to see if it's possible to just remove them, as we already decided that we could just stop using the FP. |
I'm not entirely sure what the deal is on avr-llvm/llvm#145 I make the claim
and the same day I say
I didn't explain why I changed my mind. Thinking about it now, I still don't see a way to have dynamically sized It looks like we're going to have to rebuild some kind of support for reserving the frame pointer. But at least we would've been forced to remove the hack anyway (it touched LLVM code common to all targets). |
I think I've got a solution - previously we'd set the FP if there were any spills or allocas. The reason we needed the rewinding trick is because we couldn't tell in advance if a given function has a spill. This meant that we had to hack the register allocator to work around it if we made a spill in a function. If we amend it so that we only use an FP if we have allocas. This is something we can tell in advance. |
It will be quite hard to write out a frame pointer, and I think the performance of the generated code would be worse. If we don't have a dedicated frame pointer, we'd need to scavenge a It looks like we really will have to teach LLVM to mark the frame pointer as reserved if there is a spill. I'm not entirely sure how to do it, and so I've written an email to In the meantime, we have a few different options.
|
I'd prefer not to go down this route, as it would give false hope that things are working when really they aren't.
This is more palatable as it's "just" a performance loss. If the change were easy, I'd love to apply it and see how far the compilation gets.
This seems best currently; you've already got some interesting suggestions on the thread.
An interesting alternative. If we have to do something, at least it would be contained somewhat. |
I'm keen on this too. It's very easy to enable it globally, and I think it's better that we generate correct code now and properly fix it later. I will add this to the |
When I enable the frame pointer for all functions, this specific problem is fixed. I've found another problem which causes Y to be clobbered however. The function emission code looks something like this
It looks like Y is automatically restored before the epilogue is generated. The epilogue assumes that 'Y' still contains SP, but it in actual fact contains its original value at the time the function was called. This should be an easy fix however. |
I will raise another bug for the other issue. I've just committed a patch to enable the FP for all functions to LLVM trunk |
Fix has been cherry picked from llvm-mirror/llvm@2cc0b07, everything is shiny |
I'm trying to add an interrupt handler to
TIMER1_COMPA
to flash PB5. Here's what I tried:simavr
reports that the generated executable goes wrong after running the handler twice:Standalone, full test case:
The text was updated successfully, but these errors were encountered: