Skip to content

Invalid assembly generated: "Expected either Y or Z register" #58

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gergoerdi opened this issue May 14, 2017 · 19 comments
Closed

Invalid assembly generated: "Expected either Y or Z register" #58

gergoerdi opened this issue May 14, 2017 · 19 comments
Labels
A-llvm Affects the LLVM AVR backend has-reduced-testcase A small LLVM IR file exists that demonstrates the problem

Comments

@gergoerdi
Copy link
Collaborator

gergoerdi commented May 14, 2017

target triple = "avr-atmel-none"

%"sync::atomic::AtomicI16" = type { %"cell::UnsafeCell<i16>", [0 x i8] }
%"cell::UnsafeCell<i16>" = type { i16, [0 x i8] }

define i8 @"f"(%"sync::atomic::AtomicI16"* nocapture readonly dereferenceable(2)) unnamed_addr #3 {
start:
  %1 = getelementptr inbounds %"sync::atomic::AtomicI16", %"sync::atomic::AtomicI16"* %0, i16 0, i32 0, i32 0
  %2 = load atomic i16, i16* %1 seq_cst, align 2
  ret i8 0
}

Compiling this into .s results in

target triple = "avr-atmel-none"

%"sync::atomic::AtomicI16" = type { %"cell::UnsafeCell<i16>", [0 x i8] }
%"cell::UnsafeCell<i16>" = type { i16, [0 x i8] }

define i8 @"f"(%"sync::atomic::AtomicI16"* nocapture readonly dereferenceable(2)) unnamed_addr #3 {
start:
  %1 = getelementptr inbounds %"sync::atomic::AtomicI16", %"sync::atomic::AtomicI16"* %0, i16 0, i32 0, i32 0
  %2 = load atomic i16, i16* %1 seq_cst, align 2
  ret i8 0
}

but turns out this is invalid (I guess the ld or the ldd with X?); and so trying to generate machine code instead fails:

Expected either Y or Z register
UNREACHABLE executed at /home/cactus/prog/rust/rust-avr/rust/src/llvm/lib/Target/AVR/MCTargetDesc/AVRMCCodeEmitter.cpp:145!
#0 0x0000000001721085 llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x1721085)
#1 0x000000000171f09e llvm::sys::RunSignalHandlers() (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x171f09e)
#2 0x000000000171f202 SignalHandler(int) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x171f202)
#3 0x00007f6cda826330 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10330)
#4 0x00007f6cd99e2c37 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x36c37)
#5 0x00007f6cd99e6028 abort (/lib/x86_64-linux-gnu/libc.so.6+0x3a028)
#6 0x00000000016d4825 (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x16d4825)
#7 0x0000000000ea0730 llvm::AVRMCCodeEmitter::encodeMemri(llvm::MCInst const&, unsigned int, llvm::SmallVectorImpl<llvm::MCFixup>&, llvm::MCSubtargetInfo const&) const (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0xea0730)
#8 0x0000000000ea17e3 llvm::AVRMCCodeEmitter::getBinaryCodeForInstr(llvm::MCInst const&, llvm::SmallVectorImpl<llvm::MCFixup>&, llvm::MCSubtargetInfo const&) const (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0xea17e3)
#9 0x0000000000ea1e7e llvm::AVRMCCodeEmitter::encodeInstruction(llvm::MCInst const&, llvm::raw_ostream&, llvm::SmallVectorImpl<llvm::MCFixup>&, llvm::MCSubtargetInfo const&) const (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0xea1e7e)
#10 0x000000000140f2fa llvm::MCELFStreamer::EmitInstToData(llvm::MCInst const&, llvm::MCSubtargetInfo const&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x140f2fa)
#11 0x0000000001421762 llvm::MCObjectStreamer::EmitInstruction(llvm::MCInst const&, llvm::MCSubtargetInfo const&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x1421762)
#12 0x0000000000e7481a llvm::AVRAsmPrinter::EmitInstruction(llvm::MachineInstr const*) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0xe7481a)
#13 0x0000000000f9ab79 llvm::AsmPrinter::EmitFunctionBody() (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0xf9ab79)
#14 0x0000000000cdb371 llvm::AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0xcdb371)
#15 0x00000000010e8523 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x10e8523)
#16 0x000000000138ba03 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x138ba03)
#17 0x000000000138baac llvm::FPPassManager::runOnModule(llvm::Module&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x138baac)
#18 0x000000000138c80f llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x138c80f)
#19 0x0000000000697f74 compileModule(char**, llvm::LLVMContext&) (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x697f74)
#20 0x0000000000646a90 main (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x646a90)
#21 0x00007f6cd99cdf45 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f45)
#22 0x000000000068d2b5 _start (/home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc+0x68d2b5)
Stack dump:
0.	Program arguments: /home/cactus/prog/rust/rust-avr/build/build/x86_64-unknown-linux-gnu/llvm/bin/llc a.ll --filetype=obj 
1.	Running pass 'Function Pass Manager' on module 'a.ll'.
2.	Running pass 'AVR Assembly Printer' on function '@f'
@gergoerdi
Copy link
Collaborator Author

Related: avr-llvm/llvm#213

@shepmaster
Copy link
Member

Compiling this into .s results in

Looks like you pasted the LLVM IR again.

@shepmaster shepmaster added A-llvm Affects the LLVM AVR backend has-reduced-testcase A small LLVM IR file exists that demonstrates the problem labels May 14, 2017
@gergoerdi
Copy link
Collaborator Author

Oops, sorry. The correct asm is:

	.text
	.file	"a.ll"
	.globl	f
	.p2align	1
	.type	f,@function
f:                                      ; @f
; BB#0:                                 ; %start
	mov	r26, r24
	mov	r27, r25
	in	r0, 63
	cli
	ld	r24, X
	ldd	r25, X+1
	out	63, r0
	ldi	r24, 0
	ret
.Lfunc_end0:
	.size	f, .Lfunc_end0-f

@shepmaster
Copy link
Member

I guess the ld or the ldd with X?

Yes, ldd only works with Y or Z.

@gergoerdi
Copy link
Collaborator Author

Actually, isn't the real bug that it is using ldd in the elaboration of LDWRdPtr for the high byte? Shouldn't it be expanded to ld r24, X; ld r25, X+1?

@shepmaster
Copy link
Member

Shouldn't it be expanded to ld r24, X; ld r25, X+1?

It could not because ld doesn't support "displacement":

LD – Load Indirect from Data Space to Register using Index X
Loads one byte indirect from the data space to a register.

LD (LDD) – Load Indirect from Data Space to Register using Index Y
Loads one byte indirect with or without displacement from the data space to a register.

LD (LDD) – Load Indirect From Data Space to Register using Index Z
Loads one byte indirect with or without displacement from the data space to a register.

That being said...

The X-pointer Register can either be left unchanged by the operation, or it can be post-incremented or pre- decremented.

We should be able to make use of that to change this

	ld	r24, X
	ldd	r25, X+1

into (pseudo ASM)

	ld	r24, X++
	ld	r25, X

Which should reduce the number of usages of that register class

@gergoerdi
Copy link
Collaborator Author

Yes, I meant ld X+ followed by ld.

Unfortunately, changing the expansion to that still fails, but with a different error message now: in

  auto MIBLO = buildMI(MBB, MBBI, AVR::LDRdPtrPi)
    .addReg(CurDstLoReg, RegState::Define)
    .addReg(SrcReg);

(with LDWRdPtr changed to take an LDSTPtrReg as source)

this fails with

llc: CodeGen/MachineOperand.h:407: void llvm::MachineOperand::setIsEarlyClobber(bool): Assertion `isReg() && IsDef && "Wrong MachineOperand accessor"' failed.

Stack dump:
0.	Program arguments: llc --mcpu=atmega328p a.ll --debug 
1.	Running pass 'Function Pass Manager' on module 'a.ll'.
2.	Running pass 'AVR pseudo instruction expansion pass' on function '@f'

@dylanmckay
Copy link
Member

llc: CodeGen/MachineOperand.h:407: void llvm::MachineOperand::setIsEarlyClobber(bool): Assertion `isReg() && IsDef && "Wrong MachineOperand accessor"' failed.

Stack dump:
0.	Program arguments: llc --mcpu=atmega328p a.ll --debug 
1.	Running pass 'Function Pass Manager' on module 'a.ll'.
2.	Running pass 'AVR pseudo instruction expansion pass' on function '@f'

I've seen this number of times, it normally happens when you change something on a pseudo instruction.

It looks like I've got a similar bug which I've fixed at avr-llvm/llvm#37, which was fixed in 26896ec4. In that case, there was just a typo.

@gergoerdi
Copy link
Collaborator Author

You're right, I can fix this by removing the @earlyclobber $base_wb annotation from LDRdPtrPi which I don't think we need since $base_wb is an input port anyway. But given my LLVM noob state, can you please confirm that this won't break something else?

@gergoerdi
Copy link
Collaborator Author

OK, the problem then is that now I have the following in AVRInstrInfo.td:

let mayLoad = 1,
hasSideEffects = 0,
Constraints = "$base_wb = $ptrreg,@earlyclobber $reg" in
{
  def LDRdPtrPi : FSTLD<0,
                        0b01,
                        (outs GPR8:$reg, PTRREGS:$base_wb),
                        (ins LDSTPtrReg:$ptrreg),
                        "ld\t$reg, $ptrreg+",
                        []>,
                  Requires<[HasSRAM]>;

but that $base_wb screws up the indexing of operands in the generated AVRGenMCCodeEmitter.inc:

    case AVR::LDRdPtrPi: {
      // op: ptrreg
      op = encodeLDSTPtrReg(MI, 2, Fixups, STI);
      Value |= (op & UINT64_C(3)) << 2;
      // op: reg
      op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
      Value |= (op & UINT64_C(31)) << 4;
      Value = loadStorePostEncoder(MI, Value, STI);
      break;
    }

The problematic line is, of course, this one:

      op = encodeLDSTPtrReg(MI, 2, Fixups, STI);

since MI will only have two operands (reg and ptrreg).

I understand that $base_wb is an output because X is changed by this instruction, and it's also not a real degree of freedom, so it's not something to be stored as an extra operand; but something is clearly wrong here.

Do we have any other, working examples of instructions that change one of their arguments as a side-effect?

@gergoerdi
Copy link
Collaborator Author

I'm still struggling with this; now because I haven't figured out yet how to tell in the definition of LDWRdPtr that it screws up Z as a side-effect.

@dylanmckay
Copy link
Member

I think in this case we may actually need @earlyclobber base_wb.

In this case, both base_wb and ptrreg are inputs and outputs. The instruction itself only has two operands - a gpr register (which is only written) and a pointer register (which is both read and written).

In LLVM, in order to have an operand which is both input AND output, you define an input and you define a separate output. Then you add let Constraints = "$base_wb = $ptrreg so that LLVM knows that they must be the same.

We also must add let Constraints = "earlyclobber @base_wb" in order to tell LLVM that the register that gets allocated to $base_wb cannot appear in more than one place in the instruction. This is how we encode the constraint that you cannot have the same source and destination.

Obviously that because this pseudo instruction increments the Z pointer register before it does the second 8-bit load, we cannot have the same source and destination, so we must use earlyclobber.

I'm still struggling with this; now because I haven't figured out yet how to tell in the definition of LDWRdPtr that it screws up Z as a side-effect

If you clobber Z, all you should need to do is define Z as a value that the instruction defines.

'Z' is really just an alias of R31R30, you should be able to just write let Defs = [R31R30].

You can see the definition of the des instruction (which clobbers half of the GPRs at once)

let Defs = [R15, R14, R13, R12, R11, R10, R9,
    R8, R7, R6, R5, R4, R3, R2, R1, R0] in
def DESK : FDES<(outs),
                (ins i8imm:$k),
                "des\t$k",
                []>,
           Requires<[HasDES]>;

@gergoerdi
Copy link
Collaborator Author

OK so I managed to solve the fundamental problem of LDW changing X as a side-effect; and I can now emit LDW during instruction selection (in AVRISelDAGToDAG) matching a 16-bit load, and then expand that LDW into LD X++; LD during pseudo-instruction expansion.

However, the same approach doesn't seem to work for AtomicLoad16, since that seems to be its own pseudo-instruction, and turning that into an LDW during expansion fails.

The reason for that is that LDW has three results: the register being written into, the modified pointer (double-)register, and the chain. But Load and AtomicLoad16 only have two results: the register being written into and the chain. So during ISelDAGToDAG there's a bit of re-shuffling being done, which I don't know yet how to do (or if it's even possible) during pseudoinstruction expansion.

@gergoerdi
Copy link
Collaborator Author

I think avr-rust/llvm@331ded7 is safe for upstreaming on its own, even though it doesn't solve the Atomic*16 problem. That one will need to build on top of the fixed LDW instruction.

@dylanmckay
Copy link
Member

Just took a peek at it and it looks good to me

Added a personal note to upstream this in the next few days

@gergoerdi
Copy link
Collaborator Author

I should have held my horses -- I am now seeing CHIP-8 diverging at some point with the LDW fix :( Don't upstream it just yet, until I've had time to debug what's going on.

@dylanmckay
Copy link
Member

That's algood, no rush :)

@dylanmckay
Copy link
Member

It looks like this problem is caused because the AtomicLoad16 instruction inherits from AtomicLoad, which is defined to always use PTRREGS (X,Y,Z) as the pointer.

The problem is that with 16-bit loads, we generate a LDWRdPtr instruction, which requires PTRDISPREGS (X,Y) because the postincrement instructions don't support X.

I will adjust the AtomicLoad class so that it takes a pointer register class as an a type parameter. I believe the same problem exists for 16-bit stores as well.

@dylanmckay
Copy link
Member

Got a fix, upstreaming and cherry-picking now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-llvm Affects the LLVM AVR backend has-reduced-testcase A small LLVM IR file exists that demonstrates the problem
Projects
None yet
Development

No branches or pull requests

3 participants