Calculation of regunitmasks seems wrong #78942

kaz7 · 2024-01-22T06:04:51Z

Calculation of regunitmasks seems wrong after https://reviews.llvm.org/D157864.

Hi, @jayfoad and @qcolombet, VE is having following troubles after above patch. The existing regression tests are passed, but the calculation of livein and regunitmasks become wrong. I'll explain what are causing problems. Please consider what is the best way to solve this problem. (I add @jayfoad since I though you are foad on llvm phablicator. Sorry if it's not correct.)

Livein is calculated wrongly because of following code.

diff --git a/llvm/lib/CodeGen/MachineSink.cpp b/llvm/lib/CodeGen/MachineSink.cpp
index 8fda4c3e3e83..b4cbb93d758e 100644
--- a/llvm/lib/CodeGen/MachineSink.cpp
+++ b/llvm/lib/CodeGen/MachineSink.cpp
@@ -1704,10 +1704,9 @@ static void updateLiveIn(MachineInstr *MI, MachineBasicBlock *SuccBB,
   for (auto U : UsedOpsInCopy) {
     Register SrcReg = MI->getOperand(U).getReg();
     LaneBitmask Mask;
-    for (MCRegUnitMaskIterator S(SrcReg, TRI); S.isValid(); ++S) {
+    for (MCRegUnitMaskIterator S(SrcReg, TRI); S.isValid(); ++S)
       Mask |= (*S).second;
-    }
-    SuccBB->addLiveIn(SrcReg, Mask.any() ? Mask : LaneBitmask::getAll());
+    SuccBB->addLiveIn(SrcReg, Mask);
   }
   SuccBB->sortUniqueLiveIns();
 }

LaneBitmasks from all MCRegUnitMaskIterator of SrcReg sometimes becomes none() because of 2nd problem. As a result, the register is treated as not livein although it is alive. IMO, it's better to think a register is alive here. Since, even whole lanebitmasks don't match, a register is used somehow from the point of livein view.

computeRegUnitLaneMasks calculates different lanebitmask for each regunit.

diff --git a/llvm/utils/TableGen/CodeGenRegisters.cpp b/llvm/utils/TableGen/CodeGenRegisters.cpp
index 74fb80610b9c..b8dde6e946a4 100644
--- a/llvm/utils/TableGen/CodeGenRegisters.cpp
+++ b/llvm/utils/TableGen/CodeGenRegisters.cpp
@@ -2123,8 +2123,8 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
   for (auto &Register : Registers) {
     // Create an initial lane mask for all register units.
     const auto &RegUnits = Register.getRegUnits();
-    CodeGenRegister::RegUnitLaneMaskList
-        RegUnitLaneMasks(RegUnits.count(), LaneBitmask::getNone());
+    CodeGenRegister::RegUnitLaneMaskList RegUnitLaneMasks(
+        RegUnits.count(), LaneBitmask::getAll());
     // Iterate through SubRegisters.
     typedef CodeGenRegister::SubRegMap SubRegMap;
     const SubRegMap &SubRegs = Register.getSubRegs();
@@ -2143,7 +2143,7 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
         unsigned u = 0;
         for (unsigned RU : RegUnits) {
           if (SUI == RU) {
-            RegUnitLaneMasks[u] |= LaneMask;
+            RegUnitLaneMasks[u] &= LaneMask;
             assert(!Found);
             Found = true;
           }

I'm not sure this is correct or not. However, the result of calculation is different. For example, let's consider that we have two subreg with 0x0001 and 0x0002 lanemasks. original code works like this. 0x0001 | 0x0002 => 0x0003.

computeRegUnitLaneMasks 0, 0000000000000000 |= 0000000000000001 -> 0000000000000001
computeRegUnitLaneMasks 0, 0000000000000001 |= 0000000000000002 -> 0000000000000003

The modified code works like this. 0x0001 & 0x0002 => 0x0000.

computeRegUnitLaneMasks 0, FFFFFFFFFFFFFFFF &= 0000000000000001 -> 0000000000000001
computeRegUnitLaneMasks 0, 0000000000000001 &= 0000000000000002 -> 0000000000000000

This calculation causes none() LaneBitmask in the 1st problem.

I'm still investigating these problems. So, I'm not sure what is the best, just reverting previous patch or solving this problem. But, I know above patch is causing above problems as a fact. So, I just decide to let you guys know it. Thanks.

The text was updated successfully, but these errors were encountered:

jayfoad · 2024-01-22T10:22:05Z

Could you run this code in a debugger to work out exactly which register and subregisters are involved? I can't understand how you would end up with zero from this calculation. You would need:

a single register R
with two leaf subregisters S1 and S2
S1 and S2 share a regunit U (i.e. they overlap)
but the lanemasks for S1 and S2 are disjoint.

kaz7 · 2024-01-22T10:29:44Z

Thank you for checking this issue. Please give me a few days. I'll give you those information.

kaz7 · 2024-01-22T11:02:17Z

It didn't take a few days. This is answer to you.

You would need:

a single register R

For example, it is a register SX5 (EnumValue is 192).

with two leaf subregisters S1 and S2

They are SF5 and SW5.

S1 and S2 share a regunit U (i.e. they overlap)

Yes. SX5, 64 bit register, represents 64 bit float or integer value. SF5, high 32 bits, represents 32 bit float value. SW5, low 32 bits, represents 32 bit integer value.

but the lanemasks for S1 and S2 are disjoint.

Yes. Those are disjoint.

The source code of VE Register information is below.

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/VE/VERegisterInfo.td#L127

jayfoad · 2024-01-22T11:07:56Z

It didn't take a few days. This is answer to you.

You would need:

a single register R

For example, it is a register SX5 (EnumValue is 192).

with two leaf subregisters S1 and S2

They are SF5 and SW5.

S1 and S2 share a regunit U (i.e. they overlap)

Yes. SX5, 64 bit register, represents 64 bit float or integer value. SF5, high 32 bits, represents 32 bit float value. SW5, low 32 bits, represents 32 bit integer value.

but the lanemasks for S1 and S2 are disjoint.

Yes. Those are disjoint.

The source code of VE Register information is below.

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/VE/VERegisterInfo.td#L127

Thank you for the example, but why would SF5 and SW5 share a regunit U? If they do not share a regunit then RegUnitLaneMasks[u] &= LaneMask will not be executed twice for the same value of u.

kaz7 · 2024-01-22T11:23:31Z

Thank you for the example, but why would SF5 and SW5 share a regunit U? If they do not share a regunit then RegUnitLaneMasks[u] &= LaneMask will not be executed twice for the same value of u.

I'm not sure why u is identical, this case it is zero. Moreover, I'm not sure what u means unfortunately. So, let me just explain how it runs using the source code.

The subregs loop, L2151, runs over each subregs of SX5 register. This case, it runs over SF5 and SW5 as subregisters of SX5.

The &= expression, L2166, runs twice for both SF5 and SW5.

Before the subregs loop, L2151, RegUnitLaneMasks[0] is initialized by 0xFFFFFFFFFFFFFFFF.
In the first iteration for SF5, the UnitLanemask is 0x0000000000000001 and u is initialized by 0. RegUnitLaneMasks[u] is modified to 0x0000000000000001 as a result of 0xFFFFFFFFFFFFFFFF & 0x0000000000000001.
In the second iteration for SW5, the UnitLanemask is 0x0000000000000002 and u is again initialized by 0. RegUnitLaneMasks[u] is modified to 0x0000000000000000 as a result of 0x0000000000000001 & 0000000000000002

Does this expression make sense?

jayfoad · 2024-01-22T12:05:15Z

Can you try an experiment please?

diff --git a/llvm/lib/Target/VE/VERegisterInfo.td b/llvm/lib/Target/VE/VERegisterInfo.td
index fbe71d383e1e..529151ef9a87 100644
--- a/llvm/lib/Target/VE/VERegisterInfo.td
+++ b/llvm/lib/Target/VE/VERegisterInfo.td
@@ -105,10 +105,8 @@ foreach I = 0-63 in
   def SW#I : VEReg<I, "sw"#I, [], ["s"#I]>, DwarfRegNum<[I]>;
 
 // Generic floating point registers - 32 bits wide
-//   NOTE: Mark SF#I as alias of SW#I temporary to avoid register allocation
-//         problem.
 foreach I = 0-63 in
-  def SF#I : VEReg<I, "sf"#I, [], ["s"#I], [!cast<VEReg>("SW"#I)]>,
+  def SF#I : VEReg<I, "sf"#I, [], ["s"#I]>,
                    DwarfRegNum<[I]>;
 
 // Generic integer registers - 64 bits wide

jayfoad · 2024-01-22T12:08:12Z

To explain: the use of Aliases ("ad hoc aliasing") changes the behaviour of CodeGenRegister::computeSubRegs in ways that I am not too familiar with.

kaz7 · 2024-01-22T13:13:22Z

Thank you for that experiment. Applying your patch fixes both problems I mentioned in this ticket.

I'm also not familiar with Aliases behavior. So, I cannot say how it should be...

At least, I think the alias above should be removed if the register allocation problem is solved for VE. I'll check VE's behavior after applying this expreiment modification. But it is out of the scope of this ticket little bit. I appreciate your help. Thank you so much.

Remove SF# register's aliases in order to solve a problem caused by D157864. llvm/llvm-project#78942

kaz7 · 2024-01-23T02:09:35Z

I remember the allocation problem. Like I described, SF5 and SW5 shares an identical SX5 register. Both are disjoint. However, many instructions contaminate other part. So, in fact, it is impossible to use SF5 and SW5 simultaneously. Therefore, I'd like to not use for example SW5 while SF5 is alive. I don't know how to let it know to the register allocators. Therefore, I used alias for that purpose. However, this trick is not useable after D157864.

This problem is still existing. So, I need to find the correct way to not use SF#N and SW#N simultaneously. I appreciate if you know such options by any chance. Thanks!

kaz7 · 2024-01-23T08:25:03Z

I cannot find any other options to force register allocator to not allocate SF and SW registers simultaneously except Aliases.

  // Aliases - A list of registers that this register overlaps with.  A read or
  // modification of this register can potentially read or modify the aliased
  // registers.
  list<Register> Aliases = [];

And the aliases' expression like A read or modification of this register can potentially read or modify the aliased register fits very well for the relationship between SF and SW registers.

jayfoad · 2024-01-23T11:23:56Z

Like I described, SF5 and SW5 shares an identical SX5 register. Both are disjoint. However, many instructions contaminate other part.

I see. To model that properly I guess you could make the instructions define an SX result and then extract the appropriate SF or SW subreg from it. But that sounds like a lot of work.

Maybe we can partly revert tablegen to the old behaviour. Can you try a patch like this, to see if it fixes your problem without breaking any other in-tree targets?

diff --git a/llvm/utils/TableGen/CodeGenRegisters.cpp b/llvm/utils/TableGen/CodeGenRegisters.cpp
index d1abdb74ea4a..b9d404aaee5f 100644
--- a/llvm/utils/TableGen/CodeGenRegisters.cpp
+++ b/llvm/utils/TableGen/CodeGenRegisters.cpp
@@ -2144,7 +2144,7 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
     // Create an initial lane mask for all register units.
     const auto &RegUnits = Register.getRegUnits();
     CodeGenRegister::RegUnitLaneMaskList RegUnitLaneMasks(
-        RegUnits.count(), LaneBitmask::getAll());
+        RegUnits.count(), LaneBitmask::getNone());
     // Iterate through SubRegisters.
     typedef CodeGenRegister::SubRegMap SubRegMap;
     const SubRegMap &SubRegs = Register.getSubRegs();
@@ -2163,7 +2163,7 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
         unsigned u = 0;
         for (unsigned RU : RegUnits) {
           if (SUI == RU) {
-            RegUnitLaneMasks[u] &= LaneMask;
+            RegUnitLaneMasks[u] |= LaneMask;
             assert(!Found);
             Found = true;
           }
@@ -2173,6 +2173,10 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
         assert(Found);
       }
     }
+    for (auto &Mask : RegUnitLaneMasks) {
+      if (Mask.none())
+        Mask = LaneBitmask::getAll();
+    }
     Register.setRegUnitLaneMasks(RegUnitLaneMasks);
   }
 }

kaz7 · 2024-01-23T13:39:31Z

Thank you for new partial reverting. I'll try that tomorrow.

Like I described, SF5 and SW5 shares an identical SX5 register. Both are disjoint. However, many instructions contaminate other part.

I see. To model that properly I guess you could make the instructions define an SX result and then extract the appropriate SF or SW subreg from it. But that sounds like a lot of work.

Yes. That's a lot of work. But thank you for modeling idea.

Change computeRegUnitLaneMasks to partially revert D157864 modification since this cause some problems on VE backend. llvm/llvm-project#78942 (comment)

kaz7 · 2024-01-24T00:53:15Z

This partial reverting works great. All regression tests including internal tests which caused errors previously are passed. I'm looking forward to seeing this patch in main branch. Thank you!!

When a whole register is added a basic block's liveins, use LaneBitmask::getAll for the live lanes instead of trying to calculate an accurate mask of the lanes that comprise the register. This simplifies the code and matches other places where a whole register is marked as livein. This also avoids problems when regunits that are synthesized by TableGen to represent ad hoc aliasing have a lane mask of 0. Fixes llvm#78942

When a whole register is added a basic block's liveins, use LaneBitmask::getAll for the live lanes instead of trying to calculate an accurate mask of the lanes that comprise the register. This simplifies the code and matches other places where a whole register is marked as livein. This also avoids problems when regunits that are synthesized by TableGen to represent ad hoc aliasing have a lane mask of 0. Fixes #78942

When a whole register is added a basic block's liveins, use LaneBitmask::getAll for the live lanes instead of trying to calculate an accurate mask of the lanes that comprise the register. This simplifies the code and matches other places where a whole register is marked as livein. This also avoids problems when regunits that are synthesized by TableGen to represent ad hoc aliasing have a lane mask of 0. Fixes llvm#78942

github-actions bot added the new issue label Jan 22, 2024

EugeneZelenko added llvm:codegen and removed new issue labels Jan 22, 2024

kaz7 added a commit to sx-aurora-dev/llvm-project that referenced this issue Jan 22, 2024

Remove SF# register's aliases

efa3e8e

Remove SF# register's aliases in order to solve a problem caused by D157864. llvm/llvm-project#78942

kaz7 added a commit to sx-aurora-dev/llvm-project that referenced this issue Jan 23, 2024

Change computeRegUnitLaneMasks

a99cceb

Change computeRegUnitLaneMasks to partially revert D157864 modification since this cause some problems on VE backend. llvm/llvm-project#78942 (comment)

This was referenced Jan 24, 2024

Change computeRegUnitLaneMasks sx-aurora-dev/llvm-project#286

Merged

develop has problem sx-aurora-dev/llvm-project#285

Closed

jayfoad mentioned this issue Jan 25, 2024

[TableGen] Fix computeRegUnitLaneMasks for ad hoc aliasing #79462

Closed

jayfoad mentioned this issue Jan 29, 2024

[CodeGen] Simplify updateLiveIn in MachineSink #79831

Merged

jayfoad closed this as completed in #79831 Feb 15, 2024

jayfoad mentioned this issue May 6, 2025

[VE][TableGen] Corrupted LaneBitmask representing regLiveUnits for few registerClasses #138552

Closed

vg0204 mentioned this issue May 7, 2025

[WIP][CodeGen] Modifying MBB's liveins representation as into regUnits #129847

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Calculation of regunitmasks seems wrong #78942

Calculation of regunitmasks seems wrong #78942

kaz7 commented Jan 22, 2024 •

edited

Loading

jayfoad commented Jan 22, 2024

kaz7 commented Jan 22, 2024

kaz7 commented Jan 22, 2024

jayfoad commented Jan 22, 2024

kaz7 commented Jan 22, 2024 •

edited

Loading

jayfoad commented Jan 22, 2024

jayfoad commented Jan 22, 2024

kaz7 commented Jan 22, 2024

kaz7 commented Jan 23, 2024

kaz7 commented Jan 23, 2024

jayfoad commented Jan 23, 2024

kaz7 commented Jan 23, 2024

kaz7 commented Jan 24, 2024

Calculation of regunitmasks seems wrong #78942

Calculation of regunitmasks seems wrong #78942

Comments

kaz7 commented Jan 22, 2024 • edited Loading

jayfoad commented Jan 22, 2024

kaz7 commented Jan 22, 2024

kaz7 commented Jan 22, 2024

jayfoad commented Jan 22, 2024

kaz7 commented Jan 22, 2024 • edited Loading

jayfoad commented Jan 22, 2024

jayfoad commented Jan 22, 2024

kaz7 commented Jan 22, 2024

kaz7 commented Jan 23, 2024

kaz7 commented Jan 23, 2024

jayfoad commented Jan 23, 2024

kaz7 commented Jan 23, 2024

kaz7 commented Jan 24, 2024

kaz7 commented Jan 22, 2024 •

edited

Loading

kaz7 commented Jan 22, 2024 •

edited

Loading