-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Calculation of regunitmasks seems wrong #78942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Could you run this code in a debugger to work out exactly which register and subregisters are involved? I can't understand how you would end up with zero from this calculation. You would need:
|
Thank you for checking this issue. Please give me a few days. I'll give you those information. |
It didn't take a few days. This is answer to you.
For example, it is a register SX5 (EnumValue is 192).
They are SF5 and SW5.
Yes. SX5, 64 bit register, represents 64 bit float or integer value. SF5, high 32 bits, represents 32 bit float value. SW5, low 32 bits, represents 32 bit integer value.
Yes. Those are disjoint. The source code of VE Register information is below. https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/VE/VERegisterInfo.td#L127 |
Thank you for the example, but why would SF5 and SW5 share a regunit U? If they do not share a regunit then |
I'm not sure why The subregs loop, L2151, runs over each subregs of SX5 register. This case, it runs over SF5 and SW5 as subregisters of SX5. The
Does this expression make sense? |
Can you try an experiment please? diff --git a/llvm/lib/Target/VE/VERegisterInfo.td b/llvm/lib/Target/VE/VERegisterInfo.td
index fbe71d383e1e..529151ef9a87 100644
--- a/llvm/lib/Target/VE/VERegisterInfo.td
+++ b/llvm/lib/Target/VE/VERegisterInfo.td
@@ -105,10 +105,8 @@ foreach I = 0-63 in
def SW#I : VEReg<I, "sw"#I, [], ["s"#I]>, DwarfRegNum<[I]>;
// Generic floating point registers - 32 bits wide
-// NOTE: Mark SF#I as alias of SW#I temporary to avoid register allocation
-// problem.
foreach I = 0-63 in
- def SF#I : VEReg<I, "sf"#I, [], ["s"#I], [!cast<VEReg>("SW"#I)]>,
+ def SF#I : VEReg<I, "sf"#I, [], ["s"#I]>,
DwarfRegNum<[I]>;
// Generic integer registers - 64 bits wide |
To explain: the use of |
Thank you for that experiment. Applying your patch fixes both problems I mentioned in this ticket. I'm also not familiar with At least, I think the alias above should be removed if the |
Remove SF# register's aliases in order to solve a problem caused by D157864. llvm/llvm-project#78942
I remember the This problem is still existing. So, I need to find the correct way to not use SF#N and SW#N simultaneously. I appreciate if you know such options by any chance. Thanks! |
I cannot find any other options to force register allocator to not allocate SF and SW registers simultaneously except
And the aliases' expression like |
I see. To model that properly I guess you could make the instructions define an SX result and then extract the appropriate SF or SW subreg from it. But that sounds like a lot of work. Maybe we can partly revert tablegen to the old behaviour. Can you try a patch like this, to see if it fixes your problem without breaking any other in-tree targets? diff --git a/llvm/utils/TableGen/CodeGenRegisters.cpp b/llvm/utils/TableGen/CodeGenRegisters.cpp
index d1abdb74ea4a..b9d404aaee5f 100644
--- a/llvm/utils/TableGen/CodeGenRegisters.cpp
+++ b/llvm/utils/TableGen/CodeGenRegisters.cpp
@@ -2144,7 +2144,7 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
// Create an initial lane mask for all register units.
const auto &RegUnits = Register.getRegUnits();
CodeGenRegister::RegUnitLaneMaskList RegUnitLaneMasks(
- RegUnits.count(), LaneBitmask::getAll());
+ RegUnits.count(), LaneBitmask::getNone());
// Iterate through SubRegisters.
typedef CodeGenRegister::SubRegMap SubRegMap;
const SubRegMap &SubRegs = Register.getSubRegs();
@@ -2163,7 +2163,7 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
unsigned u = 0;
for (unsigned RU : RegUnits) {
if (SUI == RU) {
- RegUnitLaneMasks[u] &= LaneMask;
+ RegUnitLaneMasks[u] |= LaneMask;
assert(!Found);
Found = true;
}
@@ -2173,6 +2173,10 @@ void CodeGenRegBank::computeRegUnitLaneMasks() {
assert(Found);
}
}
+ for (auto &Mask : RegUnitLaneMasks) {
+ if (Mask.none())
+ Mask = LaneBitmask::getAll();
+ }
Register.setRegUnitLaneMasks(RegUnitLaneMasks);
}
} |
Thank you for new partial reverting. I'll try that tomorrow.
Yes. That's a lot of work. But thank you for modeling idea. |
Change computeRegUnitLaneMasks to partially revert D157864 modification since this cause some problems on VE backend. llvm/llvm-project#78942 (comment)
This partial reverting works great. All regression tests including internal tests which caused errors previously are passed. I'm looking forward to seeing this patch in main branch. Thank you!! |
When a whole register is added a basic block's liveins, use LaneBitmask::getAll for the live lanes instead of trying to calculate an accurate mask of the lanes that comprise the register. This simplifies the code and matches other places where a whole register is marked as livein. This also avoids problems when regunits that are synthesized by TableGen to represent ad hoc aliasing have a lane mask of 0. Fixes llvm#78942
When a whole register is added a basic block's liveins, use LaneBitmask::getAll for the live lanes instead of trying to calculate an accurate mask of the lanes that comprise the register. This simplifies the code and matches other places where a whole register is marked as livein. This also avoids problems when regunits that are synthesized by TableGen to represent ad hoc aliasing have a lane mask of 0. Fixes #78942
When a whole register is added a basic block's liveins, use LaneBitmask::getAll for the live lanes instead of trying to calculate an accurate mask of the lanes that comprise the register. This simplifies the code and matches other places where a whole register is marked as livein. This also avoids problems when regunits that are synthesized by TableGen to represent ad hoc aliasing have a lane mask of 0. Fixes llvm#78942
When a whole register is added a basic block's liveins, use LaneBitmask::getAll for the live lanes instead of trying to calculate an accurate mask of the lanes that comprise the register. This simplifies the code and matches other places where a whole register is marked as livein. This also avoids problems when regunits that are synthesized by TableGen to represent ad hoc aliasing have a lane mask of 0. Fixes llvm#78942
Calculation of regunitmasks seems wrong after https://reviews.llvm.org/D157864.
Hi, @jayfoad and @qcolombet, VE is having following troubles after above patch. The existing regression tests are passed, but the calculation of livein and regunitmasks become wrong. I'll explain what are causing problems. Please consider what is the best way to solve this problem. (I add @jayfoad since I though you are foad on llvm phablicator. Sorry if it's not correct.)
I'm not sure this is correct or not. However, the result of calculation is different. For example, let's consider that we have two subreg with 0x0001 and 0x0002 lanemasks. original code works like this. 0x0001 | 0x0002 => 0x0003.
The modified code works like this. 0x0001 & 0x0002 => 0x0000.
This calculation causes none() LaneBitmask in the 1st problem.
I'm still investigating these problems. So, I'm not sure what is the best, just reverting previous patch or solving this problem. But, I know above patch is causing above problems as a fact. So, I just decide to let you guys know it. Thanks.
The text was updated successfully, but these errors were encountered: