[NVPTX] support switch statement with brx.idx #102400

Merged on Aug 8, 2024 (2 commits)

AlexMaclean
Member

Add custom lowering of BR_JT DAG nodes to the brx.idx PTX instruction (PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx). Depending on the heuristics in DAG selection, switch statements may now be lowered using brx.idx.
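
As a rough mental model of what the new lowering expresses (this is not LLVM code), the PTX in the test added by this PR performs an unsigned range check and then an indexed branch through a fixed list of targets. The sketch below mirrors that shape for the four-case switch in the test; `Target` and `dispatch` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-ins for the four case blocks and the shared exit block.
enum class Target { Case0, Case1, Case2, Case3, End };

// Models "setp.gt.u32 + bra" followed by "brx.idx": an unsigned range check
// guards an indexed lookup into a fixed list of branch targets.
Target dispatch(std::uint32_t i) {
  static constexpr std::array<Target, 4> table = {
      Target::Case0, Target::Case1, Target::Case2, Target::Case3};
  if (i > 3u)       // out of range: take the default edge
    return Target::End;
  return table[i];  // in range: the 32-bit index selects the target
}
```

The index is a 32-bit value here because, as the diff notes, brx.idx only accepts an i32 index; that is why the PR adds a hook to choose the jump-table register type instead of always using the pointer type.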

@AlexMaclean AlexMaclean requested a review from Artem-B August 7, 2024 22:44
@AlexMaclean AlexMaclean self-assigned this Aug 7, 2024
@llvmbot llvmbot added backend:NVPTX llvm:SelectionDAG SelectionDAGISel as well labels Aug 7, 2024
@llvmbot
Member

llvmbot commented Aug 7, 2024

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-nvptx

Author: Alex MacLean (AlexMaclean)


Full diff: https://github.com/llvm/llvm-project/pull/102400.diff

6 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+6-5)
  • (modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp (+42-3)
  • (modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.h (+10)
  • (modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.td (+38)
  • (added) llvm/test/CodeGen/NVPTX/jump-table.ll (+69)
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 9ccdbab008aec8..5b2214fa66c40b 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3843,6 +3843,10 @@ class TargetLowering : public TargetLoweringBase {
   /// returned value is a member of the MachineJumpTableInfo::JTEntryKind enum.
   virtual unsigned getJumpTableEncoding() const;
 
+  virtual MVT getJumpTableRegTy(const DataLayout &DL) const {
+    return getPointerTy(DL);
+  }
+
   virtual const MCExpr *
   LowerCustomJumpTableEntry(const MachineJumpTableInfo * /*MJTI*/,
                             const MachineBasicBlock * /*MBB*/, unsigned /*uid*/,
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 9d617c7acd13c2..192fbf74b02dc0 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2977,7 +2977,7 @@ void SelectionDAGBuilder::visitJumpTable(SwitchCG::JumpTable &JT) {
   // Emit the code for the jump table
   assert(JT.SL && "Should set SDLoc for SelectionDAG!");
   assert(JT.Reg != -1U && "Should lower JT Header first!");
-  EVT PTy = DAG.getTargetLoweringInfo().getPointerTy(DAG.getDataLayout());
+  EVT PTy = DAG.getTargetLoweringInfo().getJumpTableRegTy(DAG.getDataLayout());
   SDValue Index = DAG.getCopyFromReg(getControlRoot(), *JT.SL, JT.Reg, PTy);
   SDValue Table = DAG.getJumpTable(JT.JTI, PTy);
   SDValue BrJumpTable = DAG.getNode(ISD::BR_JT, *JT.SL, MVT::Other,
@@ -3005,12 +3005,13 @@ void SelectionDAGBuilder::visitJumpTableHeader(SwitchCG::JumpTable &JT,
   // This value may be smaller or larger than the target's pointer type, and
   // therefore require extension or truncating.
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
-  SwitchOp = DAG.getZExtOrTrunc(Sub, dl, TLI.getPointerTy(DAG.getDataLayout()));
+  SwitchOp =
+      DAG.getZExtOrTrunc(Sub, dl, TLI.getJumpTableRegTy(DAG.getDataLayout()));
 
   unsigned JumpTableReg =
-      FuncInfo.CreateReg(TLI.getPointerTy(DAG.getDataLayout()));
-  SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), dl,
-                                    JumpTableReg, SwitchOp);
+      FuncInfo.CreateReg(TLI.getJumpTableRegTy(DAG.getDataLayout()));
+  SDValue CopyTo =
+      DAG.getCopyToReg(getControlRoot(), dl, JumpTableReg, SwitchOp);
   JT.Reg = JumpTableReg;
 
   if (!JTH.FallthroughUnreachable) {
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 516fc7339a4bf3..bf647c88f00e28 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -25,6 +25,7 @@
 #include "llvm/CodeGen/Analysis.h"
 #include "llvm/CodeGen/ISDOpcodes.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineJumpTableInfo.h"
 #include "llvm/CodeGen/MachineMemOperand.h"
 #include "llvm/CodeGen/SelectionDAG.h"
 #include "llvm/CodeGen/SelectionDAGNodes.h"
@@ -582,9 +583,7 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
   setOperationAction(ISD::ROTR, MVT::i8, Expand);
   setOperationAction(ISD::BSWAP, MVT::i16, Expand);
 
-  // Indirect branch is not supported.
-  // This also disables Jump Table creation.
-  setOperationAction(ISD::BR_JT, MVT::Other, Expand);
+  setOperationAction(ISD::BR_JT, MVT::Other, Custom);
   setOperationAction(ISD::BRIND, MVT::Other, Expand);
 
   setOperationAction(ISD::GlobalAddress, MVT::i32, Custom);
@@ -945,6 +944,9 @@ const char *NVPTXTargetLowering::getTargetNodeName(unsigned Opcode) const {
     MAKE_CASE(NVPTXISD::Dummy)
     MAKE_CASE(NVPTXISD::MUL_WIDE_SIGNED)
     MAKE_CASE(NVPTXISD::MUL_WIDE_UNSIGNED)
+    MAKE_CASE(NVPTXISD::BrxEnd)
+    MAKE_CASE(NVPTXISD::BrxItem)
+    MAKE_CASE(NVPTXISD::BrxStart)
     MAKE_CASE(NVPTXISD::Tex1DFloatS32)
     MAKE_CASE(NVPTXISD::Tex1DFloatFloat)
     MAKE_CASE(NVPTXISD::Tex1DFloatFloatLevel)
@@ -2785,6 +2787,8 @@ NVPTXTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
     return LowerFP_ROUND(Op, DAG);
   case ISD::FP_EXTEND:
     return LowerFP_EXTEND(Op, DAG);
+  case ISD::BR_JT:
+    return LowerBR_JT(Op, DAG);
   case ISD::VAARG:
     return LowerVAARG(Op, DAG);
   case ISD::VASTART:
@@ -2810,6 +2814,41 @@ NVPTXTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
   }
 }
 
+SDValue NVPTXTargetLowering::LowerBR_JT(SDValue Op, SelectionDAG &DAG) const {
+  SDLoc DL(Op);
+  SDValue Chain = Op.getOperand(0);
+  const auto *JT = cast<JumpTableSDNode>(Op.getOperand(1));
+  SDValue Index = Op.getOperand(2);
+
+  unsigned JId = JT->getIndex();
+  MachineJumpTableInfo *MJTI = DAG.getMachineFunction().getJumpTableInfo();
+  ArrayRef<MachineBasicBlock *> MBBs = MJTI->getJumpTables()[JId].MBBs;
+
+  SDValue IdV = DAG.getConstant(JId, DL, MVT::i32);
+
+  // Generate BrxStart node
+  SDVTList VTs = DAG.getVTList(MVT::Other, MVT::Glue);
+  Chain = DAG.getNode(NVPTXISD::BrxStart, DL, VTs, Chain, IdV);
+
+  // Generate BrxItem nodes
+  assert(!MBBs.empty());
+  for (MachineBasicBlock *MBB : MBBs.drop_back())
+    Chain = DAG.getNode(NVPTXISD::BrxItem, DL, VTs, Chain.getValue(0),
+                        DAG.getBasicBlock(MBB), Chain.getValue(1));
+
+  // Generate BrxEnd nodes
+  SDValue EndOps[] = {Chain.getValue(0), DAG.getBasicBlock(MBBs.back()), Index,
+                      IdV, Chain.getValue(1)};
+  SDValue BrxEnd = DAG.getNode(NVPTXISD::BrxEnd, DL, VTs, EndOps);
+
+  return BrxEnd;
+}
+
+// This will prevent AsmPrinter from trying to print the jump tables itself.
+unsigned NVPTXTargetLowering::getJumpTableEncoding() const {
+  return MachineJumpTableInfo::EK_Inline;
+}
+
 // This function is almost a copy of SelectionDAG::expandVAArg().
 // The only diff is that this one produces loads from local address space.
 SDValue NVPTXTargetLowering::LowerVAARG(SDValue Op, SelectionDAG &DAG) const {
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
index 63262961b363ed..32e6b044b0de1f 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
@@ -62,6 +62,9 @@ enum NodeType : unsigned {
   BFI,
   PRMT,
   DYNAMIC_STACKALLOC,
+  BrxStart,
+  BrxItem,
+  BrxEnd,
   Dummy,
 
   LoadV2 = ISD::FIRST_TARGET_MEMORY_OPCODE,
@@ -580,6 +583,11 @@ class NVPTXTargetLowering : public TargetLowering {
     return true;
   }
 
+  // The default is the same as pointer type, but brx.idx only accepts i32
+  MVT getJumpTableRegTy(const DataLayout &) const override { return MVT::i32; }
+
+  unsigned getJumpTableEncoding() const override;
+
   bool enableAggressiveFMAFusion(EVT VT) const override { return true; }
 
   // The default is to transform llvm.ctlz(x, false) (where false indicates that
@@ -637,6 +645,8 @@ class NVPTXTargetLowering : public TargetLowering {
 
   SDValue LowerSelect(SDValue Op, SelectionDAG &DAG) const;
 
+  SDValue LowerBR_JT(SDValue Op, SelectionDAG &DAG) const;
+
   SDValue LowerVAARG(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVASTART(SDValue Op, SelectionDAG &DAG) const;
 
diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index 6a096fa5acea7c..cec7f20255d352 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -3880,6 +3880,44 @@ def DYNAMIC_STACKALLOC64 :
             [(set Int64Regs:$ptr, (dyn_alloca Int64Regs:$size, (i32 timm:$align)))]>,
             Requires<[hasPTX<73>, hasSM<52>]>;
 
+
+//
+// BRX
+//
+
+def SDTBrxStartProfile : SDTypeProfile<0, 1, [SDTCisInt<0>]>;
+def SDTBrxItemProfile : SDTypeProfile<0, 1, [SDTCisVT<0, OtherVT>]>;
+def SDTBrxEndProfile : SDTypeProfile<0, 3, [SDTCisVT<0, OtherVT>, SDTCisInt<1>, SDTCisInt<2>]>;
+
+def brx_start :
+  SDNode<"NVPTXISD::BrxStart", SDTBrxStartProfile,
+         [SDNPHasChain, SDNPOutGlue, SDNPSideEffect]>;
+def brx_item :
+  SDNode<"NVPTXISD::BrxItem", SDTBrxItemProfile,
+         [SDNPHasChain, SDNPOutGlue, SDNPInGlue, SDNPSideEffect]>;
+def brx_end :
+  SDNode<"NVPTXISD::BrxEnd", SDTBrxEndProfile,
+         [SDNPHasChain, SDNPInGlue, SDNPSideEffect]>;
+
+let isTerminator = 1, isBranch = 1, isIndirectBranch = 1 in {
+
+  def BRX_START :
+    NVPTXInst<(outs), (ins i32imm:$id),
+              "$$L_brx_$id: .branchtargets",
+              [(brx_start (i32 imm:$id))]>;
+
+  def BRX_ITEM :
+    NVPTXInst<(outs), (ins brtarget:$target),
+              "$target,",
+              [(brx_item bb:$target)]>;
+
+  def BRX_END :
+    NVPTXInst<(outs), (ins brtarget:$target, Int32Regs:$val, i32imm:$id),
+              "$target;\n\tbrx.idx \t$val, $$L_brx_$id;",
+              [(brx_end bb:$target, (i32 Int32Regs:$val), (i32 imm:$id))]>;
+}
+
+
 include "NVPTXIntrinsics.td"
 
 //-----------------------------------
diff --git a/llvm/test/CodeGen/NVPTX/jump-table.ll b/llvm/test/CodeGen/NVPTX/jump-table.ll
new file mode 100644
index 00000000000000..8dd4115e2feb63
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/jump-table.ll
@@ -0,0 +1,69 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s | FileCheck %s
+; RUN: %if ptxas %{ llc < %s | %ptxas-verify %}
+
+target triple = "nvptx64-nvidia-cuda"
+
+@out = addrspace(1) global i32 0, align 4
+
+define void @foo(i32 %i) {
+; CHECK-LABEL: foo(
+; CHECK:       {
+; CHECK-NEXT:    .reg .pred %p<2>;
+; CHECK-NEXT:    .reg .b32 %r<7>;
+; CHECK-EMPTY:
+; CHECK-NEXT:  // %bb.0: // %entry
+; CHECK-NEXT:    ld.param.u32 %r2, [foo_param_0];
+; CHECK-NEXT:    setp.gt.u32 %p1, %r2, 3;
+; CHECK-NEXT:    @%p1 bra $L__BB0_6;
+; CHECK-NEXT:  // %bb.1: // %entry
+; CHECK-NEXT:    $L_brx_0: .branchtargets
+; CHECK-NEXT:    $L__BB0_2,
+; CHECK-NEXT:    $L__BB0_3,
+; CHECK-NEXT:    $L__BB0_4,
+; CHECK-NEXT:    $L__BB0_5;
+; CHECK-NEXT:    brx.idx %r2, $L_brx_0;
+; CHECK-NEXT:  $L__BB0_2: // %case0
+; CHECK-NEXT:    mov.b32 %r6, 0;
+; CHECK-NEXT:    st.global.u32 [out], %r6;
+; CHECK-NEXT:    bra.uni $L__BB0_6;
+; CHECK-NEXT:  $L__BB0_4: // %case2
+; CHECK-NEXT:    mov.b32 %r4, 2;
+; CHECK-NEXT:    st.global.u32 [out], %r4;
+; CHECK-NEXT:    bra.uni $L__BB0_6;
+; CHECK-NEXT:  $L__BB0_5: // %case3
+; CHECK-NEXT:    mov.b32 %r3, 3;
+; CHECK-NEXT:    st.global.u32 [out], %r3;
+; CHECK-NEXT:    bra.uni $L__BB0_6;
+; CHECK-NEXT:  $L__BB0_3: // %case1
+; CHECK-NEXT:    mov.b32 %r5, 1;
+; CHECK-NEXT:    st.global.u32 [out], %r5;
+; CHECK-NEXT:  $L__BB0_6: // %end
+; CHECK-NEXT:    ret;
+entry:
+  switch i32 %i, label %end [
+    i32 0, label %case0
+    i32 1, label %case1
+    i32 2, label %case2
+    i32 3, label %case3
+  ]
+
+case0:
+  store i32 0, ptr addrspace(1) @out, align 4
+  br label %end
+
+case1:
+  store i32 1, ptr addrspace(1) @out, align 4
+  br label %end
+
+case2:
+  store i32 2, ptr addrspace(1) @out, align 4
+  br label %end
+
+case3:
+  store i32 3, ptr addrspace(1) @out, align 4
+  br label %end
+
+end:
+  ret void
+}
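
The three pseudo-instructions in the diff (BRX_START, BRX_ITEM, BRX_END) each print one fragment of a single `.branchtargets` declaration plus the final `brx.idx`. The following standalone sketch reproduces that text assembly outside of LLVM, under the assumption that whitespace is normalized (the real asm strings use tabs). The function name `emitBrx` and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the text produced by BRX_START, a run of BRX_ITEMs, and BRX_END.
// The real lowering asserts that the target list is non-empty; this sketch
// returns an empty string in that case instead.
std::string emitBrx(unsigned id, const std::vector<std::string> &targets,
                    const std::string &indexReg) {
  if (targets.empty())
    return {};
  const std::string label = "$L_brx_" + std::to_string(id);
  // BRX_START: label and the .branchtargets directive.
  std::string out = label + ": .branchtargets\n";
  // BRX_ITEM: every target except the last, each followed by a comma.
  for (std::size_t k = 0; k + 1 < targets.size(); ++k)
    out += targets[k] + ",\n";
  // BRX_END: the last target closes the list, then the indexed branch.
  out += targets.back() + ";\n";
  out += "brx.idx " + indexReg + ", " + label + ";\n";
  return out;
}
```

Calling `emitBrx(0, {"$L__BB0_2", "$L__BB0_3", "$L__BB0_4", "$L__BB0_5"}, "%r2")` yields the same sequence of lines that the CHECK lines in jump-table.ll expect, modulo indentation. Splitting the output across three glued nodes keeps the fragments adjacent in the final instruction stream.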

Member

@Artem-B Artem-B left a comment

LGTM.

Does brx.idx buy us better code at the SASS level, or is this mostly cosmetic sugar at the PTX level?


  def BRX_ITEM :
    NVPTXInst<(outs), (ins brtarget:$target),
              "$target,",
Member

Nit: We may want to indent the labels in the list. Right now they seem to end up aligned on the instruction boundary, while they are actually arguments of the .branchtargets above. Does not impact functionality, but it looks somewhat odd.

Member Author

Fixed

@AlexMaclean
Member Author

Does brx.idx buy us better code at the SASS level, or is this mostly cosmetic sugar at the PTX level?

This does lead to improvements on the SASS level in some cases, though this is of course very dependent on ptxas optimizations.

@AlexMaclean AlexMaclean merged commit ba97697 into llvm:main Aug 8, 2024
8 checks passed
@llvm-ci
Collaborator

llvm-ci commented Aug 8, 2024

LLVM Buildbot has detected a new failure on builder llvm-nvptx-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/2997

Here is the relevant piece of the build log for the reference:

Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/Generic/switch-lower-feature.ll' FAILED ********************
Exit Code: 134

Command Output (stderr):
--
RUN: at line 1: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/CodeGen/Generic/switch-lower-feature.ll
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/llc
llc: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/lib/CodeGen/MachineBlockPlacement.cpp:2869: void {anonymous}::MachineBlockPlacement::buildCFGChains(): Assertion `(!TII->analyzeBranch(*PrevBB, TBB, FBB, Cond) || !PrevBB->canFallThrough()) && "Unexpected block with un-analyzable fallthrough!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/llc
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'Branch Probability Basic Block Placement' on function '@test2'
 #0 0x00007eee406a37c0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/lib/libLLVMSupport.so.20.0git+0x1dc7c0)
 #1 0x00007eee406a0bdf llvm::sys::RunSignalHandlers() (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/lib/libLLVMSupport.so.20.0git+0x1d9bdf)
 #2 0x00007eee406a0d35 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007eee3fe42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007eee3fe969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x00007eee3fe969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #6 0x00007eee3fe969fc pthread_kill ./nptl/pthread_kill.c:89:10
 #7 0x00007eee3fe42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #8 0x00007eee3fe287f3 abort ./stdlib/abort.c:81:7
 #9 0x00007eee3fe2871b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#10 0x00007eee3fe39e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#11 0x00007eee4255293f (anonymous namespace)::MachineBlockPlacement::buildCFGChains() MachineBlockPlacement.cpp:0:0
#12 0x00007eee42552d24 (anonymous namespace)::MachineBlockPlacement::runOnMachineFunction(llvm::MachineFunction&) (.part.0) MachineBlockPlacement.cpp:0:0
#13 0x00007eee425c4ca7 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
#14 0x00007eee40aba04c llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/lib/libLLVMCore.so.20.0git+0x2d404c)
#15 0x00007eee40aba479 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/lib/libLLVMCore.so.20.0git+0x2d4479)
#16 0x00007eee40ab9386 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/lib/libLLVMCore.so.20.0git+0x2d3386)
#17 0x00005bf9c5fe4906 compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
#18 0x00005bf9c5fda666 main (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/llc+0xf666)
#19 0x00007eee3fe29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#20 0x00007eee3fe29e40 call_init ./csu/../csu/libc-start.c:128:20
#21 0x00007eee3fe29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#22 0x00005bf9c5fdb1c5 _start (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/llc+0x101c5)
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/CodeGen/Generic/Output/switch-lower-feature.ll.script: line 1: 4043527 Aborted                 (core dumped) /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/CodeGen/Generic/switch-lower-feature.ll

--

********************


@llvm-ci
Collaborator

llvm-ci commented Aug 8, 2024

LLVM Buildbot has detected a new failure on builder llvm-nvptx64-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/2999

Here is the relevant piece of the build log for the reference:

Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/Generic/switch-lower-feature.ll' FAILED ********************
Exit Code: 134

Command Output (stderr):
--
RUN: at line 1: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/CodeGen/Generic/switch-lower-feature.ll
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/llc
llc: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/lib/CodeGen/MachineBlockPlacement.cpp:2869: void {anonymous}::MachineBlockPlacement::buildCFGChains(): Assertion `(!TII->analyzeBranch(*PrevBB, TBB, FBB, Cond) || !PrevBB->canFallThrough()) && "Unexpected block with un-analyzable fallthrough!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/llc
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'Branch Probability Basic Block Placement' on function '@test2'
 #0 0x00007fdbcb2527c0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/lib/libLLVMSupport.so.20.0git+0x1dc7c0)
 #1 0x00007fdbcb24fbdf llvm::sys::RunSignalHandlers() (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/lib/libLLVMSupport.so.20.0git+0x1d9bdf)
 #2 0x00007fdbcb24fd35 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007fdbcaa42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007fdbcaa969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x00007fdbcaa969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #6 0x00007fdbcaa969fc pthread_kill ./nptl/pthread_kill.c:89:10
 #7 0x00007fdbcaa42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #8 0x00007fdbcaa287f3 abort ./stdlib/abort.c:81:7
 #9 0x00007fdbcaa2871b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#10 0x00007fdbcaa39e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#11 0x00007fdbcd10193f (anonymous namespace)::MachineBlockPlacement::buildCFGChains() MachineBlockPlacement.cpp:0:0
#12 0x00007fdbcd101d24 (anonymous namespace)::MachineBlockPlacement::runOnMachineFunction(llvm::MachineFunction&) (.part.0) MachineBlockPlacement.cpp:0:0
#13 0x00007fdbcd173ca7 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
#14 0x00007fdbcb66904c llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/lib/libLLVMCore.so.20.0git+0x2d404c)
#15 0x00007fdbcb669479 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/lib/libLLVMCore.so.20.0git+0x2d4479)
#16 0x00007fdbcb668386 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/lib/libLLVMCore.so.20.0git+0x2d3386)
#17 0x000062df7c3b7906 compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
#18 0x000062df7c3ad666 main (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/llc+0xf666)
#19 0x00007fdbcaa29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#20 0x00007fdbcaa29e40 call_init ./csu/../csu/libc-start.c:128:20
#21 0x00007fdbcaa29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#22 0x000062df7c3ae1c5 _start (/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/llc+0x101c5)
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/CodeGen/Generic/Output/switch-lower-feature.ll.script: line 1: 28296 Aborted                 (core dumped) /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/llc < /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/CodeGen/Generic/switch-lower-feature.ll

--

********************


Artem-B added a commit that referenced this pull request Aug 8, 2024
AlexMaclean added a commit to AlexMaclean/llvm-project that referenced this pull request Aug 8, 2024
Add custom lowering for `BR_JT` DAG nodes to the `brx.idx` PTX
instruction ([PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx]
(https://docs.nvidia.com/cuda/parallel-thread-execution/#control-flow-instructions-brx-idx)).
Depending on the heuristics in DAG selection, `switch` statements may
now be lowered using `brx.idx`
AlexMaclean added a commit that referenced this pull request Aug 9, 2024
Add custom lowering for `BR_JT` DAG nodes to the `brx.idx` PTX
instruction ([PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx]
(https://docs.nvidia.com/cuda/parallel-thread-execution/#control-flow-instructions-brx-idx)).
Depending on the heuristics in DAG selection, `switch` statements may
now be lowered using `brx.idx`.

Note: this fixes the previous issue in #102400 by adding the isBarrier
attribute to BRX_END
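
One plausible reading of why isBarrier matters, stated here as an assumption rather than something the thread spells out: the assertion in the buildbot log fires when a block's branch cannot be analyzed and the block may still fall through. If an unanalyzable terminator is only treated as "cannot fall through" when it is marked as a barrier, then BRX_END without isBarrier trips the check. The sketch below is a simplified model of that condition, not the real MachineBlockPlacement or MachineBasicBlock logic; `Terminator` and the helper names are hypothetical.

```cpp
// Simplified model of a block's final instruction (not the real MachineInstr).
struct Terminator {
  bool analyzable; // assumed: whether the target's branch analysis understands it
  bool isBarrier;  // whether control never continues past it
};

// In LLVM, analyzeBranch returns true when it FAILS to analyze the branch.
bool analyzeBranchFails(const Terminator &t) { return !t.analyzable; }

// Assumed simplification: for an unanalyzable terminator, only a barrier
// rules out fallthrough.
bool canFallThrough(const Terminator &t) { return !t.isBarrier; }

// Mirrors the shape of the assertion quoted in the log:
//   !analyzeBranch(...) || !canFallThrough()
bool placementAssertionHolds(const Terminator &t) {
  return !analyzeBranchFails(t) || !canFallThrough(t);
}
```

Under this model, a terminator with `{analyzable = false, isBarrier = false}` violates the assertion, while setting `isBarrier = true` satisfies it, which is consistent with the fix described in the commit message.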
qiaojbao pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Aug 29, 2024
…af56bc521

Local branch amd-gfx 77eaf56 Merged main:4fe33d067c5d0894d0059418f09edc531f16ac9f into amd-gfx:5fa38fbc60f8
Remote branch main ba97697 [NVPTX] support switch statement with brx.idx (llvm#102400)