[llvm][ARM]Add widen global arrays pass #107120

Merged
merged 14 commits into from
Oct 17, 2024

nasherm
Contributor

@nasherm nasherm commented Sep 3, 2024

  • The pass optimizes memcpy calls by padding destinations and sources out to a full word, so that the backend generates full-word loads instead of single-byte (ldrb) and/or half-word (ldrh) loads. It pads the destination only when it is a stack-allocated, constant-size array, and the source only when it is a constant array. The heuristic that decides whether to pad is very basic and could be improved to cover more cases.
  • The pass runs within GlobalOpt and is disabled by default on all targets except ARM.

@llvmbot
Member

llvmbot commented Sep 3, 2024

@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Nashe Mncube (nasherm)

Changes
  • The pass optimizes memcpy calls by padding destinations and sources out to a full word, so that the ARM backend generates full-word loads instead of single-byte (ldrb) and/or half-word (ldrh) loads. It pads the destination only when it is a stack-allocated, constant-size array, and the source only when it is a constant string. The heuristic that decides whether to pad is very basic and could be improved to cover more cases.
  • The pass runs at the midend level instead of being added in the overridden method ARMPassConfig::addIRPasses(). This is because the addIRPasses passes run right at the end, just before the LLVM midend IR is lowered into SelectionDAG IR. The pass works better in the midend because other optimizations, such as dead code elimination, can run afterwards and delete the old unreferenced global string that has been replaced with the padded version. The other reason is that it makes writing the tests easier, as opt is able to run midend-level passes. Nonetheless, the pass checks whether it is being run on code targeting an ARM triple; if not, it does not run.
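
The size arithmetic behind the padding decision can be sketched in isolation. This is a minimal standalone illustration, not code from the patch: the helper name `widenedSize` and the convention of returning 0 for "skip" are assumptions made for the example, while the word size of 4 and the 64-byte inlining limit mirror the constants in ARMWidenStrings.cpp.

```cpp
#include <cstdint>

// Hypothetical helper (not in the patch) that mirrors the size checks in
// ARMWidenStrings::run().
constexpr uint64_t WordSize = 4;             // bytes per ARM word
constexpr uint64_t MemcpyInliningLimit = 64; // same value as in the patch

// Returns the padded byte count, or 0 when the memcpy would be left alone.
uint64_t widenedSize(uint64_t NumBytesToCopy) {
  if (NumBytesToCopy % WordSize == 0)
    return 0; // already a whole number of words, nothing to pad
  uint64_t NumBytesToPad = WordSize - (NumBytesToCopy % WordSize);
  uint64_t TotalBytes = NumBytesToCopy + NumBytesToPad;
  if (TotalBytes > MemcpyInliningLimit)
    return 0; // padded copy would exceed the memcpy inlining limit
  return TotalBytes;
}
```

Under these assumptions a 45-byte string widens to 48 bytes, while a 65-byte string is skipped because 68 exceeds the limit, which matches what the ptrtoint and more-than-64-bytes tests check for.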

Patch is 25.19 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107120.diff

13 Files Affected:

  • (added) llvm/include/llvm/Transforms/Scalar/ARMWidenStrings.h (+28)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
  • (modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+6)
  • (modified) llvm/lib/Passes/PassRegistry.def (+1)
  • (added) llvm/lib/Transforms/Scalar/ARMWidenStrings.cpp (+236)
  • (modified) llvm/lib/Transforms/Scalar/CMakeLists.txt (+1)
  • (added) llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-lengths-dont-match.ll (+29)
  • (added) llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-more-than-64-bytes.ll (+30)
  • (added) llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-ptrtoint.ll (+47)
  • (added) llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-struct-test.ll (+53)
  • (added) llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-test1.ll (+28)
  • (added) llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-test2.ll (+24)
  • (added) llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-volatile.ll (+30)
diff --git a/llvm/include/llvm/Transforms/Scalar/ARMWidenStrings.h b/llvm/include/llvm/Transforms/Scalar/ARMWidenStrings.h
new file mode 100755
index 00000000000000..d78f0219c03037
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Scalar/ARMWidenStrings.h
@@ -0,0 +1,28 @@
+//===- ARMWidenStrings.h --------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file provides the interface for the ArmWidenStrings pass
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_SCALAR_ARMWIDENSTRINGS_H
+#define LLVM_TRANSFORMS_SCALAR_ARMWIDENSTRINGS_H
+
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+class Module;
+
+struct ARMWidenStringsPass : PassInfoMixin<ARMWidenStringsPass> {
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &);
+};
+
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_SCALAR_ARMWIDENSTRINGS_H
\ No newline at end of file
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 1df1449fce597c..6b989231cb9861 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -207,6 +207,7 @@
 #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
 #include "llvm/Transforms/ObjCARC.h"
 #include "llvm/Transforms/Scalar/ADCE.h"
+#include "llvm/Transforms/Scalar/ARMWidenStrings.h"
 #include "llvm/Transforms/Scalar/AlignmentFromAssumptions.h"
 #include "llvm/Transforms/Scalar/AnnotationRemarks.h"
 #include "llvm/Transforms/Scalar/BDCE.h"
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 9c3d49cabbd38c..b75612c410f07d 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -80,6 +80,7 @@
 #include "llvm/Transforms/Instrumentation/PGOForceFunctionAttrs.h"
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
 #include "llvm/Transforms/Scalar/ADCE.h"
+#include "llvm/Transforms/Scalar/ARMWidenStrings.h"
 #include "llvm/Transforms/Scalar/AlignmentFromAssumptions.h"
 #include "llvm/Transforms/Scalar/AnnotationRemarks.h"
 #include "llvm/Transforms/Scalar/BDCE.h"
@@ -1513,6 +1514,11 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   // from the TargetLibraryInfo.
   OptimizePM.addPass(InjectTLIMappings());
 
+  bool IsARM = TM && TM->getTargetTriple().isARM();
+  // Optimizes memcpy by padding arrays to exploit alignment
+  if (IsARM && Level.getSizeLevel() == 0 && Level.getSpeedupLevel() > 1)
+    OptimizePM.addPass(ARMWidenStringsPass());
+
   addVectorPasses(Level, OptimizePM, /* IsFullLTO */ false);
 
   // LoopSink pass sinks instructions hoisted by LICM, which serves as a
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index d6067089c6b5c1..55566f43e5435d 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -489,6 +489,7 @@ FUNCTION_PASS("view-dom-only", DomOnlyViewer())
 FUNCTION_PASS("view-post-dom", PostDomViewer())
 FUNCTION_PASS("view-post-dom-only", PostDomOnlyViewer())
 FUNCTION_PASS("wasm-eh-prepare", WasmEHPreparePass())
+FUNCTION_PASS("arm-widen-strings", ARMWidenStringsPass())
 #undef FUNCTION_PASS
 
 #ifndef FUNCTION_PASS_WITH_PARAMS
diff --git a/llvm/lib/Transforms/Scalar/ARMWidenStrings.cpp b/llvm/lib/Transforms/Scalar/ARMWidenStrings.cpp
new file mode 100644
index 00000000000000..5a3c470861cf45
--- /dev/null
+++ b/llvm/lib/Transforms/Scalar/ARMWidenStrings.cpp
@@ -0,0 +1,236 @@
+// ARMWidenStrings.cpp - Widen strings to word boundaries to speed up
+// programs that use simple strcpy's with constant strings as source
+// and stack allocated array for destination.
+
+#define DEBUG_TYPE "arm-widen-strings"
+
+#include "llvm/Transforms/Scalar/ARMWidenStrings.h"
+#include "llvm/Analysis/LoopInfo.h"
+#include "llvm/IR/BasicBlock.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Operator.h"
+#include "llvm/IR/ValueSymbolTable.h"
+#include "llvm/Pass.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/TargetParser/Triple.h"
+#include "llvm/Transforms/Scalar.h"
+
+using namespace llvm;
+
+cl::opt<bool> DisableARMWidenStrings("disable-arm-widen-strings");
+
+namespace {
+
+class ARMWidenStrings {
+public:
+  /*
+  Max number of bytes that memcpy allows for lowering to load/stores before it
+  uses library function (__aeabi_memcpy).  This is the same value returned by
+  ARMSubtarget::getMaxInlineSizeThreshold which I would have called in place of
+  the constant int but can't get access to the subtarget info class from the
+  midend.
+  */
+  const unsigned int MemcpyInliningLimit = 64;
+
+  bool run(Function &F);
+};
+
+static bool IsCharArray(Type *t) {
+  const unsigned int CHAR_BIT_SIZE = 8;
+  return t && t->isArrayTy() && t->getArrayElementType()->isIntegerTy() &&
+         t->getArrayElementType()->getIntegerBitWidth() == CHAR_BIT_SIZE;
+}
+
+bool ARMWidenStrings::run(Function &F) {
+  if (DisableARMWidenStrings) {
+    return false;
+  }
+
+  if (!Triple(F.getParent()->getTargetTriple()).isARM()) {
+    LLVM_DEBUG(
+        dbgs() << "Pass only runs on ARM as hasn't been benchmarked on other "
+                  "targets\n");
+    return false;
+  }
+  LLVM_DEBUG(dbgs() << "Running ARMWidenStrings on function " << F.getName()
+                    << "\n");
+
+  for (Function::iterator b = F.begin(); b != F.end(); ++b) {
+    for (BasicBlock::iterator i = b->begin(); i != b->end(); ++i) {
+      CallInst *CI = dyn_cast<CallInst>(i);
+      if (!CI) {
+        continue;
+      }
+
+      Function *CallMemcpy = CI->getCalledFunction();
+      // find out if the current call instruction is a call to llvm memcpy
+      // intrinsics
+      if (CallMemcpy == NULL || !CallMemcpy->isIntrinsic() ||
+          CallMemcpy->getIntrinsicID() != Intrinsic::memcpy) {
+        continue;
+      }
+
+      LLVM_DEBUG(dbgs() << "Found call to strcpy/memcpy:\n" << *CI << "\n");
+
+      auto *Alloca = dyn_cast<AllocaInst>(CI->getArgOperand(0));
+      auto *SourceVar = dyn_cast<GlobalVariable>(CI->getArgOperand(1));
+      auto *BytesToCopy = dyn_cast<ConstantInt>(CI->getArgOperand(2));
+      auto *IsVolatile = dyn_cast<ConstantInt>(CI->getArgOperand(3));
+
+      if (!BytesToCopy) {
+        LLVM_DEBUG(dbgs() << "Number of bytes to copy is null\n");
+        continue;
+      }
+
+      uint64_t NumBytesToCopy = BytesToCopy->getZExtValue();
+
+      if (!Alloca) {
+        LLVM_DEBUG(dbgs() << "Destination isn't an Alloca\n");
+        continue;
+      }
+
+      if (!SourceVar) {
+        LLVM_DEBUG(dbgs() << "Source isn't a global constant variable\n");
+        continue;
+      }
+
+      if (!IsVolatile || IsVolatile->isOne()) {
+        LLVM_DEBUG(
+            dbgs() << "Not widening strings for this memcpy because it's "
+                      "a volatile operation\n");
+        continue;
+      }
+
+      if (NumBytesToCopy % 4 == 0) {
+        LLVM_DEBUG(dbgs() << "Bytes to copy in strcpy/memcpy is already word "
+                             "aligned so nothing to do here.\n");
+        continue;
+      }
+
+      if (!SourceVar->hasInitializer() || !SourceVar->isConstant() ||
+          !SourceVar->hasLocalLinkage() || !SourceVar->hasGlobalUnnamedAddr()) {
+        LLVM_DEBUG(dbgs() << "Source is not constant global, thus it's "
+                             "mutable therefore it's not safe to pad\n");
+        continue;
+      }
+
+      ConstantDataArray *SourceDataArray =
+          dyn_cast<ConstantDataArray>(SourceVar->getInitializer());
+      if (!SourceDataArray || !IsCharArray(SourceDataArray->getType())) {
+        LLVM_DEBUG(dbgs() << "Source isn't a constant data array\n");
+        continue;
+      }
+
+      if (!Alloca->isStaticAlloca()) {
+        LLVM_DEBUG(dbgs() << "Destination allocation isn't a static "
+                             "constant which is locally allocated in this "
+                             "function, so skipping.\n");
+        continue;
+      }
+
+      // Make sure destination is definitely a char array.
+      if (!IsCharArray(Alloca->getAllocatedType())) {
+        LLVM_DEBUG(dbgs() << "Destination doesn't look like a constant char (8 "
+                             "bits) array\n");
+        continue;
+      }
+      LLVM_DEBUG(dbgs() << "With Alloca: " << *Alloca << "\n");
+
+      uint64_t DZSize = Alloca->getAllocatedType()->getArrayNumElements();
+      uint64_t SZSize = SourceDataArray->getType()->getNumElements();
+
+      // For safety purposes let's add a constraint and only pad when
+      // num bytes to copy == destination array size == source string
+      // which is a constant
+      LLVM_DEBUG(dbgs() << "Number of bytes to copy is: " << NumBytesToCopy
+                        << "\n");
+      LLVM_DEBUG(dbgs() << "Size of destination array is: " << DZSize << "\n");
+      LLVM_DEBUG(dbgs() << "Size of source array is: " << SZSize << "\n");
+      if (NumBytesToCopy != DZSize || DZSize != SZSize) {
+        LLVM_DEBUG(dbgs() << "Size of number of bytes to copy, destination "
+                             "array and source string don't match, so "
+                             "skipping\n");
+        continue;
+      }
+      LLVM_DEBUG(dbgs() << "Going to widen.\n");
+      unsigned int NumBytesToPad = 4 - (NumBytesToCopy % 4);
+      LLVM_DEBUG(dbgs() << "Number of bytes to pad by is " << NumBytesToPad
+                        << "\n");
+      unsigned int TotalBytes = NumBytesToCopy + NumBytesToPad;
+
+      if (TotalBytes > MemcpyInliningLimit) {
+        LLVM_DEBUG(
+            dbgs() << "Not going to pad because total number of bytes is "
+                   << TotalBytes
+                   << " which would be greater than the inlining "
+                      "limit for memcpy which is "
+                   << MemcpyInliningLimit << "\n");
+        continue;
+      }
+
+      // update destination char array to be word aligned (memcpy(X,...,...))
+      IRBuilder<> BuildAlloca(Alloca);
+      AllocaInst *NewAlloca = cast<AllocaInst>(BuildAlloca.CreateAlloca(
+          ArrayType::get(Alloca->getAllocatedType()->getArrayElementType(),
+                         NumBytesToCopy + NumBytesToPad)));
+      NewAlloca->takeName(Alloca);
+      NewAlloca->setAlignment(Alloca->getAlign());
+      Alloca->replaceAllUsesWith(NewAlloca);
+
+      LLVM_DEBUG(dbgs() << "Updating users of destination stack object to use "
+                        << "new size\n");
+
+      // update source to be word aligned (memcpy(...,X,...))
+      // create replacement string with padded null bytes.
+      StringRef Data = SourceDataArray->getRawDataValues();
+      std::vector<uint8_t> StrData(Data.begin(), Data.end());
+      for (unsigned int p = 0; p < NumBytesToPad; p++)
+        StrData.push_back('\0');
+      auto Arr = ArrayRef(StrData.data(), TotalBytes);
+
+      // create new padded version of global variable string.
+      Constant *SourceReplace = ConstantDataArray::get(F.getContext(), Arr);
+      GlobalVariable *NewGV = new GlobalVariable(
+          *F.getParent(), SourceReplace->getType(), true,
+          SourceVar->getLinkage(), SourceReplace, SourceReplace->getName());
+
+      // copy any other attributes from original global variable string
+      // e.g. unnamed_addr
+      NewGV->copyAttributesFrom(SourceVar);
+      NewGV->takeName(SourceVar);
+
+      // replace intrinsic source.
+      CI->setArgOperand(1, NewGV);
+
+      // Update number of bytes to copy (memcpy(...,...,X))
+      CI->setArgOperand(2,
+                        ConstantInt::get(BytesToCopy->getType(), TotalBytes));
+      LLVM_DEBUG(dbgs() << "Padded dest/source and increased number of bytes:\n"
+                        << *CI << "\n"
+                        << *NewAlloca << "\n");
+    }
+  }
+  return true;
+}
+
+} // end of anonymous namespace
+
+PreservedAnalyses ARMWidenStringsPass::run(Function &F,
+                                           FunctionAnalysisManager &AM) {
+  bool Changed = ARMWidenStrings().run(F);
+  if (!Changed)
+    return PreservedAnalyses::all();
+
+  PreservedAnalyses Preserved;
+  Preserved.preserveSet(CFGAnalyses::ID());
+  Preserved.preserve<LoopAnalysis>();
+  return Preserved;
+}
diff --git a/llvm/lib/Transforms/Scalar/CMakeLists.txt b/llvm/lib/Transforms/Scalar/CMakeLists.txt
index 939a1457239567..a9607e4ebc6583 100644
--- a/llvm/lib/Transforms/Scalar/CMakeLists.txt
+++ b/llvm/lib/Transforms/Scalar/CMakeLists.txt
@@ -2,6 +2,7 @@ add_llvm_component_library(LLVMScalarOpts
   ADCE.cpp
   AlignmentFromAssumptions.cpp
   AnnotationRemarks.cpp
+  ARMWidenStrings.cpp
   BDCE.cpp
   CallSiteSplitting.cpp
   ConstantHoisting.cpp
diff --git a/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-lengths-dont-match.ll b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-lengths-dont-match.ll
new file mode 100644
index 00000000000000..a34ddc2ae2a29a
--- /dev/null
+++ b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-lengths-dont-match.ll
@@ -0,0 +1,29 @@
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -O2 -S | FileCheck %s
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -passes="default<O2>" -S | FileCheck %s
+target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv6m-arm-none-eabi"
+
+; CHECK: [17 x i8]
+@.str = private unnamed_addr constant [17 x i8] c"aaaaaaaaaaaaaaaa\00", align 1
+
+; Function Attrs: nounwind
+define hidden void @foo() local_unnamed_addr #0 {
+entry:
+  %something = alloca [20 x i8], align 1
+  call void @llvm.lifetime.start(i64 20, ptr nonnull %something) #3
+  call void @llvm.memcpy.p0i8.p0i8.i32(ptr align 1 nonnull %something, ptr align 1 @.str, i32 17, i1 false)
+  %call2 = call i32 @bar(ptr nonnull %something) #3
+  call void @llvm.lifetime.end(i64 20, ptr nonnull %something) #3
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.start(i64, ptr nocapture) #1
+
+declare i32 @bar(...) local_unnamed_addr #2
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.end(i64, ptr nocapture) #1
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.memcpy.p0i8.p0i8.i32(ptr nocapture writeonly, ptr nocapture readonly, i32, i1) #1
diff --git a/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-more-than-64-bytes.ll b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-more-than-64-bytes.ll
new file mode 100644
index 00000000000000..15c196b62bc9b2
--- /dev/null
+++ b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-more-than-64-bytes.ll
@@ -0,0 +1,30 @@
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -O3 -S | FileCheck %s
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -passes="default<O3>" -S | FileCheck %s
+target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv6m-arm-none-eabi"
+
+; CHECK: [65 x i8]
+; CHECK-NOT: [68 x i8]
+@.str = private unnamed_addr constant [65 x i8] c"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzz\00", align 1
+
+; Function Attrs: nounwind
+define hidden void @foo() local_unnamed_addr #0 {
+entry:
+  %something = alloca [65 x i8], align 1
+  call void @llvm.lifetime.start(i64 65, ptr nonnull %something) #3
+  call void @llvm.memcpy.p0i8.p0i8.i32(ptr align 1 nonnull %something, ptr align 1 @.str, i32 65, i1 false)
+  %call2 = call i32 @bar(ptr nonnull %something) #3
+  call void @llvm.lifetime.end(i64 65, ptr nonnull %something) #3
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.start(i64, ptr nocapture) #1
+
+declare i32 @bar(...) local_unnamed_addr #2
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.end(i64, ptr nocapture) #1
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.memcpy.p0i8.p0i8.i32(ptr nocapture writeonly, ptr nocapture readonly, i32, i1) #1
diff --git a/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-ptrtoint.ll b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-ptrtoint.ll
new file mode 100644
index 00000000000000..b4cb1beee92535
--- /dev/null
+++ b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-ptrtoint.ll
@@ -0,0 +1,47 @@
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -O2 -S | FileCheck %s
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -passes="default<O2>" -S | FileCheck %s
+target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv7m-arm-none-eabi"
+
+; This test uses ptrtoint, but still should be handled correctly.
+; The [45 x i8] string should be optimised away (i.e unused)
+; CHECK: [48 x i8]
+; CHECK-NOT: [45 x i8]
+@f.string1 = private unnamed_addr constant [45 x i8] c"The quick brown dog jumps over the lazy fox.\00", align 1
+
+; Function Attrs: nounwind
+define hidden i32 @f() {
+entry:
+  %string1 = alloca [45 x i8], align 1
+  %pos = alloca i32, align 4
+  %token = alloca ptr, align 4
+  call void @llvm.lifetime.start.p0i8(i64 45, ptr %string1)
+  call void @llvm.memcpy.p0i8.p0i8.i32(ptr align 1 %string1, ptr align 1 @f.string1, i32 45, i1 false)
+  call void @llvm.lifetime.start.p0i8(i64 4, ptr %pos)
+  call void @llvm.lifetime.start.p0i8(i64 4, ptr %token)
+  %call = call ptr @strchr(ptr %string1, i32 101)
+  store ptr %call, ptr %token, align 4
+  %0 = load ptr, ptr %token, align 4
+  %sub.ptr.lhs.cast = ptrtoint ptr %0 to i32
+  %sub.ptr.rhs.cast = ptrtoint ptr %string1 to i32
+  %sub.ptr.sub = sub i32 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
+  %add = add nsw i32 %sub.ptr.sub, 1
+  store i32 %add, ptr %pos, align 4
+  %1 = load i32, ptr %pos, align 4
+  call void @llvm.lifetime.end.p0i8(i64 4, ptr %token)
+  call void @llvm.lifetime.end.p0i8(i64 4, ptr %pos)
+  call void @llvm.lifetime.end.p0i8(i64 45, ptr %string1)
+  ret i32 %1
+}
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.start.p0i8(i64, ptr nocapture)
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.memcpy.p0i8.p0i8.i32(ptr nocapture writeonly, ptr nocapture readonly, i32, i1)
+
+; Function Attrs: nounwind
+declare ptr @strchr(ptr, i32)
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.end.p0i8(i64, ptr nocapture)
diff --git a/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-struct-test.ll b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-struct-test.ll
new file mode 100644
index 00000000000000..b852944c3f876f
--- /dev/null
+++ b/llvm/test/Transforms/ARMWidenStrings/arm-widen-strings-struct-test.ll
@@ -0,0 +1,53 @@
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -O3 -S | FileCheck %s
+; RUN: opt < %s -mtriple=arm-arm-none-eabi -passes="default<O3>" -S | FileCheck %s
+target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv6m-arm-none-eabi"
+
+%struct.P = type { i32, [13 x i8] }
+
+; CHECK-NOT: [16 x i8]
+@.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00", align 1
+@.str.1 = private unnamed_addr constant [4 x i8] c"%s\0A\00", align 1
+@__ARM_use_no_argv = global i32 1, section ".ARM.use_no_argv", align 4
+@llvm.used = appending global [1 x ptr] [ptr @__ARM_use_no_argv], section "llvm.metadata"
+
+; Function Attrs: nounwind
+define hidden i32 @main() local_unnamed_addr #0 {
+entry:
+  %p = alloca %struct.P, align 4
+  call void @llvm.lifetime.start(i64 20, ptr nonnull %p) #2
+  store i32 10, ptr %p, align 4, !tbaa !3
+  %arraydecay = getelementptr inbounds %struct.P, ptr %p, i32 0, i32 1, i32 0
+  call void @llvm.memcpy.p0i8.p0i8.i32(ptr align 1 %arraydecay, ptr align 1 @.str, i32 13, i1 false)
+  %puts = call i32 @puts(ptr %arraydecay)
+  call void @llvm.lifetime.end(i64 20, ptr nonnull %p) #2
+  ret i32 0
+}
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.start(i64, ptr nocapture) #1
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.lifetime.end(i64, ptr nocapture) #1
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.memcpy.p0i8.p0i8.i32(ptr nocapture writeonly, ptr noca...
[truncated]

@aeubanks
Contributor

aeubanks commented Sep 3, 2024

if this is actually arm-specific, please use the registerPassBuilderCallbacks framework

some performance numbers in the description would be helpful

adding some arm people

@aeubanks aeubanks requested a review from davemgreen September 3, 2024 17:58
@aeubanks
Contributor

aeubanks commented Sep 3, 2024

also I would split PRs to implement the pass and add it to the pipeline

@davemgreen
Collaborator

I believe there was talk a long time ago about adding this to an existing pass such as the GlobalOpts pass or CGP. It sounds like CGP is too late for it, could it be a part of GlobalOpt or some other pass?

@nasherm
Contributor Author

nasherm commented Sep 4, 2024

> if this is actually arm-specific, please use the registerPassBuilderCallbacks framework
>
> some performance numbers in the description would be helpful

I intend to rework the patch to make use of this and benchmark

> also I would split PRs to implement the pass and add it to the pipeline

Sure, no problem

@efriedma-quic
Collaborator

Can you give a brief example of Arm asm before/after this optimization?

I suspect this generalizes to other targets, at least in some cases.

Is there some reason we can't pad globals that aren't strings?

Padding out strings probably affects string merging in the linker, so the codesize tradeoff here is sort of hard to compute.

@nasherm nasherm force-pushed the nashe/widen-strings branch from b3bca66 to cc8bf21 Compare September 9, 2024 14:32
@nasherm
Contributor Author

nasherm commented Sep 9, 2024

I've reduced this patch down to adding the pass, as well as tests, without enabling it.

With respect to performance gain I've seen a jump of around 1% on some of our benchmarks.

I used the following (truncated) IR to show the difference in generated assembly

# example.ll
@.str = private unnamed_addr constant [10 x i8] c"123456789\00", align 1

; Function Attrs: nounwind
define hidden void @foo() #0 {
entry:
  %something = alloca [10 x i8], align 1
  %arraydecay = getelementptr inbounds [10 x i8], ptr %something, i32 0, i32 0
  %call = call ptr @strcpy(ptr %arraydecay, ptr @.str)
  %arraydecay1 = getelementptr inbounds [10 x i8], ptr %something, i32 0, i32 0
  %call2 = call i32 @bar(ptr %arraydecay1)
  ret void
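
For orientation, IR of this shape is what a compiler front end would typically emit for source along the following lines. This is a hypothetical reconstruction, not the author's actual test program: the body of `bar` and the `LastLen` variable are stand-ins invented so the snippet is self-contained.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for the external @bar in the IR (assumption): it only records the
// length of the string it was handed.
static std::size_t LastLen = 0;
int bar(const char *S) {
  LastLen = std::strlen(S);
  return 0;
}

void foo() {
  char something[10];                  // the [10 x i8] alloca
  std::strcpy(something, "123456789"); // copies 10 bytes including the NUL
  bar(something);
}
```

Because the source string is a constant of known length, the strcpy can be turned into a fixed-size memcpy, which is the call the widening pass then inspects.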

Optimization off

$ opt example.ll -O2 -S | llc -mtriple=arm-arm-none-eabi -o -
..........
foo:
	.fnstart
@ %bb.0:                                @ %entry
	.save	{r4, lr}
	push	{r4, lr}
	.pad	#24
	sub	sp, sp, #24
	ldr	r12, .LCPI0_0
	add	r0, sp, #4
	mov	lr, r0
	ldm	r12!, {r1, r2, r3, r4}
	stm	lr!, {r1, r2, r3, r4}
	ldrb	r1, [r12]
	strb	r1, [lr]
	bl	bar
	add	sp, sp, #24
	pop	{r4, lr}
	mov	pc, lr
	.p2align	2
@ %bb.1:
.LCPI0_0:
	.long	.L.str
.Lfunc_end0:
	.size	foo, .Lfunc_end0-foo
	.fnend
                                        @ -- End function
	.type	.L.str,%object                  @ @.str
	.section	.rodata.str1.4,"aMS",%progbits,1
	.p2align	2, 0x0
.L.str:
	.asciz	"1234567891234567"
	.size	.L.str, 17

	.section	".note.GNU-stack","",%progbits
	.eabi_attribute	30, 1	@ Tag_ABI_optimization_goals

Optimization on

$ opt example.ll  -passes="default<O2>,arm-widen-strings" -S | llc -mtriple=arm-arm-none-eabi -o -
foo:
	.fnstart
@ %bb.0:                                @ %entry
	.save	{r4, r5, r11, lr}
	push	{r4, r5, r11, lr}
	.pad	#40
	sub	sp, sp, #40
	ldr	r12, .LCPI0_0
	add	r0, sp, #20
	mov	r2, r0
	ldm	r12, {r1, r3, r4, r5, lr}
	stm	r2, {r1, r3, r4, r5, lr}
	bl	bar
	add	sp, sp, #40
	pop	{r4, r5, r11, lr}
	mov	pc, lr
	.p2align	2
@ %bb.1:
.LCPI0_0:
	.long	.L.str
.Lfunc_end0:
	.size	foo, .Lfunc_end0-foo
	.fnend
                                        @ -- End function
	.type	.L__unnamed_1,%object           @ @0
	.section	.rodata.str1.1,"aMS",%progbits,1
.L__unnamed_1:
	.asciz	"1234567891234567"
	.size	.L__unnamed_1, 17

	.type	.L.str,%object                  @ @.str
	.section	.rodata,"a",%progbits
	.p2align	2, 0x0
.L.str:
	.asciz	"1234567891234567\000\000\000"
	.size	.L.str, 20

	.section	".note.GNU-stack","",%progbits
	.eabi_attribute	30, 1	@ Tag_ABI_optimization_goals

Diff of assembly for readability

24,27c24,27
< 	.save	{r4, lr}
< 	push	{r4, lr}
< 	.pad	#24
< 	sub	sp, sp, #24
---
> 	.save	{r4, r5, r11, lr}
> 	push	{r4, r5, r11, lr}
> 	.pad	#40
> 	sub	sp, sp, #40
29,34c29,32
< 	add	r0, sp, #4
< 	mov	lr, r0
< 	ldm	r12!, {r1, r2, r3, r4}
< 	stm	lr!, {r1, r2, r3, r4}
< 	ldrb	r1, [r12]
< 	strb	r1, [lr]
---
> 	add	r0, sp, #20
> 	mov	r2, r0
> 	ldm	r12, {r1, r3, r4, r5, lr}
> 	stm	r2, {r1, r3, r4, r5, lr}
36,37c34,35
< 	add	sp, sp, #24
< 	pop	{r4, lr}
---
> 	add	sp, sp, #40
> 	pop	{r4, r5, r11, lr}
46a45,50
> 	.type	.L__unnamed_1,%object           @ @0
> 	.section	.rodata.str1.1,"aMS",%progbits,1
> .L__unnamed_1:
> 	.asciz	"1234567891234567"
> 	.size	.L__unnamed_1, 17
> 
48c52
< 	.section	.rodata.str1.4,"aMS",%progbits,1
---
> 	.section	.rodata,"a",%progbits
51,52c55,56
< 	.asciz	"1234567891234567"
< 	.size	.L.str, 17
---
> 	.asciz	"1234567891234567\000\000\000"
> 	.size	.L.str, 20

@nasherm
Contributor Author

nasherm commented Sep 9, 2024

> Is there some reason we can't pad globals that aren't strings?

I don't think so? But there might be a reason this wasn't investigated. The work in this patch was originally authored by someone no longer at Arm

Collaborator

@efriedma-quic efriedma-quic left a comment

Probably worth investigating if we can fit this easily into some pass that's already examining the uses of globals, like GlobalOpt; iterating over the whole module isn't cheap.

@nasherm
Contributor Author

nasherm commented Sep 11, 2024

My most recent patch addresses the comments.

> Probably worth investigating if we can fit this easily into some pass that's already examining the uses of globals, like GlobalOpt; iterating over the whole module isn't cheap.

I've had a look at GlobalOpt briefly and I have a few questions: if this pass were added, wouldn't the investigation also include seeing whether it improves performance on other targets? I can see restricting the pass to ARM cores, but it seems like that would make it a poor fit for GlobalOpt. Is there something I'm missing?

github-actions bot commented Sep 11, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@efriedma-quic
Collaborator

If we're going to make this transform target-independent, we'll need some target-specific tuning from TargetTransformInfo or something like that. Even if the transform is profitable, the exact thresholds where it's profitable are likely to be different. (The maximum size of the global where it's relevant, and whether the best alignment boundary is 2/4/8/16 bytes, is going to vary.)

Not sure we need extensive performance measurements for other targets... if you could get measurements for some big x86 or Arm64 core, that would be nice. But you can basically see what happens on other targets by just compiling a simple example. And if we have a TTI hook, targets could easily opt-out.
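
To make the tuning idea concrete, here is a rough sketch of how per-target thresholds could feed the padding computation. Everything in it is an assumption for illustration: `WidenTuning` and `paddingFor` are made-up names, not an existing TargetTransformInfo interface, and a boundary of 0 is used here simply as a way to model a target opting out.

```cpp
#include <cstdint>

// Hypothetical per-target knobs; not an existing TargetTransformInfo API.
struct WidenTuning {
  uint64_t Boundary; // alignment boundary in bytes (2/4/8/16); 0 means opt out
  uint64_t MaxSize;  // largest padded size for which widening is worthwhile
};

// Returns how many padding bytes to append, or 0 when nothing should change.
uint64_t paddingFor(uint64_t Size, const WidenTuning &T) {
  if (T.Boundary == 0)
    return 0; // target opted out of the transform
  uint64_t Rem = Size % T.Boundary;
  if (Rem == 0)
    return 0; // size is already a multiple of the boundary
  uint64_t Pad = T.Boundary - Rem;
  return (Size + Pad > T.MaxSize) ? 0 : Pad;
}
```

With a 4-byte boundary and a 64-byte cap this reproduces the ARM behaviour discussed in this PR (a 10-byte array gets 2 bytes of padding), whereas a target preferring an 8-byte boundary would pad the same array by 6 bytes.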

@nasherm
Contributor Author

nasherm commented Sep 13, 2024

I've rewritten the pass to be platform independent and added it to GlobalOpt. By default it's disabled for all targets except ARM.

@nasherm
Contributor Author

nasherm commented Sep 13, 2024

I haven't had a chance to investigate performance on AArch64 or x86 machines and will not be able to until next week

@nasherm nasherm changed the title [llvm][ARM]Add ARM widen strings pass [llvm][ARM]Add widen strings pass Sep 13, 2024
The case in which copying from a global
source to a global dest wasn't handled and
caused opt to crash. This is now handled and
a new test has been added to check

Change-Id: Ieb0467797fcee888f6e95e68af4dac9c05d70a4d
Change-Id: I029312362f9dd714b2e9bc206cc002883d761b8b
Change-Id: Idc7b14cc785eb88552dd72947eb0df128baa7e90
- Removed handling of global variable destinations. We simply
  don't pad these for now
- Added check that destination array is an array type and added
  test.

Change-Id: Ifc53051952ef69c4af64827402baf7d69cab4824
Collaborator

@davemgreen davemgreen left a comment

Thanks for the updates. I ran some tests and as far as I can tell they ran OK now. LGTM if there are no other comments.

@nasherm nasherm force-pushed the nashe/widen-strings branch from bbe246e to 86ee9ad Compare October 16, 2024 10:51
Change-Id: Iad0539e526fb0fc116217dcbd033f8297fa5ef5f
@nasherm nasherm force-pushed the nashe/widen-strings branch from 86ee9ad to 2815d59 Compare October 16, 2024 10:53
- Added test showing behaviour of attempting to
  widen non-const globals
- Refactoring

Change-Id: I566214331bf3d889bd1409d3148aa6eab2530ed5
@nasherm nasherm merged commit ab90d27 into llvm:main Oct 17, 2024
8 checks passed
@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder llvm-nvptx64-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/6882

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-string-multi-use.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt < /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-string-multi-use.ll -mtriple=arm-none-eabi -passes=globalopt -S | /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-string-multi-use.ll
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-string-multi-use.ll
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-string-multi-use.ll:9:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING2:%.*]] = alloca [4 x i8], align 1
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:7: note: possible intended match here
 %something = alloca [3 x i8], align 1
      ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-string-multi-use.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: ; ModuleID = '<stdin>' 
          2: source_filename = "<stdin>" 
          3: target triple = "arm-unknown-none-eabi" 
          4:  
          5: @.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1 
          6:  
          7: define void @memcpy_multiple() local_unnamed_addr { 
          8: entry: 
next:9'0           X error: no match found
          9:  %something = alloca [3 x i8], align 1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:9'1           ?                                 possible intended match
         10:  %something1 = alloca [3 x i8], align 1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         11:  %something2 = alloca [3 x i8], align 1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         12:  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(3) %something, ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 3, i1 false) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         13:  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(3) %something1, ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 3, i1 false) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         14:  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(3) %something2, ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 3, i1 false) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          .
...
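
The log above suggests why the test fails on this builder: `opt` reports that it cannot create a target machine for `arm-unknown-none-eabi`, so presumably the ARM backend was not built, the ARM-specific tuning is unavailable, and the alloca stays `[3 x i8]`. LLVM test directories that depend on a particular backend are commonly gated with a `lit.local.cfg` that checks `config.root.targets`. The sketch below simulates that idiom in plain Python so it can be run on its own. The helper `gate_on_target` and the `SimpleNamespace` stand-ins are hypothetical, and the thread does not say whether the failures were ultimately resolved this way.

```python
from types import SimpleNamespace


def gate_on_target(config, target):
    # In a real lit.local.cfg the check is written inline, for example:
    #     if not "ARM" in config.root.targets:
    #         config.unsupported = True
    # Here it is wrapped in a function so it can be exercised standalone.
    if target not in config.root.targets:
        config.unsupported = True


# Stand-ins for the config object lit passes to lit.local.cfg (assumption:
# only the two attributes used by the idiom are modelled).
without_arm = SimpleNamespace(
    root=SimpleNamespace(targets=frozenset({"NVPTX", "X86"})),
    unsupported=False,
)
with_arm = SimpleNamespace(
    root=SimpleNamespace(targets=frozenset({"ARM", "X86"})),
    unsupported=False,
)

gate_on_target(without_arm, "ARM")
gate_on_target(with_arm, "ARM")
print(without_arm.unsupported, with_arm.unsupported)  # prints: True False
```

With such a gate in place, builders that do not build the ARM backend would report the tests as unsupported instead of failing them.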

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-fuzzer running on sanitizer-buildbot12 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/159/builds/8352

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
  The OLD behavior for policy CMP0116 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
  CMakeLists.txt:6 (include)


-- Building with -fPIC
-- LLVM host triple: aarch64-unknown-linux-gnu
-- LLVM default target triple: aarch64-unknown-linux-gnu
-- Using libunwind testing configuration: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/libunwind/test/configs/llvm-libunwind-shared.cfg.in
-- Failed to locate sphinx-build executable (missing: SPHINX_EXECUTABLE) 
-- Using libc++abi testing configuration: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/libcxxabi/test/configs/llvm-libc++abi-shared.cfg.in
-- Using libc++ testing configuration: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/libcxx/test/configs/llvm-libc++-shared.cfg.in
-- ABI list file not generated for configuration aarch64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew, `check-cxx-abilist` will not be available.
CMake Deprecation Warning at /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/cmake/Modules/CMakePolicy.cmake:6 (cmake_policy):
  The OLD behavior for policy CMP0116 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
  /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/compiler-rt/CMakeLists.txt:12 (include)


-- Compiler-RT supported architectures: aarch64
-- Generated Sanitizer SUPPORTED_TOOLS list on "Linux" is "asan;lsan;hwasan;msan;tsan;ubsan"
-- sanitizer_common tests on "Linux" will run against "asan;lsan;hwasan;msan;tsan;ubsan"
-- Configuring done (2.1s)
-- Generating done (0.3s)
-- Build files have been written to: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/runtimes/runtimes-bins
[466/470] Linking CXX executable bin/c-index-test
[467/470] Performing build step for 'runtimes'
[0/7] Performing build step for 'libcxx_fuzzer_aarch64'
ninja: no work to do.
[3/7] Building CXX object compiler-rt/lib/hwasan/CMakeFiles/RTHwasan_dynamic_version_script_dummy.aarch64.dir/dummy.cpp.o
[5/7] Linking CXX shared library /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/lib/clang/20/lib/aarch64-unknown-linux-gnu/libclang_rt.ubsan_standalone.so
[6/7] Linking CXX shared library /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/lib/clang/20/lib/aarch64-unknown-linux-gnu/libclang_rt.hwasan.so
[7/7] Linking CXX shared library /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/lib/clang/20/lib/aarch64-unknown-linux-gnu/libclang_rt.asan.so
[468/470] No install step for 'runtimes'
[470/470] Completed 'runtimes'
1781e80fb71590d5b35304f7dc780f1e  llvm_build0/bin/clang
@@@BUILD_STEP get fuzzer-test-suite @@@
fatal: unable to access 'https://github.com/google/fuzzer-test-suite.git/': Recv failure: Connection reset by peer
Step 7 (stage1 build all) failure: stage1 build all (failure)
...
[462/470] Completed 'builtins'
[463/470] Clobbering bootstrap build and stamp directories
[463/470] Performing configure step for 'runtimes'
Not searching for unused variables given on the command line.
loading initial cache file /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/projects/runtimes/tmp/runtimes-cache-Release.cmake
CMake Deprecation Warning at /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/cmake/Modules/CMakePolicy.cmake:6 (cmake_policy):
  The OLD behavior for policy CMP0116 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
  CMakeLists.txt:6 (include)
-- Building with -fPIC
-- LLVM host triple: aarch64-unknown-linux-gnu
-- LLVM default target triple: aarch64-unknown-linux-gnu
-- Using libunwind testing configuration: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/libunwind/test/configs/llvm-libunwind-shared.cfg.in
-- Failed to locate sphinx-build executable (missing: SPHINX_EXECUTABLE) 
-- Using libc++abi testing configuration: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/libcxxabi/test/configs/llvm-libc++abi-shared.cfg.in
-- Using libc++ testing configuration: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/libcxx/test/configs/llvm-libc++-shared.cfg.in
-- ABI list file not generated for configuration aarch64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew, `check-cxx-abilist` will not be available.
CMake Deprecation Warning at /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/cmake/Modules/CMakePolicy.cmake:6 (cmake_policy):
  The OLD behavior for policy CMP0116 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
  /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm-project/compiler-rt/CMakeLists.txt:12 (include)
-- Compiler-RT supported architectures: aarch64
-- Generated Sanitizer SUPPORTED_TOOLS list on "Linux" is "asan;lsan;hwasan;msan;tsan;ubsan"
-- sanitizer_common tests on "Linux" will run against "asan;lsan;hwasan;msan;tsan;ubsan"
-- Configuring done (2.1s)
-- Generating done (0.3s)
-- Build files have been written to: /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/runtimes/runtimes-bins
[466/470] Linking CXX executable bin/c-index-test
[467/470] Performing build step for 'runtimes'
[0/7] Performing build step for 'libcxx_fuzzer_aarch64'
ninja: no work to do.
[3/7] Building CXX object compiler-rt/lib/hwasan/CMakeFiles/RTHwasan_dynamic_version_script_dummy.aarch64.dir/dummy.cpp.o
[5/7] Linking CXX shared library /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/lib/clang/20/lib/aarch64-unknown-linux-gnu/libclang_rt.ubsan_standalone.so
[6/7] Linking CXX shared library /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/lib/clang/20/lib/aarch64-unknown-linux-gnu/libclang_rt.hwasan.so
[7/7] Linking CXX shared library /home/b/sanitizer-aarch64-linux-fuzzer/build/llvm_build0/lib/clang/20/lib/aarch64-unknown-linux-gnu/libclang_rt.asan.so
[468/470] No install step for 'runtimes'
[470/470] Completed 'runtimes'
1781e80fb71590d5b35304f7dc780f1e  llvm_build0/bin/clang

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-ubuntu-fast running on sie-linux-worker while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/9579

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-strings-2.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/opt < /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-2.ll -mtriple=arm-none-eabi -passes=globalopt -S | /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-2.ll
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-2.ll
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-2.ll:9:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING:%.*]] = alloca [64 x i8], align 1
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:7: note: possible intended match here
 %something = alloca [62 x i8], align 1
      ^

Input file: <stdin>
Check file: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-2.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '<stdin>' 
           2: source_filename = "<stdin>" 
           3: target triple = "arm-unknown-none-eabi" 
           4:  
           5: @.str = private unnamed_addr constant [62 x i8] c"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\00", align 1 
           6:  
           7: define void @foo() local_unnamed_addr { 
label:7'0     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
label:7'1     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           8: entry: 
next:8'0      ^~~~~~
next:8'1      ^~~~~~  captured var "ENTRY"
next:9'0            X error: no match found
           9:  %something = alloca [62 x i8], align 1 
next:9'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:9'1            ?                                  possible intended match
          10:  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(62) %something, ptr noundef nonnull align 1 dereferenceable(62) @.str, i32 62, i1 false) 
next:9'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          11:  %call2 = call i32 @bar(ptr nonnull %something) 
next:9'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          12:  ret void 
next:9'0      ~~~~~~~~~~
          13: } 
...

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder clang-ve-ninja running on hpce-ve-main while building llvm at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/12/builds/7865

Here is the relevant piece of the build log for the reference
Step 4 (annotate) failure: 'python ../llvm-zorg/zorg/buildbot/builders/annotated/ve-linux.py ...' (failure)
...
[662/663] Running the LLVM regression tests
Unknown option: -C
usage: git [--version] [--help] [-c name=value]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p|--paginate|--no-pager] [--no-replace-objects] [--bare]
           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
           <command> [<args>]
An error occurred retrieving the git revision: Command '['git', '-C', '/scratch/buildbot/bothome/clang-ve-ninja/llvm-project/llvm', 'rev-parse', 'HEAD']' returned non-zero exit status 129.
-- Testing: 55770 tests, 48 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: LLVM :: Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll (40833 of 55770)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /scratch/buildbot/bothome/clang-ve-ninja/build/build_llvm/bin/opt < /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll -mtriple=arm-none-eabi -passes=globalopt -S | /scratch/buildbot/bothome/clang-ve-ninja/build/build_llvm/bin/FileCheck /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll
+ /scratch/buildbot/bothome/clang-ve-ninja/build/build_llvm/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
+ /scratch/buildbot/bothome/clang-ve-ninja/build/build_llvm/bin/FileCheck /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll
/scratch/buildbot/bothome/clang-ve-ninja/build/build_llvm/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/scratch/buildbot/bothome/clang-ve-ninja/build/build_llvm/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/scratch/buildbot/bothome/clang-ve-ninja/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll:9:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING1:%.*]] = alloca [6 x i16], align 1
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:7: note: possible intended match here
 %something = alloca [5 x i16], align 1
      ^

Input file: <stdin>
Check file: /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: ; ModuleID = '<stdin>' 
          2: source_filename = "<stdin>" 
          3: target triple = "arm-unknown-none-eabi" 
          4:  
          5: @.i16 = private unnamed_addr constant [5 x i16] [i16 1, i16 2, i16 3, i16 4, i16 5], align 1 
          6:  
          7: define void @memcpy_i16_array() local_unnamed_addr { 
          8: entry: 
next:9'0           X error: no match found
          9:  %something = alloca [5 x i16], align 1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Step 8 (check-llvm) failure: check-llvm (failure)
...

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder llvm-nvptx-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/6880

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-strings-1.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt < /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-1.ll -mtriple=arm-none-eabi -passes=globalopt -S | /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-1.ll
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-1.ll
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-1.ll:9:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING:%.*]] = alloca [12 x i8], align 1
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:7: note: possible intended match here
 %something = alloca [10 x i8], align 1
      ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-strings-1.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: ; ModuleID = '<stdin>' 
          2: source_filename = "<stdin>" 
          3: target triple = "arm-unknown-none-eabi" 
          4:  
          5: @.str = private unnamed_addr constant [10 x i8] c"123456789\00", align 1 
          6:  
          7: define void @foo() local_unnamed_addr { 
          8: entry: 
next:9'0           X error: no match found
          9:  %something = alloca [10 x i8], align 1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:9'1           ?                                  possible intended match
         10:  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(10) %something, ptr noundef nonnull align 1 dereferenceable(10) @.str, i32 10, i1 false) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         11:  %call2 = call i32 @bar(ptr nonnull %something) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         12:  ret void 
next:9'0     ~~~~~~~~~~
         13: } 
next:9'0     ~~
         14:  
next:9'0     ~
          .
...

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder arc-builder running on arc-worker while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/6308

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /buildbot/worker/arc-folder/build/bin/opt < /buildbot/worker/arc-folder/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll -mtriple=arm-none-eabi -passes=globalopt -S | /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
+ /buildbot/worker/arc-folder/build/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
+ /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
/buildbot/worker/arc-folder/build/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/buildbot/worker/arc-folder/build/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/buildbot/worker/arc-folder/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:6:10: error: CHECK: expected string not found in input
; CHECK: [4 x i8]
         ^
<stdin>:5:46: note: scanning from here
@other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1
                                             ^
<stdin>:6:38: note: possible intended match here
@.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1
                                     ^
/buildbot/worker/arc-folder/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:12:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING:%.*]] = alloca [4 x i8], align 1
              ^
<stdin>:9:7: note: scanning from here
entry:
      ^
<stdin>:10:7: note: possible intended match here
 %something = alloca [3 x i8], align 1
      ^

Input file: <stdin>
Check file: /buildbot/worker/arc-folder/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '<stdin>' 
           2: source_filename = "<stdin>" 
           3: target triple = "arm-unknown-none-eabi" 
           4:  
           5: @other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1 
check:6'0                                                  X~~~~~~~~~~~~~~~~~~~~~~ error: no match found
           6: @.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:6'1                                          ?                               possible intended match
           7:  
check:6'0     ~
           8: define void @memcpy_multiple() local_unnamed_addr { 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           9: entry: 
...

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-sles-build-only running on rocm-worker-hw-04-sles while building llvm at step 8 "Add check check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/140/builds/8956

Here is the relevant piece of the build log for the reference
Step 8 (Add check check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt < /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll -mtriple=arm-none-eabi -passes=globalopt -S | /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/FileCheck /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll
+ /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
+ /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/FileCheck /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll
/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll:9:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING1:%.*]] = alloca [6 x i16], align 1
              ^
<stdin>:8:7: note: scanning from here
entry:
      ^
<stdin>:9:7: note: possible intended match here
 %something = alloca [5 x i16], align 1
      ^

Input file: <stdin>
Check file: /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-non-byte-array.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: ; ModuleID = '<stdin>' 
          2: source_filename = "<stdin>" 
          3: target triple = "arm-unknown-none-eabi" 
          4:  
          5: @.i16 = private unnamed_addr constant [5 x i16] [i16 1, i16 2, i16 3, i16 4, i16 5], align 1 
          6:  
          7: define void @memcpy_i16_array() local_unnamed_addr { 
          8: entry: 
next:9'0           X error: no match found
          9:  %something = alloca [5 x i16], align 1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:9'1           ?                                  possible intended match
         10:  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(10) %something, ptr noundef nonnull align 1 dereferenceable(10) @.i16, i32 10, i1 false) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         11:  %call2 = call i32 @bar(ptr nonnull %something) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         12:  ret void 
next:9'0     ~~~~~~~~~~
         13: } 
next:9'0     ~~
         14:  
next:9'0     ~
          .
...

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder clang-cmake-x86_64-avx512-linux running on avx512-intel64 while building llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/133/builds/5328

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/opt < /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll -mtriple=arm-none-eabi -passes=globalopt -S | /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/FileCheck /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
+ /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
+ /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/FileCheck /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:6:10: error: CHECK: expected string not found in input
; CHECK: [4 x i8]
         ^
<stdin>:5:46: note: scanning from here
@other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1
                                             ^
<stdin>:6:38: note: possible intended match here
@.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1
                                     ^
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:12:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING:%.*]] = alloca [4 x i8], align 1
              ^
<stdin>:9:7: note: scanning from here
entry:
      ^
<stdin>:10:7: note: possible intended match here
 %something = alloca [3 x i8], align 1
      ^

Input file: <stdin>
Check file: /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '<stdin>' 
           2: source_filename = "<stdin>" 
           3: target triple = "arm-unknown-none-eabi" 
           4:  
           5: @other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1 
check:6'0                                                  X~~~~~~~~~~~~~~~~~~~~~~ error: no match found
           6: @.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:6'1                                          ?                               possible intended match
           7:  
check:6'0     ~
           8: define void @memcpy_multiple() local_unnamed_addr { 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           9: entry: 
...

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/7798

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt < /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll -mtriple=arm-none-eabi -passes=globalopt -S | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:6:10: error: CHECK: expected string not found in input
; CHECK: [4 x i8]
         ^
<stdin>:5:46: note: scanning from here
@other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1
                                             ^
<stdin>:6:38: note: possible intended match here
@.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1
                                     ^
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:12:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING:%.*]] = alloca [4 x i8], align 1
              ^
<stdin>:9:7: note: scanning from here
entry:
      ^
<stdin>:10:7: note: possible intended match here
 %something = alloca [3 x i8], align 1
      ^

Input file: <stdin>
Check file: /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '<stdin>' 
           2: source_filename = "<stdin>" 
           3: target triple = "arm-unknown-none-eabi" 
           4:  
           5: @other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1 
check:4                                            ^~~~~~~~
check:6'0                                                  X~~~~~~~~~~~~~~~~~~~~~~ error: no match found
           6: @.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:6'1                                          ?                               possible intended match
           7:  
check:6'0     ~
           8: define void @memcpy_multiple() local_unnamed_addr { 
label:10      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Oct 17, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 8 "Add check check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/7199

Here is the relevant piece of the build log for the reference
Step 8 (Add check check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/opt < /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll -mtriple=arm-none-eabi -passes=globalopt -S | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
+ /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll
+ /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/opt -mtriple=arm-none-eabi -passes=globalopt -S
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/opt: warning: failed to infer data layout: unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/opt: WARNING: failed to create target machine for 'arm-unknown-none-eabi': unable to get target for 'arm-unknown-none-eabi', see --version and --triple.
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:6:10: error: CHECK: expected string not found in input
; CHECK: [4 x i8]
         ^
<stdin>:5:46: note: scanning from here
@other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1
                                             ^
<stdin>:6:38: note: possible intended match here
@.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1
                                     ^
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll:12:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[SOMETHING:%.*]] = alloca [4 x i8], align 1
              ^
<stdin>:9:7: note: scanning from here
entry:
      ^
<stdin>:10:7: note: possible intended match here
 %something = alloca [3 x i8], align 1
      ^

Input file: <stdin>
Check file: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-global-dest.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = '<stdin>' 
           2: source_filename = "<stdin>" 
           3: target triple = "arm-unknown-none-eabi" 
           4:  
           5: @other = private unnamed_addr global [3 x i8] c"\01\02\03", align 1 
check:6'0                                                  X~~~~~~~~~~~~~~~~~~~~~~ error: no match found
           6: @.i8 = private unnamed_addr constant [3 x i8] c"\01\02\03", align 1 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:6'1                                          ?                               possible intended match
           7:  
check:6'0     ~
           8: define void @memcpy_multiple() local_unnamed_addr { 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           9: entry: 
...

nasherm added a commit that referenced this pull request Oct 17, 2024
nasherm added a commit that referenced this pull request Oct 17, 2024
Reverts #107120 

Unexpected build failures in post-commit pipelines. Needs investigation
@nasherm
Contributor Author

nasherm commented Oct 17, 2024

Has been reverted due to unexpected buildbot failures

@aeubanks
Contributor

probably just requires REQUIRES: arm-registered-target in tests?

nasherm added a commit that referenced this pull request Oct 24, 2024
This is a recommit of #107120. The original PR was approved but failed
on the buildbots. The newly added tests should only be run for compilers
that support the ARM target. This has been resolved by adding a config
file for these tests.
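
A directory-level lit configuration of the following shape is the usual way to restrict a test directory to builds that include a given target. This is only a sketch: the exact contents of the config file added in the recommit are not shown here, the function name is hypothetical, and the `SimpleNamespace` objects are stand-ins for the `config` object that lit injects into a `lit.local.cfg`.

```python
from types import SimpleNamespace


def mark_unsupported_without_arm(config):
    """Apply the common lit.local.cfg idiom: skip the directory's tests
    when the ARM target was not built into this LLVM."""
    if "ARM" not in config.root.targets:
        config.unsupported = True
    return config


# Hypothetical stand-ins for lit's injected `config` object (illustration only).
x86_only = SimpleNamespace(root=SimpleNamespace(targets={"X86"}), unsupported=False)
with_arm = SimpleNamespace(root=SimpleNamespace(targets={"X86", "ARM"}), unsupported=False)

mark_unsupported_without_arm(x86_only)
mark_unsupported_without_arm(with_arm)
```

In an actual `lit.local.cfg` only the `if` statement and its body would appear, since lit provides `config` directly. On a build without ARM, the tests would then be reported as unsupported rather than failing as in the buildbot logs.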

- The pass optimizes memcpy calls by padding destinations and sources out
  to a full word, so that the ARM backend generates full-word loads
  instead of loading a single byte (ldrb) and/or a half word (ldrh). It
  only pads the destination when it is a stack-allocated, constant-size
  array, and the source when it is a constant string. The heuristic that
  decides whether to pad is very basic and could be improved to allow
  more cases to be padded.
- Pass works at the midend level
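
The padding described in the first bullet amounts to rounding an array's byte size up to the next word boundary. The following is a minimal sketch of that arithmetic, assuming a 4-byte word; the function name is hypothetical and this is not the pass's actual code. The sizes used match the failing FileCheck lines in the buildbot logs, which expect `[3 x i8]` to become `[4 x i8]` and `[5 x i16]` to become `[6 x i16]`.

```python
def widened_element_count(num_elements: int, element_bytes: int, word_bytes: int = 4) -> int:
    """Return the element count after padding the array to a whole number of words.

    word_bytes=4 is an assumption for 32-bit ARM.
    """
    total_bytes = num_elements * element_bytes
    # Round the byte size up to the next multiple of the word size.
    padded_bytes = (total_bytes + word_bytes - 1) // word_bytes * word_bytes
    # Convert back to an element count, rounding up.
    return (padded_bytes + element_bytes - 1) // element_bytes


print(widened_element_count(3, 1))  # [3 x i8]: 3 bytes padded to 4
print(widened_element_count(5, 2))  # [5 x i16]: 10 bytes padded to 12
```

An array whose size is already a multiple of the word size is left unchanged by this calculation.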
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…3289)
