Skip to content

AMDGPU/NewPM: Fill out passes in addCodeGenPrepare #102867

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1770,8 +1770,35 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder(
}

void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const {
// AMDGPUAnnotateKernelFeaturesPass is missing here, but it will hopefully be
// deleted soon.

if (EnableLowerKernelArguments)
addPass(AMDGPULowerKernelArgumentsPass(TM));

// This lowering has been placed after codegenprepare to take advantage of
// address mode matching (which is why it isn't put with the LDS lowerings).
// It could be placed anywhere before uniformity annotations (an analysis
// that it changes by splitting up fat pointers into their components)
// but has been put before switch lowering and CFG flattening so that those
// passes can run on the more optimized control flow this pass creates in
// many cases.
//
// FIXME: This should ideally be put after the LoadStoreVectorizer.
// However, due to some annoying facts about ResourceUsageAnalysis,
// (especially as exercised in the resource-usage-dead-function test),
// we need all the function passes codegenprepare all the way through
// said resource usage analysis to run on the call graph produced
// before codegenprepare runs (because codegenprepare will knock some
// nodes out of the graph, which leads to function-level passes not
// being run on them, which causes crashes in the resource usage analysis).
addPass(AMDGPULowerBufferFatPointersPass(TM));

Base::addCodeGenPrepare(addPass);

if (isPassEnabled(EnableLoadStoreVectorizer))
addPass(LoadStoreVectorizerPass());

// LowerSwitch pass may introduce unreachable blocks that can cause unexpected
// behavior for subsequent passes. Placing it here seems better that these
// blocks would get cleaned up by UnreachableBlockElim inserted next in the
Expand Down Expand Up @@ -1839,3 +1866,12 @@ Error AMDGPUCodeGenPassBuilder::addInstSelector(AddMachinePass &addPass) const {
addPass(SILowerI1CopiesPass());
return Error::success();
}

bool AMDGPUCodeGenPassBuilder::isPassEnabled(const cl::opt<bool> &Opt,
CodeGenOptLevel Level) const {
if (Opt.getNumOccurrences())
return Opt;
if (TM.getOptLevel() < Level)
return false;
return Opt;
}
6 changes: 6 additions & 0 deletions llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
Original file line number Diff line number Diff line change
Expand Up @@ -176,6 +176,12 @@ class AMDGPUCodeGenPassBuilder
void addPreISel(AddIRPass &addPass) const;
void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const;
Error addInstSelector(AddMachinePass &) const;

/// Check if a pass is enabled given \p Opt option. The option always
/// overrides defaults if explicitly used. Otherwise its default will be used
/// given that a pass shall work at an optimization \p Level minimum.
bool isPassEnabled(const cl::opt<bool> &Opt,
CodeGenOptLevel Level = CodeGenOptLevel::Default) const;
};

} // end namespace llvm
Expand Down
Loading