[flang] Fixed LoopVersioning for array slices. #65703

vzakhari · 2023-09-08T01:02:10Z

The first test case added in the LIT test demonstrates the problem.
Even though we did not consider the inner loop as a candidate for
the transformation due to the array_coor with a slice, we decided to
version the outer loop for the same function argument.
During the cloning of the outer loop we dropped the slicing completely
producing invalid code.

I restructured the code so that we record all arg uses that cannot be
transformed (regardless of the reason), and then fixup the usage
information across the loop nests. I also noticed that we may generate
redundant contiguity checks for the inner loops, so I fixed it
since it was easy with the new way of keeping the usage data.

The first test case added in the LIT test demonstrates the problem. Even though we did not consider the inner loop as a candidate for the transformation due to the array_coor with a slice, we decided to version the outer loop for the same function argument. During the cloning of the outer loop we dropped the slicing completely producing invalid code. I restructured the code so that we record all arg uses that cannot be transformed (regardless of the reason), and then fixup the usage information across the loop nests. I also noticed that we may generate redundant contiguity checks for the inner loops, so I fixed it since it was easy with the new way of keeping the usage data.

tblah

Thank you for finding and fixing this! The changes look good to me.

tblah · 2023-09-08T09:52:59Z

flang/lib/Optimizer/Transforms/LoopVersioning.cpp

      if (op->getParentOfType<fir::DoLoopOp>() != loop)
        return;
      mlir::Value operand = op->getOperand(0);
      for (auto a : argsOfInterest) {
        if (a.arg == normaliseVal(operand)) {
-          // use the reboxed value, not the block arg when re-creating the loop:
+          // Use the reboxed value, not the block arg when re-creating the loop.
+          // TODO: should we check that the operand dominates the loop?


Did I understand this correctly?

operand is an operand to the array indexing operation so it must dominate it. In the normal case it will also dominate the loop (the declare and reboxing are generated in the function entry block). But one could write valid IR where the reboxing does not dominate the outer loop so we ought to handle that case. Nice spot!

Yes, it is the operand of array_coor/coordinate_of that would normally also dominate the loop, but this is not guaranteed. I am glad that we are on the same page. I will try to create a reproducer in FIR and fix it in a separate check-in.

Thank you for the review!

@vzakhari is this issue fixed? If not, do you have a test that you could share?

@vzakhari I'm looking into the "check that the operand dominates the loop", but I'm struggling to come up with some FIR that is valid and shows this issue. Did you get somewhere on a reproducer?

I think this may still be a problem. I just hand-modified the first case from loop-versioning.fir:

module { func.func @sum1d(%arg0: !fir.box<!fir.array<?xf64>> {fir.bindc_name = "a"}, %arg1: !fir.ref<i32> {fir.bindc_name = "n"}) { %decl = fir.declare %arg0 {uniq_name = "a"} : (!fir.box<!fir.array<?xf64>>) -> !fir.box<!fir.array<?xf64>> %0 = fir.alloca i32 {bindc_name = "i", uniq_name = "_QMmoduleFsum1dEi"} %1 = fir.alloca f64 {bindc_name = "sum", uniq_name = "_QMmoduleFsum1dEsum"} %cst = arith.constant 0.000000e+00 : f64 fir.store %cst to %1 : !fir.ref<f64> %c1_i32 = arith.constant 1 : i32 %2 = fir.convert %c1_i32 : (i32) -> index %3 = fir.load %arg1 : !fir.ref<i32> %4 = fir.convert %3 : (i32) -> index %c1 = arith.constant 1 : index %5 = fir.convert %2 : (index) -> i32 %6:2 = fir.do_loop %arg2 = %2 to %4 step %c1 iter_args(%arg3 = %5) -> (index, i32) { %rebox = fir.rebox %decl : (!fir.box<!fir.array<?xf64>>) -> !fir.box<!fir.array<?xf64>> fir.store %arg3 to %0 : !fir.ref<i32> %7 = fir.load %1 : !fir.ref<f64> %8 = fir.load %0 : !fir.ref<i32> %9 = fir.convert %8 : (i32) -> i64 %c1_i64 = arith.constant 1 : i64 %10 = arith.subi %9, %c1_i64 : i64 %11 = fir.coordinate_of %rebox, %10 : (!fir.box<!fir.array<?xf64>>, i64) -> !fir.ref<f64> %12 = fir.load %11 : !fir.ref<f64> %13 = arith.addf %7, %12 fastmath<contract> : f64 fir.store %13 to %1 : !fir.ref<f64> %14 = arith.addi %arg2, %c1 : index %15 = fir.convert %c1 : (index) -> i32 %16 = fir.load %0 : !fir.ref<i32> %17 = arith.addi %16, %15 : i32 fir.result %14, %17 : index, i32 } fir.store %6#1 to %0 : !fir.ref<i32> return } }

It fails with an assertion: Assertion changed && "Expected operations to have changed"' failed.`, but I am not sure what it means exactly.

If I understand it correctly, as soon as the loop is proven to be safe to multiversion, we will transform it and we will use %rebox for generating the contiguity check before the loop, while it is defined inside the loop.

Thanks, I'll do some debugging. The assert basically means "we didn't find anything in the loop to update", which of course means versioning didn't achieve anything.

We should identify this case and just say "nope, not doing that".

vzakhari requested review from tblah and Leporacanthicus September 8, 2023 01:02

vzakhari requested a review from a team as a code owner September 8, 2023 01:02

github-actions bot added the flang Flang issues not falling into any other category label Sep 8, 2023

tblah approved these changes Sep 8, 2023

View reviewed changes

vzakhari merged commit 7beb65a into llvm:main Sep 8, 2023

michaelrj-google mentioned this pull request Sep 12, 2023

[libc] Move long double table option to new config #66151

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang] Fixed LoopVersioning for array slices. #65703

[flang] Fixed LoopVersioning for array slices. #65703

vzakhari commented Sep 8, 2023

tblah left a comment

tblah Sep 8, 2023

vzakhari Sep 8, 2023

kiranchandramohan Oct 5, 2023

Leporacanthicus Oct 5, 2023

vzakhari Oct 6, 2023

Leporacanthicus Oct 6, 2023

[flang] Fixed LoopVersioning for array slices. #65703

[flang] Fixed LoopVersioning for array slices. #65703

Conversation

vzakhari commented Sep 8, 2023

tblah left a comment

Choose a reason for hiding this comment

tblah Sep 8, 2023

Choose a reason for hiding this comment

vzakhari Sep 8, 2023

Choose a reason for hiding this comment

kiranchandramohan Oct 5, 2023

Choose a reason for hiding this comment

Leporacanthicus Oct 5, 2023

Choose a reason for hiding this comment

vzakhari Oct 6, 2023

Choose a reason for hiding this comment

Leporacanthicus Oct 6, 2023

Choose a reason for hiding this comment