fix missed exploration of edges in constant propagation #1635

luigifusco · 2024-09-02T15:27:22Z

There is a bug related to the missed exploration of interstate edges during constant propagation in case a loop body has a conditional assignment. The reverse DFS yields parent-node pairs and analyzes only the edge connecting the two. The DFS will yield a certain node only once, while the assumption in the code is that the uniqueness is enforced on the parent-node pair. This results in only one outgoing interstate edge per body state being visited, leading to mistakes in the common case of conditional assignments (which result in two outgoing edges performing different assignments). If the visited edge does not perform an assignment or assigns the initialization value, the symbol will be wrongly interpreted as a constant and replaced in downstream states.

A short reproducing example is:

N = dace.symbol('N', dace.int64)

@dace.program
def program(in_arr: dace.bool[N], arr: dace.bool[N]):
    check = False
    for i in range(N):
        if in_arr[i]:
            check = True
        else:
            check = False
    for i in dace.map[0:N]:
        arr[i] = check

sdfg = program.to_sdfg(simplify=True)
sdfg.save('bug.sdfg')

# "arr[i] = check" will be replaced by "arr[i] = False"

The fix makes sure all interstate edges are visited at least once.

phschaad

LGTM, but please add the reproducer or a derivative thereof as a regression test to our CI test suite. Make sure it fails (reproduces) without the fix and passes with it. Thanks!

tbennun · 2024-09-03T06:44:34Z

I second @phschaad's comment about the test. I also think that there may be issues with this code concerning loop guards (which will have another incoming edge from the end/latch of a loop, thereby adding more new_symbols that are not really new yet).

Also, how do we know that this process works for multiple chained input edges going into the current node? This only propagates one step backwards. Ideally every state should traverse all the way back to a root node (or a dominator).

luigifusco · 2024-09-03T09:44:03Z

This version ignores unvisited states but iterates until no change is made to the symbol definitions. Having more than one pass is the correct way of doing this. In general a loop body needs to be visited at least twice (it needs to propagate its definitions into itself).

An optimization of my code would be to explicitly detect a loop (there is an edge with yet undefined result at its source) -> perform a sub-constant-propagation on the loop body only (going recursively if nested loops are found) and then continue with the normal propagation. This however would complicate the code a lot more (requiring changes into the topological sort function to start from arbitrary states). Do you think that is needed?

tbennun · 2024-09-03T15:08:47Z

Maybe the loop-specific behavior is not needed, especially when control flow blocks are raised from the code.
We need to take a closer look at the algorithm before approving. I want to see if the algorithm is sound.
Do you know how many passes are done on an SDFG? Is there a considerable slowdown on larger graphs (@acalotoiu maybe check that)?

Thanks for changing it!

luigifusco · 2024-09-03T20:27:22Z

checking velocity_tendencies_simplified it completes in 5 iterations taking about 2.3 seconds total. This number should be related to the maximum loop depth (you need multiple passes to bring the definition/correction out).

tbennun · 2024-09-04T14:08:47Z

checking velocity_tendencies_simplified it completes in 5 iterations taking about 2.3 seconds total. This number should be related to the maximum loop depth (you need multiple passes to bring the definition/correction out).

I have no idea what this means with relation to the existing code. How much longer is it than before?

luigifusco · 2024-09-04T15:00:43Z

Testing a bit better. Before ~2.1s, now ~2.6s. I only change the collect_constants part. The way this scales depending on the number of iterations is not linear as successive iterations will perform fewer dictionary operations. Considering this function only, 1 iter -> 1.7s, 5 iter -> 2.4s, 10 iter -> 3s. The SDFG is part of ICON and is made of 291 nodes and 385 edges.

dace/transformation/passes/constant_propagation.py

fix missed exploration of edges in constant propagation

dd62030

luigifusco force-pushed the constant_propagation_fix branch from 99c1f78 to dd62030 Compare September 2, 2024 15:29

luigifusco requested review from phschaad and acalotoiu September 2, 2024 15:30

phschaad requested changes Sep 3, 2024

View reviewed changes

change to multi iteration algorithm

2daf4ad

add conditional in loop constant propagation assignment test

79372ae

tbennun requested changes Sep 6, 2024

View reviewed changes

dace/transformation/passes/constant_propagation.py Outdated Show resolved Hide resolved

Update dace/transformation/passes/constant_propagation.py

b07e4e4

tbennun merged commit 7210cb6 into master Sep 8, 2024
0 of 10 checks passed

tbennun deleted the constant_propagation_fix branch September 8, 2024 00:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix missed exploration of edges in constant propagation #1635

fix missed exploration of edges in constant propagation #1635

luigifusco commented Sep 2, 2024

phschaad left a comment

tbennun commented Sep 3, 2024

luigifusco commented Sep 3, 2024

tbennun commented Sep 3, 2024

luigifusco commented Sep 3, 2024

tbennun commented Sep 4, 2024

luigifusco commented Sep 4, 2024

fix missed exploration of edges in constant propagation #1635

fix missed exploration of edges in constant propagation #1635

Conversation

luigifusco commented Sep 2, 2024

phschaad left a comment

Choose a reason for hiding this comment

tbennun commented Sep 3, 2024

luigifusco commented Sep 3, 2024

tbennun commented Sep 3, 2024

luigifusco commented Sep 3, 2024

tbennun commented Sep 4, 2024

luigifusco commented Sep 4, 2024