-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor of GT4Py-DaCe bridge to expose all control-flow to Dace #53
Comments
Below is a small example showing the above problem from gt4py.cartesian.gtscript import (
computation,
interval,
PARALLEL,
)
from ndsl.boilerplate import get_factories_single_tile_orchestrated_cpu
from ndsl.constants import X_DIM, Y_DIM, Z_DIM
from ndsl.dsl.typing import FloatField
from ndsl import StencilFactory, orchestrate
import numpy as np
domain = (3, 3, 4)
stcil_fctry, ijk_qty_fctry = get_factories_single_tile_orchestrated_cpu(domain[0], domain[1], domain[2], 0)
def small_conditional(
in_field: FloatField,
out_field: FloatField
):
with computation(PARALLEL), interval(...):
if in_field > 5.0 and in_field < 20:
out_field = in_field
class DaCeGT4Py_Bridge:
def __init__(self, stencil_factory: StencilFactory):
orchestrate(
obj=self,
config=stencil_factory.config.dace_config)
self.stencil = stcil_fctry.from_dims_halo(
func=small_conditional,
compute_dims=[X_DIM, Y_DIM, Z_DIM],
)
def __call__(self, in_field: FloatField, out_field: FloatField):
self.stencil(in_field, out_field)
if __name__ == "__main__":
I = np.arange(domain[0]*domain[1]*domain[2], dtype=np.float64).reshape(domain)
O = np.zeros(domain)
bridge = DaCeGT4Py_Bridge(stcil_fctry)
bridge(I, O)
print(f"Input:\n{I}\n")
print(f"Output:\n{O}\n") This generate the following SDFG (under where the tasklet code carries the conditional: mask_140660478452496_gen_0: dace.bool_
mask_140660478452496_gen_0 = ((gtIN__in_field > dace.float64(5.0)) and (gtIN__in_field < dace.float64(dace.int64(20))))
if mask_140660478452496_gen_0:
gtOUT__out_field = gtIN__in_field We want this conditional to be extracted before the tasklet. See |
Known native construct that are folded inside the Tasklet instead of described to DaCe:
GT4Py-DaCe uses an "expansion" mechanism. The stencils are |
`MaskInlining` will be a key feature of the improved gt4py / DaCe bridge as envisioned in GEOS-ESM/NDSL#53, which will propagate more control flow elements from gt4py to DaCe. We thus want to make sure if `MaskInlining` itself is posing any problems. We tracked the git history of this explicit exclude back to PR #300, which not only added the npir backend, but also refactored how pipeline stages are defined. Previous to this PR optimization passes to be specified explicitly and the PR changed this to an exclude list. It looks like the `MaskInlining` pass was added after the list of optimization passes was formed and thus excluded (to keep the list constant). From this history we thus don't expect `MaskInlining` to break anything. Co-authored-by: Roman Cattaneo <>
…on and other cleanups (#1630) Being new to the codebase and in preparation of the [GT4Py-DaCe bridge refactor](GEOS-ESM/NDSL#53) I was reading a lot of code. Some things that caught my eye are summarized in this PR. Bigger changes include 1. Moving `daceir.py` into the `dace/` backend folder. This follows the convention of all other backends who have their IR inside the backend folder. (last commit) 2. Readability improvements in `gtir_to_oir.py` (the "Cleanup visit_*" commits): In my opinion, there's no point in shortening "statement" to "stmt". I wouldn't go as far as renaming the classes. For the renamed local variables, I think this adds a lot to readability. 3. `visit_While` in `gtir_to_oir.py` was having an extra mask statement around the `while` loop (in case a mask was defined. I inlined the potential `mask` with the condition of `while` loop. 4. In `gtir_to_oir.py` don't translate every body statement into a single `MaskStmt`. Instead, create one (or two incase of `else`) `MaskStmt` with multiple statements in the `body` (which is already typed as a list of statements). Tested locally by running the test suite. All tests passing. Co-authored-by: Roman Cattaneo <>
Latest work is in https://github.com/romanc/gt4py/romanc/bridge-on-top-of-cleanups, which is re-based on top of the cleanups in PR GridTools/gt4py#1724 and should only contain changes necessary for the new bridge and no cleanups anymore. The new bridge is currently pending on the following two DaCe issues:
Working around those two, we still (only sometimes) have an issue with writes (in k) inside a |
## Description This PR is split off the work for the new GT4Py - DaCe bridge, which should allow to expose control flow statements (`if` and `while`) to DaCe to better use DaCe's analytics capabilities. This PR is concerned with adding type hints and generally improving code readability. Main parts are - `daceir_builder.py`: early returns and renamed variable - `sdfg_builder.py`: type hints and early returns - `tasklet_codegen.py`: type hints and early returns `TaskletCodegen` was given `sdfg_ctx`, which wasn't used. That parameter was thus removed. Parent issue: GEOS-ESM/NDSL#53 ## Requirements - [x] All fixes and/or new features come with corresponding tests. Assumed to be covered by existing tests. - [ ] Important design decisions have been documented in the approriate ADR inside the [docs/development/ADRs/](docs/development/ADRs/Index.md) folder. N/A --------- Co-authored-by: Roman Cattaneo <>
Latest work is still in https://github.com/romanc/gt4py/romanc/bridge-on-top-of-cleanups, which can be rebased on top of mainline gt4py once GridTools/gt4py#1752 to get a clearer picture. Main ideas in that branch are the following
|
…ble k offsets (#1882) <!-- Delete this comment and add a proper description of the changes contained in this PR. The text here will be used in the commit message since the approved PRs are always squash-merged. The preferred format is: - PR Title: <type>[<scope>]: <one-line-summary> <type>: - build: Changes that affect the build system or external dependencies - ci: Changes to our CI configuration files and scripts - docs: Documentation only changes - feat: A new feature - fix: A bug fix - perf: A code change that improves performance - refactor: A code change that neither fixes a bug nor adds a feature - style: Changes that do not affect the meaning of the code - test: Adding missing tests or correcting existing tests <scope>: cartesian | eve | next | storage # ONLY if changes are limited to a specific subsystem - PR Description: Description of the main changes with links to appropriate issues/documents/references/... --> ## Description We figured that DaCe backends are currently missing support for casting in variable k offsets. This PR - adds a codegen test with a cast in a variable k offset - adds a node validator for the DaCe backends complaining about missing for support. - adds an `xfail` test for the node validator This should be fixed down the road. Here's the issue #1881 to keep track. The PR also has two smaller and unrelated commits - 741c448 increases test coverage with another codgen test that has a couple of read after write access patterns which were breaking the "new bridge" (see GEOS-ESM/NDSL#53). - e98ddc5 just forwards all keyword arguments when visiting offsets. I don't think this was a problem until now, but it's best practice to forward everything. ## Requirements - [x] All fixes and/or new features come with corresponding tests. - [ ] Important design decisions have been documented in the appropriate ADR inside the [docs/development/ADRs/](docs/development/ADRs/Index.md) folder. N/A --------- Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com> Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
## Description In preparation for PR #1894, pull out some refactors and cleanups. Notable in this PR are the changes to `src/gt4py/cartesian/gtc/dace/oir_to_dace.py` - visit `stencil.vertical_loops` directly instead of calling `generic_visit` (simplification since there's nothing else to visit) - rename library nodes from `f"{sdfg_name}_computation_{id(node)}"` to `f"{sdfg_name}_vloop_{counter}_{node.loop_order}_{id(node)}"`. This adds a bit more information (because `sdfg_name` is the same for all library nodes) and thus simplifies debugging workflows. Related issue: GEOS-ESM/NDSL#53 ## Requirements - [x] All fixes and/or new features come with corresponding tests. covered by existing test suite - [ ] Important design decisions have been documented in the appropriate ADR inside the [docs/development/ADRs/](docs/development/ADRs/Index.md) folder. N/A --------- Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
## Description This PR refactors the GT4Py/DaCe bridge to expose control flow elements (`if` statements and `while` loops) to DaCe. Previously, the whole contents of a vertical loop was put in one big Tasklet. With this PR, that Tasklet is broken apart in case control flow is found such that control flow is visible in the SDFG. This allows DaCe to better analyze code and will be crucial in future (within the current milestone) performance optimization work. The main ideas in this PR are the following 1. Introduce `oir.CodeBlock` to recursively break down `oir.HorizontalExecution`s into smaller pieces that are either code flow or evaluated in (smaller) Tasklets. 2. Introduce `dcir.Condition`and `dcir.WhileLoop` to represent if statements and while loops that are translated into SDFG states. We keep the current `dcir.MaskStmt` / `dcir.While` for if statements / while loops inside horizontal regions, which aren't yet exposed to DaCe (see #1900). 3. Add support for `if` statements and `while` loops in the state machine of `sdfg_builder.py` 4. We are breaking up vertical loops inside stencils in multiple Tasklets. It might thus happen that we write a "local" scalar in one Tasklet and read it in another Tasklet (downstream). We thus create output connectors for all scalar writes in a Tasklet and input connectors for all reads (unless previously written in the same Tasklet). 5. Memlets can't be generated per horizontal execution anymore and need to be more fine grained. `TaskletAccessInfoCollector` does this work for us, duplicating some logic in `AccessInfoCollector`. A refactor task has been logged to fix/re-evaluate this later. This PR depends on the following (downstream) DaCe fixes - spcl/dace#1954 - spcl/dace#1955 which have been merged by now. Follow-up issues - unrelated changes have been moved to #1895 - #1896 - #1898 - #1900 Related issue: GEOS-ESM/NDSL#53 ## Requirements - [x] All fixes and/or new features come with corresponding tests. Added new tests and increased coverage of horizontal regions with PRs #1807 and #1851. - [ ] Important design decisions have been documented in the appropriate ADR inside the [docs/development/ADRs/](docs/development/ADRs/Index.md) folder. Docs are [in our knowledge base](https://geos-esm.github.io/SMT-Nebulae/technical/backend/dace-bridge/) for now. Will be ported. --------- Co-authored-by: Roman Cattaneo <> Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Main chunk of this work was merged with PR GridTools/gt4py#1894. The PR pull this version of GT4Py into NDSL is NOAA-GFDL#119, which is expected to be merged in the coming days. Follow-up issues include
Given that, let's close this issue and treat potential fall-out (nothing big is expected since pace and pyFV3 test suites both pass) in follow-up issues. |
The current bridge between
gt4py.cartesian
andDaCe
uses a concept ofexpansion
for stencils. This means that stencils, once they got through the entire GT4Py workflow, are pickled and shoved down a library node. Then theexpansion
mechanism turns those into a single map/tasklet perHorizontal Computation
. This means that a lot of control-flow (mask, while loop, etc.) is hidden within from DaCe, leading to some optimization power being hidden.We need to undo this and describe everything back to DaCe.
Expansion code lives around
src/gt4py/cartesian/gtc/dace/expansion/*.py
in GT4PyParent: GEOS-ESM/SMT-Nebulae#31
NB: This is a requirement for DaCe AD future features to kick in
The text was updated successfully, but these errors were encountered: