Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup pace stencils testing #84

Merged
merged 15 commits into from
Jan 6, 2022
Merged

Conversation

elynnwu
Copy link
Collaborator

@elynnwu elynnwu commented Dec 22, 2021

Purpose

When removing global access, we moved some functions (mostly inside grid.py into pace.stencils.testing). In this PR, we move those functions to more fitting location.

Code changes:

  • Made helper.py inside pace.util.grid which contains DampingCoefficients, *GridData, and quantity_wrap that were previously inside grid.py
  • As stated in grid.py, delete the axis_offsets routine in favor of grid_indexing.axis_offsets: replace all references to this

Checklist

Before submitting this PR, please make sure:

  • You have followed the coding standards guidelines established at Code Review Checklist.
  • Docstrings and type hints are added to new and updated routines, as appropriate
  • All relevant documentation has been updated or added (e.g. README, CONTRIBUTING docs)
  • Unit tests are added or updated for non-stencil code changes

@elynnwu elynnwu requested a review from mcgibbon December 22, 2021 17:17
Comment on lines 154 to 177
edge_n = utils.make_storage_data(
metric_terms.edge_n.data,
(shape[0],),
axis=0,
backend=metric_terms.edge_n.gt4py_backend,
)
edge_s = utils.make_storage_data(
metric_terms.edge_s.data,
(shape[0],),
axis=0,
backend=metric_terms.edge_s.gt4py_backend,
)
edge_e = utils.make_storage_data(
metric_terms.edge_e.data,
(1, shape[1]),
axis=1,
backend=metric_terms.edge_e.gt4py_backend,
)
edge_w = utils.make_storage_data(
metric_terms.edge_w.data,
(1, shape[1]),
axis=1,
backend=metric_terms.edge_w.gt4py_backend,
)
Copy link
Collaborator

@mcgibbon mcgibbon Dec 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use metric_terms.edge_n.storage here? Do we need make_storage_data from dsl? If it is not possible to convert the quantity metric_terms.edge_w into a storage using only that object, it feels like there's something wrong with quantity.storage or with the quantity itself (which should be 2D with a singleton dimension when we want that same behavior for the storage).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. There was an issue with edge_e and edge_w having dimension [1, Y_INTERFACE_DIM] with (True, True, False) mask, which kept these from calling .storage directly. Added a UNIT_X_DIM dimension in Metric Terms to create these fields properly in the first place.

Comment on lines 209 to 217
ak = metric_terms.ak.data
bk = metric_terms.bk.data
# TODO fix <Quantity>.storage mask for FieldK
ak = utils.make_storage_data(
ak, ak.shape, len(ak.shape) * (0,), backend=metric_terms.ak.gt4py_backend
)
bk = utils.make_storage_data(
bk, bk.shape, len(bk.shape) * (0,), backend=metric_terms.ak.gt4py_backend
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, can we do this without make_storage_data?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added an optional mask input to Quantity to have mask (True, True, False), then .storage works as expected.

Comment on lines 5 to 7
from pace.dsl import gt4py_utils as utils
from pace.dsl.stencil import GridIndexing
from pace.dsl.typing import FloatFieldI, FloatFieldIJ
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number of imports from pace.dsl are setting off alarm bells for me. Is it possible to make storages in this file without gt4py_utils? quantity_wrap is only used in one file, dyn_core.py in fv3core - can it be put there instead?

The typing is a harder question to answer and we can iterate on after. If we want both util and dsl to be able to depend on gt4py, util only doing so for storages and nothing else, then the type hints for storages will need to live in util and get imported into dsl. However, @jdahm has been telling me FloatField is not technically accurate for storages outside of stencils, so these hints themselves might be overhauled (keeping FloatField hints in dsl for use in stencils, and making a new Storage protocol type hint for use outside of stencils like here).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved quantity_wrap to a private function in dyn_core. Let's discuss the typing and that can go into the circular dependency PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be addressed in this PR, but we also might consider putting all the dataclass containers in a file named to describe that that's what's inside it. I'm not sure what to expect from grid.helper.

from pace.util import X_DIM, Y_DIM, Z_DIM, Z_INTERFACE_DIM
from pace.util.grid import DampingCoefficients, GridData, quantity_wrap
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears this is actually the only place we use quantity_wrap. The function itself is a pretty private-looking one, that always sets units to "unknown" - does it make sense to put that function directly in this file with a _ starting its name?

@@ -84,6 +85,7 @@ def __init__(
self.LON_OR_LAT_DIM: 2,
self.TILE_DIM: 6,
self.CARTESIAN_DIM: 3,
self.UNIT_X_DIM: 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use pace.util.X_DIM instead? I don't think there's a need to differentiate UNIT_X_DIM from X_DIM - we similarly don't differentiate X_DIM_COMPUTE from X_DIM_FULL. It's enough that it indicates which axis the dimension is along (which is X_DIM).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to pace.util.X_DIM

@@ -259,6 +259,7 @@ def __init__(
origin: Sequence[int] = None,
extent: Sequence[int] = None,
gt4py_backend: Union[str, None] = None,
mask: Optional[Tuple[bool, bool, bool]] = None,
Copy link
Collaborator

@mcgibbon mcgibbon Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to programmatically generate the mask, without taking it as an input. Something like

mask = [
    any(dim in coord_dims for dim in dims)
    for coord_dims in X_DIMS, Y_DIMS, Z_DIMS
]

Copy link
Collaborator

@mcgibbon mcgibbon Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to confirm this actually works with someone on the backend (@jdahm?) - my understanding is that mask is specifically for x, y, and z dimensions in that order, always, regardless of the quantity dimension order.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a big help if you could add unit tests to pace-util making sure 2D storages continue to work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcgibbon Yeah, I think we should be able to automatically generate those from a quantity like you wrote.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now updated to detect the mask inside Quantity.

Copy link
Collaborator

@mcgibbon mcgibbon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small change comments for the updates to quantity/allocator, let me know when you've updated and it should be a quick final approve.

from pace.stencils.testing.parallel_translate import ParallelTranslate2Py
from pace.util.grid import DriverGridData
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For later PR: DriverGridData seems like a name that should only exist in pace.driver, so there's something going on here. This might be something you've already identified or we've already talked about.

After looking at it for a while, I think it's that ApplyPhysics2Dycore and its dependency AGrid2DGridPhysics should live in pace.driver, where they can't be accessed by physics or dycore. This would help make sure we don't introduce dependencies between the physics and dycore data that isn't mediated by pace.driver, so all the mediation/translation can be found by searching through that package.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, those need to be moved outside physics and into driver.

Comment on lines -63 to +79
def ones(self, dims: Sequence[str], units: str, dtype: type = float):
def ones(
self,
dims: Sequence[str],
units: str,
dtype: type = float,
):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my information, do you know why these got swapped? There's no change to the content of the call signature, as far as I can see.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I had originally added an extra argument for mask, that's when the reformat happened. I'm not sure why it didn't get reformatted back, very odd. Maybe it was okay either way for this length...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, I think it might have different thresholds for making something multi-line and making something one-line.

for coord_dims in [X_DIMS, Y_DIMS, Z_DIMS]
]
)
spatial_dims = [i for j in [X_DIMS, Y_DIMS, Z_DIMS] for i in j]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I manually tested the code you've written does in fact get the correct spatial_dims, but I still cannot tell from reading this Python list comprehension how it works. This should be more straightforward:

Suggested change
spatial_dims = [i for j in [X_DIMS, Y_DIMS, Z_DIMS] for i in j]
spatial_dims = X_DIMS + Y_DIMS + Z_DIMS

try:
data = allocator(shape, dtype=dtype, default_origin=origin)
data = allocator(shape, dtype=dtype, mask=mask)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the default_origin removed intentionally?

Suggested change
data = allocator(shape, dtype=dtype, mask=mask)
data = allocator(shape, dtype=dtype, default_origin=origin, mask=mask)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, it should not have been deleted.

Comment on lines 287 to 292
mask = tuple(
[
any(dim in coord_dims for dim in dims)
for coord_dims in [constants.X_DIMS, constants.Y_DIMS, constants.Z_DIMS]
]
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional nit: You could put this inside the if block for gt4py_backend is not None, so it only gets computed when needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done!

Comment on lines 407 to 408
if len(extra_dims) > 0 or not self._dims:
mask = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of passing dims silently through an object attribute, can you put this if statement where mask is computed so the passed value for mask will be None in this case, and then remove the attribute self._dims? This should make it clearer when reading the code that computes the mask that we only compute/use the mask when there are no non-spatial dimensions.

Comment on lines 403 to 406
spatial_dims = [
i for j in [constants.X_DIMS, constants.Y_DIMS, constants.Z_DIMS] for i in j
]
extra_dims = [i for i in self._dims if i not in spatial_dims]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code to compute extra_dims is duplicated. Instead of pulling it into a function, can you add a pace.util.SPATIAL_DIMS constant equal to X_DIMS + Y_DIMS + Z_DIMS, defined in the same file where these constants are defined? Then the one-liner for extra_dims is simple and fine to duplicate.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added pace.util.SPATIAL_DIMS

Copy link
Collaborator

@mcgibbon mcgibbon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Really nice! Two code suggestions to merge and one to probably ignore, then LGTM.

Comment on lines -63 to +79
def ones(self, dims: Sequence[str], units: str, dtype: type = float):
def ones(
self,
dims: Sequence[str],
units: str,
dtype: type = float,
):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, I think it might have different thresholds for making something multi-line and making something one-line.

@@ -284,6 +284,7 @@ def __init__(
else:
extent = tuple(extent)

self._dims = dims
if isinstance(data, (int, float, list)):
data = np.asarray(data)
elif gt4py is not None and isinstance(data, gt4py.storage.storage.Storage):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might suggest not merging this, but pointing it out so you're aware - there's a bug here where if someone passes a list of floats to initialize the quantity with a z-dimension, and includes a gt4py backend, the storage attribute won't work when you'd expect it should. Since we never initialize quantities this way (perhaps the list option should be removed altogether), this is not an issue.

If you merge this suggestion it might cause unforeseen test failures, so it's really up to you.

Suggested change
elif gt4py is not None and isinstance(data, gt4py.storage.storage.Storage):
if gt4py is not None and isinstance(data, gt4py.storage.storage.Storage):

elynnwu and others added 3 commits January 5, 2022 16:10
Co-authored-by: Jeremy McGibbon <jeremym@allenai.org>
Co-authored-by: Jeremy McGibbon <jeremym@allenai.org>
@elynnwu elynnwu enabled auto-merge (squash) January 6, 2022 00:17
@elynnwu elynnwu merged commit be48fab into main Jan 6, 2022
@elynnwu elynnwu deleted the fix/clean-up-stencils-testing branch January 6, 2022 06:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants