Persist raw model data #261

ecomodeller · 2023-09-20T08:07:16Z

No description provided.

ryan-kipawa · 2023-09-20T11:49:17Z

modelskill/comparison/_comparison.py

@@ -462,7 +462,7 @@ def __init__(
        modeldata=None,
        max_model_gap: Optional[TimeDeltaTypes] = None,
        matched_data: Optional[xr.Dataset] = None,
-        raw_mod_data: Optional[Dict[str, pd.DataFrame]] = None,
+        raw_mod_data: Optional[Dict[str, pd.Series]] = None,


Do we need to update the docstring to match? Or is that automated somehow?

ecomodeller · 2023-09-20

Not automated, but Copilot is very helpful in creating new ones.

ryan-kipawa · 2023-09-20T12:01:03Z

modelskill/comparison/_collection.py

-    def save(self, fn: Union[str, Path]) -> None:
-        # save to file in netcdf format using xarray
-        # save each comparer to a netcdf and pack them into a zip file
+    def save(self, filename: Union[str, Path]) -> None:


Would it be useful to add an optional argument to avoid accidentally overwriting existing files, for example by exposing the 'mode' parameter of ZipFile? The default could be 'w' or 'x', depending on which is more convenient.

ecomodeller · 2023-09-21

Sure, I hadn't considered that possibility. If users expect a mode parameter, we should probably add it to all methods that write files, e.g. mikeio.Dataset.to_dfs()
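
A minimal sketch of the suggestion above, assuming a hypothetical save_collection helper rather than the actual ComparerCollection.save implementation. It forwards a mode argument to zipfile.ZipFile, where "w" overwrites an existing archive and "x" raises FileExistsError if the archive already exists. The byte payloads stand in for the per-comparer NetCDF files.

```python
import zipfile
from pathlib import Path
from typing import Dict, Union


def save_collection(
    payloads: Dict[str, bytes],
    filename: Union[str, Path],
    mode: str = "w",
) -> None:
    """Pack named payloads into a zip archive (hypothetical helper).

    mode="w" overwrites an existing archive; mode="x" creates the archive
    exclusively and raises FileExistsError if it already exists.
    """
    if mode not in ("w", "x"):
        raise ValueError("mode must be 'w' or 'x'")
    with zipfile.ZipFile(filename, mode) as zf:
        for name, data in payloads.items():
            # One archive member per comparer, named after the comparer.
            zf.writestr(f"{name}.nc", data)
```

Keeping "w" as the default would preserve the current overwrite behavior, while callers who want protection could opt in with mode="x".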

ryan-kipawa · 2023-09-20T12:34:26Z

tests/test_compare.py

@@ -324,3 +324,15 @@ def test_trackmodelresult_and_trackobservation_uses_model_name():
 def test_item_selection_items_are_unique():
    with pytest.raises(ValueError):
        ItemSelection(obs="foo", model=["foo", "bar"], aux=["baz"])
+
+
+def test_save_comparercollection(o1, o3, tmp_path):


Should this actually be in test_comparercollection.py?

ecomodeller · 2023-09-20

Maybe. I suppose the entire test suite needs to be reorganized. I think it would make sense to split the tests into unit tests, integration tests and end-to-end tests, or something along those lines.

ryan-kipawa

Looks pretty good, nice going.

jsmariegaard

Great work. I like the content related to the name of the PR.

I am not so sure about the change of behavior for compare(), which now always returns a cc. I somehow appreciate the consistency, but to me the old behavior was quite meaningful: if I compare one obs I get one comparer, and if I compare multiple obs I get a collection. Makes sense. Let's discuss tomorrow :-)

But related to the save/load functionality, I guess my only comment is: is "raw_" a special enough name that we will not get in trouble if people accidentally name their station "raw_stn1" or similar? I would suggest "raw" or something else that starts with an underscore to indicate that it is an internal structure.

I guess I also agree with @rywm-dhi on the overwrite functionality...

ecomodeller · 2023-09-21T04:21:31Z

> I am not so sure about the change of behavior for compare(), which now always returns a cc. I somehow appreciate the consistency, but to me the old behavior was quite meaningful: if I compare one obs I get one comparer, and if I compare multiple obs I get a collection. Makes sense. Let's discuss tomorrow :-)

The main reason I changed compare to return a ComparerCollection is that multiple return types cause editors like VS Code to misinterpret the returned object. That is to be expected when writing a script with no active session, but it also happens in a notebook, where the editor should be able to inspect the actual object ☹️. Since both Comparer and ComparerCollection have a save method, VS Code presents the wrong auto-completion and documentation, which is not helpful.

The compare function should be the main entry point for people starting out with the library, and multiple return types are confusing.
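
To illustrate the point about return types, here is a small sketch using stand-in classes (these are not the modelskill classes, and compare_union is a hypothetical name for the old behavior). With a Union return type, a static tool cannot tell which object a call produces; with a single return type it can, and a lone comparer is still one index away.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Comparer:  # stand-in, not the modelskill class
    name: str


@dataclass
class ComparerCollection:  # stand-in, not the modelskill class
    comparers: List[Comparer] = field(default_factory=list)

    def __len__(self) -> int:
        return len(self.comparers)

    def __getitem__(self, index: int) -> Comparer:
        return self.comparers[index]


def compare_union(obs: List[str]) -> Union[Comparer, ComparerCollection]:
    """Old style: the return type depends on the number of observations."""
    comparers = [Comparer(name) for name in obs]
    return comparers[0] if len(comparers) == 1 else ComparerCollection(comparers)


def compare(obs: List[str]) -> ComparerCollection:
    """New style: always a collection, so static tools see a single type."""
    return ComparerCollection([Comparer(name) for name in obs])


cc = compare(["stn1"])
cmp = cc[0]  # a single comparer is retrieved by indexing
```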

ecomodeller · 2023-09-21T04:28:40Z

> But related to the save/load functionality, I guess my only comment is: is "raw_" a special enough name that we will not get in trouble if people accidentally name their station "raw_stn1" or similar? I would suggest "raw" or something else that starts with an underscore to indicate that it is an internal structure.

I'll add an underscore at the beginning, but we could also use NetCDF groups and split the raw data into separate datasets.
The structure of this file is a fairly important decision if we want to support loading files across versions of modelskill.
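
A rough sketch of why the leading underscore helps, assuming the raw model data is stored as extra variables next to the matched data and identified by prefix when loading. The prefix "_raw_", the helper split_variables and the variable names are assumptions for illustration; the thread only settles on a leading underscore.

```python
from typing import Dict, List, Tuple

RAW_PREFIX = "_raw_"  # assumed prefix, not confirmed by the PR


def split_variables(
    names: List[str], prefix: str = RAW_PREFIX
) -> Tuple[List[str], Dict[str, str]]:
    """Split stored variable names into matched data and raw model data.

    Returns the matched-data names and a mapping from model name to the
    stored name of its raw data.
    """
    matched = [n for n in names if not n.startswith(prefix)]
    raw = {n[len(prefix):]: n for n in names if n.startswith(prefix)}
    return matched, raw


# "raw_stn1" is a user-chosen name, "_raw_model_a" is internal raw data.
stored = ["Observation", "raw_stn1", "_raw_model_a"]
matched, raw = split_variables(stored)

# With the bare "raw_" prefix the user-chosen name is misclassified as raw data.
matched_bare, raw_bare = split_variables(stored, prefix="raw_")
```

With the underscore prefix, a collision would require a user to pick a name that itself starts with "_raw_", which is much less likely than "raw_".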

ecomodeller · 2023-09-21T05:31:25Z

> I guess I also agree with @rywm-dhi on the overwrite functionality...

It seems like it could be useful, but after inspecting other Python methods that write files, checking whether the file already exists seems to be very unusual. I think it is better to leave this to the end user, who can add a path.exists() check, since the default would be to overwrite anyway.
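
The user-side check mentioned above could look roughly like this. The helper save_if_absent is hypothetical and takes any callable that writes to a path, so it makes no assumption about how the save method is implemented.

```python
from pathlib import Path
from typing import Callable, Union


def save_if_absent(save: Callable[[Path], None], filename: Union[str, Path]) -> bool:
    """Call save(filename) only if the file does not exist yet.

    Returns True if the file was written, False if it was left untouched.
    """
    path = Path(filename)
    if path.exists():
        return False
    save(path)
    return True
```

With a real collection this might be called as save_if_absent(cc.save, "cc.zip"), where the file name is only an example.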

jsmariegaard

Let's go with this for now - and discuss later about return type of compare()

ecomodeller added 6 commits September 20, 2023 10:03

Persist raw_model_data

eaf7ba5

Correct type hints for rose

5434de1

Fix type hint

ee08b0c

Rename back

6628412

Use non-interactive plotting backend

30eca10

Imports at the top

032814e

ecomodeller changed the title ~~Persist_raw_model_data~~ Persist raw model data Sep 20, 2023

ecomodeller added 4 commits September 20, 2023 13:03

Always return ComparerCollection from ms.compare

b46080e

Notebook example

59af5e0

Docs

b106af1

Use filename as argument

118e725

ecomodeller marked this pull request as ready for review September 20, 2023 11:15

ecomodeller requested a review from ryan-kipawa September 20, 2023 11:21

ecomodeller added 2 commits September 20, 2023 13:28

Notebooks

f3e2139

Notebooks

c9f6757

ryan-kipawa reviewed Sep 20, 2023

ryan-kipawa approved these changes Sep 20, 2023

jsmariegaard reviewed Sep 20, 2023

ecomodeller added 2 commits September 21, 2023 05:58

Fix track notebook

16d34c2

Docstring

c6dd641

Prefix with underscore

53e94ba

jsmariegaard approved these changes Sep 22, 2023

Merge branch 'main' into persist_raw_model_data

24ff4d3

ecomodeller merged commit 11bf57c into main Sep 22, 2023

ecomodeller deleted the persist_raw_model_data branch September 22, 2023 09:34
