Skill groupby attrs #351

jsmariegaard · 2023-12-19T20:57:28Z

Extracted the by="attrs:gtype" part of #331.

For example, by="attrs:gtype" or by="attrs:DA" (to distinguish between assimilation and validation stations).
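As a rough illustration of the `"attrs:<key>"` convention described above, the sketch below pulls the attribute names out of a `by` argument. The function name `attrs_keys_in_by` and the constant `ATTRS_PREFIX` are hypothetical and chosen for illustration only; this is not the modelskill implementation.

```python
from typing import List, Sequence, Union

ATTRS_PREFIX = "attrs:"  # prefix as used in the PR description


def attrs_keys_in_by(by: Union[str, Sequence[str]]) -> List[str]:
    """Return the attribute names referenced in `by` (hypothetical helper)."""
    # Accept both a single string and a list/tuple of strings
    by_list = [by] if isinstance(by, str) else list(by)
    # Keep only the entries with the prefix and strip the prefix off
    return [b[len(ATTRS_PREFIX):] for b in by_list if b.startswith(ATTRS_PREFIX)]


print(attrs_keys_in_by("attrs:gtype"))           # ['gtype']
print(attrs_keys_in_by(("model", "attrs:use")))  # ['use']
```

Entries without the prefix (such as "model") are left for the regular groupby handling.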

jsmariegaard · 2023-12-19T22:51:27Z

The to_dataframe() method returned object dtypes, which I have now changed to category. This, however, leads to problems with the groupby, specifically for cc.mean_skill() in multiple-variable cases. I tried different things, including setting observed=True, but that gives problems in gridded_skill() (where empty bins make sense). I think the actual problem is that the "default" by that mean_skill() passes on to skill() is ["model", "observation", "variable"], even though an observation can only have one variable. So I guess the by should instead be ["model", "observation"], with variable added afterwards. Maybe we should even check that observation and variable do not both occur in the by?
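The groupby problem described above can be reproduced with plain pandas. The sketch below uses made-up data (the column names follow the comment, the values are invented) and shows how grouping on several categorical columns with observed=False yields every combination of categories, and how grouping on ["model", "observation"] and attaching the variable afterwards avoids the empty groups. It is an illustration under these assumptions, not the code in modelskill.

```python
import pandas as pd

# Made-up long-format data: each observation belongs to exactly one variable
df = pd.DataFrame(
    {
        "model": ["m1", "m1", "m2", "m2"],
        "observation": ["obs_a", "obs_b", "obs_a", "obs_b"],
        "variable": ["Hm0", "Tp", "Hm0", "Tp"],
        "residual": [0.1, -0.2, 0.3, 0.0],
    }
)
for col in ["model", "observation", "variable"]:
    df[col] = df[col].astype("category")

by = ["model", "observation", "variable"]

# observed=False: all 2 x 2 x 2 combinations, including impossible ones
n_all = df.groupby(by, observed=False).size()
# observed=True: only combinations present in the data
n_obs = df.groupby(by, observed=True).size()
print(len(n_all), len(n_obs))  # 8 4

# Alternative: group by model and observation only, add variable afterwards
obs_to_var = dict(zip(df["observation"].astype(str), df["variable"].astype(str)))
n = df.groupby(["model", "observation"], observed=False).size().reset_index(name="n")
n["variable"] = n["observation"].astype(str).map(obs_to_var)
print(n)
```

With this approach observed=False can stay in place for cases such as gridded bins, where empty groups are meaningful.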

ecomodeller · 2023-12-21T14:28:08Z

Very useful functionality! 👍

Here is a snippet of a slightly incomplete example, where I have added attrs to 2 (of 3) observations.

When the attrs are absent, the default is now to exclude the observation from the skill table, but setting observed=False includes it, as in the first call below.

...
>>> o1 = ms.PointObservation('HKNA_Hm0.dfs0', attrs={"use": "calibration"})
>>> o2 = ms.PointObservation("eur_Hm0.dfs0",   attrs={"use": "validation"})
>>> o3 = ms.TrackObservation("Alti_c2_Dutch.dfs0")
>>> cc = ms.match(obs=[o1, o2, o3], mod=[mr1, mr2])
>>> cc.skill(by=("model","attrs:use"), observed=False).round(2)
                     n  bias  rmse  urmse   mae    cc    si    r2
model use
SW_1  False        113 -0.00  0.35   0.35  0.29  0.97  0.13  0.90
      calibration  386 -0.19  0.35   0.29  0.25  0.97  0.09  0.91
      validation    67 -0.07  0.22   0.21  0.19  0.97  0.08  0.93
SW_2  False        113  0.08  0.43   0.42  0.36  0.97  0.15  0.85
      calibration  386 -0.10  0.29   0.28  0.21  0.97  0.09  0.93
      validation    67 -0.00  0.23   0.23  0.20  0.97  0.09  0.93
>>> cc.skill(by=("model","attrs:use")).round(2)
                     n  bias  rmse  urmse   mae    cc    si    r2
model use
SW_1  calibration  386 -0.19  0.35   0.29  0.25  0.97  0.09  0.91
      validation    67 -0.07  0.22   0.21  0.19  0.97  0.08  0.93
SW_2  calibration  386 -0.10  0.29   0.28  0.21  0.97  0.09  0.93
      validation    67 -0.00  0.23   0.23  0.20  0.97  0.09  0.93

jsmariegaard · 2023-12-21T22:20:07Z

Finally managed to do the merge 😬 That was not easy.

ecomodeller · 2024-01-04T12:14:37Z

tests/test_comparercollection.py

+def test_skill_by_attrs_gtype(cc):
+    sk = cc.skill(by="attrs:gtype")
+    assert len(sk) == 2
+    assert sk.data.index[0] == "point"


It seems overly specific to assert that the index is sorted in this order.

Isn't it enough to verify that:

assert "point" in sk.index
assert "track" in sk.index

True, will fix in the next PR, which will be on sorting.
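The order-independent check suggested in this thread can be tried on a plain pandas DataFrame. The table below is a stand-in with invented counts, not real modelskill output; it only shows that membership tests pass regardless of row order while a positional assertion depends on sorting.

```python
import pandas as pd

# Stand-in for a skill table indexed by gtype (counts are invented)
sk = pd.DataFrame(
    {"n": [3, 5]},
    index=pd.Index(["track", "point"], name="gtype"),
)

assert len(sk) == 2
# Membership checks do not depend on the order of the rows
assert "point" in sk.index
assert "track" in sk.index
# Equivalent single check on the full set of labels
assert set(sk.index) == {"point", "track"}

# A positional check would depend on sorting: here the first label is "track"
print(sk.index[0])  # track
```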

jsmariegaard added 6 commits December 19, 2023 21:11

optionally include attrs in to_dataframe and _attrs_keys_in_by

fea7e0a

rename to group and prepare for x, y

a46b24c

temporary disable astype(category)

22cb543

only if str

a42b5a0

reintroduce astype("category") and remove resulting empty groups 🤔

c464769

type: ignore

b093f96

jsmariegaard mentioned this pull request Dec 19, 2023

Improve by argument in skill() #331

Closed

ecomodeller added 2 commits December 21, 2023 15:17

Add option to include observeed categories

3b874ad

boolean logic backwards

fec47bf

ecomodeller force-pushed the skill-grouby-attrs branch from af9fbe2 to 3b874ad Compare December 21, 2023 15:35

ecomodeller and others added 4 commits December 21, 2023 16:37

Ignore ruff

37c47f7

Merge branch 'skill-grouby-attrs' of https://github.com/DHI/modelskill …

11bc69a

…into skill-grouby-attrs

Merge branch 'main' into skill-grouby-attrs

568943d

reverse observed

ee3e029

jsmariegaard added 6 commits January 3, 2024 06:58

Merge branch 'main' into skill-grouby-attrs

f154cb8

Merge branch 'main' into skill-grouby-attrs

a2c8b4d

test_skill_by_attrs_gtype and test_skill_by_attrs_gtype_and_mod

581690d

docstring

ee1fff1

test_skill_by_attrs_int, test_skill_by_attrs_observed

1411245

TODO on sort=False

af27dbe

jsmariegaard marked this pull request as ready for review January 3, 2024 16:47

jsmariegaard added 6 commits January 4, 2024 06:20

Merge branch 'main' into skill-grouby-attrs

00376b2

observed=False

da3669b

mod_idx

354a52e

docstrings and variable naming

e93bba1

improved docstrings including consistent kwargs

3506819

move deprecated methods to the bottom, and other clean-up

bd960fb

jsmariegaard added 10 commits January 4, 2024 08:21

idx instead of id

9fbf00a

better docstrings

959023a

move load/save and deprecated functionality to bottom of file

43209c4

rename residual to _residual (for now)

597b2fa

IdOrNameTypes renamed to IdxOrNameTypes

5c3afff

Improved docstrings, correct return types, lists in arguments

d16ea04

rename to_dataframe to _to_long_dataframe

56c7cd4

to_dataframe removed, use cmp.data.to_dataframe instead

5afaa51

rename

a235c92

cmp.to_dataframe removed

256a344

jsmariegaard requested a review from ecomodeller January 4, 2024 11:41

ecomodeller reviewed Jan 4, 2024

ecomodeller approved these changes Jan 4, 2024

jsmariegaard merged commit 2e495c2 into main Jan 4, 2024

jsmariegaard deleted the skill-grouby-attrs branch January 4, 2024 12:27
