new_metrics_validation_services #241
Conversation
Reviewer: Great - thanks for adding this.
Reviewer: Great work. Almost there - see comments, and maybe add a few more tests.
modelskill/metrics.py (outdated diff excerpt):

```python
assert obs.size == model.size
if len(obs) == 0:
    return np.nan
time = obs.index
```
Reviewer: I think we should have an assert statement before this line, checking that the inputs are indeed dataframes with a DatetimeIndex. All the other metrics just assume the input is array-like.
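A minimal sketch of what such a guard could look like, assuming Series inputs named `obs` and `model` as in the snippet above (the helper name is hypothetical; this is not the PR's actual code):

```python
import pandas as pd

def _validate_peak_ratio_inputs(obs: pd.Series, model: pd.Series) -> None:
    # Hypothetical guard: unlike the other metrics, which accept anything
    # array-like, a time-aware metric needs timestamps to locate peaks.
    assert isinstance(obs, pd.Series) and isinstance(model, pd.Series), \
        "expected pandas Series inputs"
    assert isinstance(obs.index, pd.DatetimeIndex), \
        "inputs must be indexed by a DatetimeIndex"
    assert obs.index.equals(model.index), \
        "obs and model must share the same time index"
```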
Author: Actually... we changed that in a previous PR, so the metric inputs are now pandas.Series, I understand; see #215.
I just copied the same lines from all the other metrics. But from that previous pull request I understand that the inputs are now pandas Series instead of arrays, so maybe we should change all the metrics from `obs: array` to `obs: Series`. What do you say? Maybe I am wrong and the inputs are still arrays.
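For illustration, the suggested annotation change could look like this (a sketch; `bias_old` and `bias_new` are stand-ins, and the actual modelskill signatures may differ):

```python
import numpy as np
import pandas as pd

# Before: the docstrings/annotations describe the inputs as array-like.
def bias_old(obs: np.ndarray, model: np.ndarray) -> float:
    return float(np.mean(model - obs))

# After PR #215 the inputs are pandas Series, so the annotations
# could say so explicitly.
def bias_new(obs: pd.Series, model: pd.Series) -> float:
    return float(np.mean(model.values - obs.values))
```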
From the new test file (diff excerpt):

```python
import modelskill.metrics as mtr

@pytest.fixture
```
Reviewer: Since obs_series and mod_series are only used in a single place, I think it is better to define them inside test_pr.
Author: They are used in a single place so far... If this makes running the tests slower I understand, but if not, won't it be easier for future tests to already have them as pytest fixtures? I can move them if you really think that is better, though.
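For concreteness, the two options being weighed might look like this (hypothetical data; fixtures cost nothing for tests that don't request them, so the trade-off is mostly readability versus reuse):

```python
import numpy as np
import pandas as pd
import pytest

# Option A: module-level fixtures, ready for reuse by future tests.
@pytest.fixture
def obs_series() -> pd.Series:
    t = pd.date_range("2020-01-01", periods=48, freq="h")
    return pd.Series(np.sin(np.arange(48) / 4.0), index=t)

@pytest.fixture
def mod_series(obs_series: pd.Series) -> pd.Series:
    return obs_series * 1.1

def test_pr(obs_series: pd.Series, mod_series: pd.Series) -> None:
    assert obs_series.index.equals(mod_series.index)

# Option B (the reviewer's suggestion): build the same two series
# inline inside test_pr, since it is the only test using them so far.
```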
From the docstring of the new explained-variance metric:

$$
EV = \frac{\sum_{i=1}^n (obs_i - \overline{obs})^2 - \sum_{i=1}^n \left( (obs_i - \overline{obs}) - (model_i - \overline{model}) \right)^2}{\sum_{i=1}^n (obs_i - \overline{obs})^2}
$$

Range: [0, 1]; Best: 1
Reviewer: Maybe add a "See Also" with a reference to r2?
Author: The thing is that I am not sure it is the same, as r2 can be negative and this EV cannot. I do not have the reference for this metric.
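A quick numeric check of that point, using the docstring formula above and the textbook definition r2 = 1 - SS_res / SS_tot (toy data; neither line is modelskill's own implementation):

```python
import numpy as np

obs = np.array([1.0, 2.0, 3.0, 4.0])
model = obs + 2.0  # model with a constant bias

ss_tot = np.sum((obs - obs.mean()) ** 2)

# Explained variance per the docstring: both series are centred
# before differencing, so a pure bias cancels and EV == 1 here.
resid = (obs - obs.mean()) - (model - model.mean())
ev = (ss_tot - np.sum(resid ** 2)) / ss_tot     # -> 1.0

# Classic r2 keeps the bias in the residual and can go negative.
r2 = 1.0 - np.sum((obs - model) ** 2) / ss_tot  # -> -2.2
```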
Author: Answered the comments here (waiting for confirmation). But regarding the tests, all metrics seem to have a single test. OK, just gave it a thought: maybe if ...
Reviewer: @daniel-caichac-DHI is this PR done in your opinion? You mentioned: "maybe if ..."
Author: I was supposed to do more tests but I didn't manage, as I went on vacation :)
Author: I added the changes as requested. On purpose, I did not add an 'example' for the peak ratio, because it needs a time series as input (instead of just a NumPy array), since it has to detect the peaks in time; but I did add one for Explained Variance.
Reviewer: Great work!
Author (PR description):

Hi,

I added 2 new metrics that will most likely be used in the validation services.

1. Explained Variance (EV). This one is straightforward; the formula is added in the docstring.
2. Peak Ratio (PR): for this one I need an ancillary function called 'partial_duration_series', very similar to the PoT function from the EVA editor in the MIKE Zero toolbox, to identify the peaks. The peak ratio is then just the ratio of the mean of the model peaks to the mean of the measured peaks (see the sketch below).

Added scipy to the dependencies list.
Also added tests.
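A minimal sketch of the Peak Ratio idea described above; the real PR builds on the 'partial_duration_series' helper, while this stand-in uses scipy.signal.find_peaks and a hypothetical inter-event spacing parameter:

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks

def peak_ratio_sketch(obs: pd.Series, model: pd.Series,
                      inter_event_hours: float = 36.0) -> float:
    """Mean of model peaks divided by mean of observed peaks.

    `inter_event_hours` stands in for the independence criterion that a
    partial-duration-series analysis would enforce between peaks.
    Illustrative only, not the PR's implementation.
    """
    assert isinstance(obs.index, pd.DatetimeIndex)
    # Translate the minimum time between peaks into a sample count
    # (assumes a regular sampling interval).
    step_h = (obs.index[1] - obs.index[0]) / pd.Timedelta(hours=1)
    distance = max(1, round(inter_event_hours / step_h))
    i_obs, _ = find_peaks(obs.to_numpy(), distance=distance)
    i_mod, _ = find_peaks(model.to_numpy(), distance=distance)
    if len(i_obs) == 0 or len(i_mod) == 0:
        return np.nan
    return float(model.to_numpy()[i_mod].mean() / obs.to_numpy()[i_obs].mean())
```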