Difficulty Score Docs #633

marko-polo-cheno · 2024-06-29T08:11:11Z

Linked issue(s)

Fixes KOL-6787

What change does this PR introduce and why?

Adds documentation for the difficulty score metric.

codecov · 2024-06-29T08:21:17Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.57%. Comparing base (0285b43) to head (737400d).

Additional details and impacted files

@@            Coverage Diff             @@
##            trunk     #633      +/-   ##
==========================================
- Coverage   94.49%   93.57%   -0.93%     
==========================================
  Files          86       86              
  Lines        5397     5397              
  Branches      792      792              
==========================================
- Hits         5100     5050      -50     
- Misses        220      260      +40     
- Partials       77       87      +10
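The percentages in the diff above can be reproduced from the hit, miss, and partial counts. This is a minimal sketch, assuming coverage is computed as hits divided by total tracked lines (hits + misses + partials); the function and variable names are illustrative and not part of any Codecov API. The last digit may differ from the report depending on whether values are rounded or truncated.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Return coverage as a percentage of all tracked lines (assumed formula)."""
    total = hits + misses + partials
    return 100.0 * hits / total


# Counts taken from the coverage diff: base (0285b43) and head (737400d).
base = coverage_pct(hits=5100, misses=220, partials=77)
head = coverage_pct(hits=5050, misses=260, partials=87)
delta = head - base

print(f"base={base:.2f}% head={head:.2f}% delta={delta:+.2f}%")
```

Both sets of counts sum to the reported 5397 lines, so the drop comes entirely from 50 lines moving from hits to misses and partials, not from new code.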

| Flag | Coverage Δ |
|---|---|
| integration | `76.24% <ø> (ø)` |


y27choi · 2024-07-03T14:53:47Z

docs/metrics/difficulty-score.md

+
+# Difficulty Score
+
+Difficulty scores are automatically computed within Kolena to surface datapoints that commonly contribute to poor


I would link "Kolena" and "datapoints" so that anyone landing on this page without context can navigate to the right places.

y27choi · 2024-07-03T16:54:48Z

docs/metrics/difficulty-score.md

+    With a filter for `datapoint.difficulty_score > 0.9`, we see all the datapoints that significantly struggle
+    across both the `old` and `new` models, which are common failures that persist over different model iterations.
+
+## Implementation Details


Before we jump into the implementation details, should we summarize the logic here? Also, can this be expressed as a formula? One formula for the model-level difficulty score and one for the aggregate difficulty score.

y27choi · 2024-07-03T16:56:46Z

docs/metrics/difficulty-score.md

+
+#### Multiclass Classification
+
+The `delta` column for a datapoint of a multiclass classification task is simply the number of times a model


What does "number of times" mean here? Isn't it just a correct vs. incorrect classification per datapoint?

y27choi · 2024-07-03T16:57:47Z

docs/metrics/index.md

+
+    ---
+
+    Difficulty scores indicate which datapoints models commonly struggle on based on custom Quality Standards.


struggle on based on -> struggle based on

y27choi

I made some final edits. Overall this is a very well-written doc, and it should demystify how we compute the "difficulty score" in our app. Once we get a thumbs up from @mkaramlou, I will merge it.

mkaramlou

Changes look good!

difficulty score docs without images

3137ad0

marko-polo-cheno requested a review from a team as a code owner June 29, 2024 08:11

marko-polo-cheno requested a review from mkaramlou July 2, 2024 17:22

y27choi reviewed Jul 3, 2024


marko-polo-cheno added 2 commits July 3, 2024 11:29

use terminology seen in studio and added formulas

2d03dd2

nit

6b255f1

marko-polo-cheno requested a review from y27choi July 3, 2024 18:33

y27choi added 2 commits July 4, 2024 09:47

Merge remote-tracking branch 'origin/trunk' into difficulty-score-docs

003c4bc

Last edits on the difficulty score page

01b5198

y27choi approved these changes Jul 4, 2024


small updates to the doc

737400d

mkaramlou approved these changes Jul 4, 2024


y27choi merged commit f20c065 into trunk Jul 6, 2024
33 checks passed

y27choi deleted the difficulty-score-docs branch July 6, 2024 20:05
