Adding arc_easy, arc_challenge, mutual, mutual_plus #1206

yzpang · 2020-10-20T19:50:38Z

After some discussion, we decided to implement different versions of a task (e.g., ARC) as separate tasks. All the tasks in this PR (arc_easy, arc_challenge, mutual, mutual_plus) use the multiple-choice format.

To make things easier, I used Colab notebooks to experiment. Here are the scripts: arc_easy, arc_challenge, mutual, mutual_plus.

Accuracies with RoBERTa-base:

arc_easy: around 0.553
arc_challenge: around 0.355 (discussed in meeting; this is low but expected, given that the task is difficult and we didn't use information retrieval models, which the top-performing models on the leaderboard use)
mutual: around 0.709
mutual_plus: around 0.643

Papers

ARC caveats

The scripts are a bit complicated because, in the dataset, the choices are sometimes labeled "A" to "D" and sometimes "1" to "4". Some examples also have five choices instead of four.
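
For reference, the label handling described above could look roughly like the sketch below. This is not the code in this PR; `normalize_label` and `normalize_example` are hypothetical helper names, and the sketch assumes each example provides a list of choice labels plus an answer key in the same labeling scheme.

```python
import string


def normalize_label(label: str) -> int:
    """Map an answer key such as "A".."E" or "1".."5" to a zero-based index.

    Hypothetical helper for illustration; not taken from the PR.
    """
    label = label.strip()
    # Numeric scheme: "1" -> 0, "2" -> 1, ...
    if label.isdecimal() and int(label) >= 1:
        return int(label) - 1
    # Letter scheme: "A" -> 0, "B" -> 1, ...
    if len(label) == 1 and label.upper() in string.ascii_uppercase:
        return string.ascii_uppercase.index(label.upper())
    raise ValueError(f"Unrecognized answer key: {label!r}")


def normalize_example(choice_labels, answer_key):
    """Return (number of choices, position of the gold answer among the choices)."""
    indices = [normalize_label(lbl) for lbl in choice_labels]
    gold = normalize_label(answer_key)
    if gold not in indices:
        raise ValueError(f"Answer key {answer_key!r} is not among the choices")
    # The number of choices is not fixed: some examples have five, not four.
    return len(indices), indices.index(gold)


if __name__ == "__main__":
    print(normalize_example(["A", "B", "C", "D"], "C"))
    print(normalize_example(["1", "2", "3", "4", "5"], "5"))
```

Mapping both schemes to a zero-based index lets the rest of a multiple-choice pipeline ignore how the original example was labeled, and returning the number of choices makes the four-versus-five case explicit.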

Please let me know if you need me to edit anything / if you need more info.

Updating my own forked jiant v1 to v2

Updating jiant

codecov · 2020-10-20T23:12:17Z

Codecov Report

Merging #1206 into master will increase coverage by 0.21%.
The diff coverage is 69.87%.

@@            Coverage Diff             @@
##           master    #1206      +/-   ##
==========================================
+ Coverage   56.91%   57.12%   +0.21%     
==========================================
  Files         133      137       +4     
  Lines        9696     9862     +166     
==========================================
+ Hits         5518     5634     +116     
- Misses       4178     4228      +50

Impacted Files	Coverage Δ
...cripts/download_data/datasets/hf_datasets_tasks.py	`100.00% <ø> (ø)`
jiant/tasks/constants.py	`100.00% <ø> (ø)`
jiant/tasks/evaluate/core.py	`37.15% <ø> (ø)`
jiant/tasks/lib/arc_challenge.py	`62.22% <62.22%> (ø)`
jiant/tasks/lib/arc_easy.py	`62.22% <62.22%> (ø)`
jiant/tasks/lib/mutual.py	`77.77% <77.77%> (ø)`
jiant/tasks/lib/mutual_plus.py	`77.77% <77.77%> (ø)`
jiant/tasks/retrieval.py	`100.00% <100.00%> (ø)`
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0b3dff5...67f4fe9.

zphang

Merge from master and we should be good to go.

yzpang and others added 8 commits October 11, 2020 23:43

Merge pull request #1 from nyu-mll/master

9a73d25

Updating my own forked jiant v1 to v2

adding arc and mutual

d318fa6

adding arc and mutual

4e530d7

Merge pull request #2 from nyu-mll/master

6354ede

Updating jiant

adding arc and mutual

9c2abbf

removing extra files

1d751b5

removing extra files

4d91fde

removing extra files

e3c897a

yzpang requested review from HaokunLiu, jeswan and zphang as code owners October 20, 2020 19:50

fixing black style

2c47973

zphang approved these changes Oct 21, 2020

Merge branch 'master' into master

67f4fe9

yzpang merged commit da7550d into nyu-mll:master Oct 21, 2020

leo-liuzy pushed a commit to leo-liuzy/dynamic_jiant that referenced this pull request Nov 11, 2020

Adding arc_easy, arc_challenge, mutual, mutual_plus (nyu-mll#1206)

a33aa2b
