Adding arc_easy, arc_challenge, mutual, mutual_plus #1206

Merged
merged 10 commits into from
Oct 21, 2020

yzpang (Member) commented Oct 20, 2020

After some discussion, we decided to implement the different versions of a task (e.g., ARC) as separate tasks. All the tasks in this PR (arc_easy, arc_challenge, mutual, mutual_plus) use the multiple-choice format.

To make things easier, I experimented in Colab notebooks. Here are the scripts: arc_easy, arc_challenge, mutual, mutual_plus.

Accuracies with RoBERTa-base:

  • arc_easy: around 0.553
  • arc_challenge: around 0.355 (discussed in the meeting; this is low but expected, given that the task is difficult and we didn't use the information retrieval models that the top-performing models on the leaderboard use)
  • mutual: around 0.709
  • mutual_plus: around 0.643

Papers

ARC caveats

  • The scripts are a bit complicated because the dataset labels answer choices inconsistently: some examples use "A" to "D" and others use "1" to "4". Some examples also have five choices instead of four.
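
One way to handle the label inconsistency described above is to map each answer key to a zero-based index into that example's own list of choice labels. The following is a minimal sketch under stated assumptions, not the code from this PR: the helper name `normalize_arc_label` and its argument layout are hypothetical and chosen only for illustration.

```python
from typing import List


def normalize_arc_label(answer_key: str, choice_labels: List[str]) -> int:
    """Map an ARC answer key to a zero-based index into the choices.

    Hypothetical helper (not from this PR). It covers both labeling
    schemes mentioned above, letters ("A".."E") and digits ("1".."5"),
    by looking the key up in the example's own label list.
    """
    key = answer_key.strip()
    labels = [label.strip() for label in choice_labels]
    if key not in labels:
        raise ValueError(f"answer key {key!r} not among choice labels {labels}")
    return labels.index(key)


# Letter-labeled question with four choices.
print(normalize_arc_label("C", ["A", "B", "C", "D"]))  # 2
# Digit-labeled question with four choices.
print(normalize_arc_label("1", ["1", "2", "3", "4"]))  # 0
# Five-choice question: the index range simply grows to 0..4.
print(normalize_arc_label("E", ["A", "B", "C", "D", "E"]))  # 4
```

Looking the key up in the per-example label list avoids hard-coding either scheme. How a model with a fixed number of choices should treat five-choice examples is a separate decision that this sketch does not address.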

Please let me know if you need me to edit anything / if you need more info.

codecov bot commented Oct 20, 2020

Codecov Report

Merging #1206 into master will increase coverage by 0.21%.
The diff coverage is 69.87%.

@@            Coverage Diff             @@
##           master    #1206      +/-   ##
==========================================
+ Coverage   56.91%   57.12%   +0.21%     
==========================================
  Files         133      137       +4     
  Lines        9696     9862     +166     
==========================================
+ Hits         5518     5634     +116     
- Misses       4178     4228      +50     

| Impacted Files | Coverage Δ |
| --- | --- |
| ...cripts/download_data/datasets/hf_datasets_tasks.py | 100.00% <ø> (ø) |
| jiant/tasks/constants.py | 100.00% <ø> (ø) |
| jiant/tasks/evaluate/core.py | 37.15% <ø> (ø) |
| jiant/tasks/lib/arc_challenge.py | 62.22% <62.22%> (ø) |
| jiant/tasks/lib/arc_easy.py | 62.22% <62.22%> (ø) |
| jiant/tasks/lib/mutual.py | 77.77% <77.77%> (ø) |
| jiant/tasks/lib/mutual_plus.py | 77.77% <77.77%> (ø) |
| jiant/tasks/retrieval.py | 100.00% <100.00%> (ø) |
| ... and 2 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0b3dff5...67f4fe9.

zphang (Collaborator) left a comment

Merge from master and we should be good to go.

yzpang merged commit da7550d into nyu-mll:master on Oct 21, 2020
leo-liuzy pushed a commit to leo-liuzy/dynamic_jiant that referenced this pull request Nov 11, 2020