[ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction

License

Notifications You must be signed in to change notification settings

DFKI-NLP/REval

Latest commit: 4806ad5 · Christoph Alt · Apr 21, 2020

History

1 Commit
Apr 21, 2020
Apr 21, 2020
Apr 21, 2020
Apr 21, 2020
Apr 21, 2020
Apr 21, 2020
Apr 21, 2020

Repository files navigation

REval

🎓 Introduction

REval is a simple framework for probing sentence-level representations of Relation Extraction models.

✅ Requirements

REval is tested with:

  • Python 3.7

🚀 Installation

With pip

<TBD>

From source

git clone https://github.com/DFKI-NLP/REval
cd REval
pip install -r requirements.txt

🔬 Probing

Supported Datasets

  • SemEval 2010 Task 8 (CoreNLP annotated version) [LINK]
  • TACRED (obtained via LDC) [LINK]

Probing Tasks

| Task | SemEval 2010 | TACRED |
|------|--------------|--------|
| ArgTypeHead | ✔️ | ✔️ |
| ArgTypeTail | ✔️ | ✔️ |
| Length | ✔️ | ✔️ |
| EntityDistance | ✔️ | ✔️ |
| ArgumentOrder | ✔️ | |
| EntityExistsBetweenHeadTail | ✔️ | ✔️ |
| PosTagHeadLeft | ✔️ | ✔️ |
| PosTagHeadRight | ✔️ | ✔️ |
| PosTagTailLeft | ✔️ | ✔️ |
| PosTagTailRight | ✔️ | ✔️ |
| TreeDepth | ✔️ | ✔️ |
| SDPTreeDepth | ✔️ | ✔️ |
| ArgumentHeadGrammaticalRole | ✔️ | ✔️ |
| ArgumentTailGrammaticalRole | ✔️ | ✔️ |
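
To make the task names more concrete, the sketch below derives labels for two of them (ArgumentOrder and EntityDistance) from a tokenized sentence with head and tail spans. It is illustrative only: the function names, the `(start, end)` span convention with an exclusive end, the example sentence, and the label definitions are assumptions, not REval's implementation, which may bin or encode these values differently.

```python
from typing import List, Tuple

# Hypothetical convention: (start, end) token offsets, end exclusive.
Span = Tuple[int, int]


def argument_order(head: Span, tail: Span) -> int:
    """Return 1 if the head argument starts before the tail argument, else 0."""
    return int(head[0] < tail[0])


def entity_distance(head: Span, tail: Span) -> int:
    """Count the tokens strictly between the two spans (0 if adjacent or overlapping)."""
    first, second = sorted([head, tail])
    return max(0, second[0] - first[1])


tokens: List[str] = "The burst has been caused by water hammer pressure .".split()
head, tail = (1, 2), (6, 9)  # "burst" and "water hammer pressure"

print(tokens[head[0]:head[1]], tokens[tail[0]:tail[1]])
print("ArgumentOrder:", argument_order(head, tail))
print("EntityDistance:", entity_distance(head, tail))
```

A probing classifier would then be trained to predict such labels from the frozen sentence representation of the relation extraction model.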

🔧 Usage

Step 1: Create the probing task datasets from the original datasets.

SemEval 2010 Task 8

python reval.py generate-all-from-semeval \
    --train-file <SEMEVAL DIR>/train.json \
    --validation-file <SEMEVAL DIR>/dev.json \
    --test-file <SEMEVAL DIR>/test.json \
    --output-dir ./data/semeval/

TACRED

python reval.py generate-all-from-tacred \
    --train-file <TACRED DIR>/train.json \
    --validation-file <TACRED DIR>/dev.json \
    --test-file <TACRED DIR>/test.json \
    --output-dir ./data/tacred/
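
When both datasets are generated from a script, the two commands above can be assembled by one small helper. This is a minimal sketch: the helper name `build_generate_command` and the placeholder directory `/path/to/semeval` are assumptions; only the subcommand names and flags are taken from the commands above.

```python
import shlex
from typing import List


def build_generate_command(dataset: str, data_dir: str, output_dir: str) -> List[str]:
    """Assemble the argv list for `reval.py generate-all-from-<dataset>`."""
    if dataset not in {"semeval", "tacred"}:
        raise ValueError(f"unsupported dataset: {dataset!r}")
    src = data_dir.rstrip("/")
    return [
        "python", "reval.py", f"generate-all-from-{dataset}",
        "--train-file", f"{src}/train.json",
        "--validation-file", f"{src}/dev.json",
        "--test-file", f"{src}/test.json",
        "--output-dir", output_dir,
    ]


# Placeholder input directory; replace it with the real dataset location.
cmd = build_generate_command("semeval", "/path/to/semeval", "./data/semeval/")
print(" ".join(shlex.quote(arg) for arg in cmd))
# To execute from the repository root: subprocess.run(cmd, check=True)
```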

Step 2: Run the probing tasks on a model.

For example, download a Relation Extraction model trained with RelEx, such as the CNN trained on SemEval, and run the evaluation script on it.

mkdir -p models/cnn_semeval
wget --content-disposition https://cloud.dfki.de/owncloud/index.php/s/F3gf9xkeb2foTFe/download -P models/cnn_semeval
python probing_task_evaluation.py \
    --model-dir ./models/cnn_semeval/ \
    --data-dir ./data/semeval/ \
    --dataset semeval2010 \
    --cuda-device 0 \
    --batch-size 64 \
    --cache-representations

After the run completes, the results are written to probing_task_results.json in the model directory (the path passed as --model-dir).

{
    "ArgTypeHead": {
        "acc": 75.82,
        "devacc": 78.96,
        "ndev": 670,
        "ntest": 2283
    },
    "ArgTypeTail": {
        "acc": 75.4,
        "devacc": 78.79,
        "ndev": 627,
        "ntest": 2130
    },
    [...]
}
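
The results file can be summarized with a few lines of Python. The sketch below is hedged: the helper name `summarize_results` is hypothetical, and it assumes that `acc` is the test accuracy, `devacc` the validation accuracy, and `ndev`/`ntest` the respective example counts, which the key names suggest but the text above does not state. The values are the two entries from the excerpt.

```python
import json
from typing import Dict


def summarize_results(results: Dict[str, Dict[str, float]]) -> str:
    """Format one line per probing task, sorted by `acc` (highest first)."""
    rows = sorted(results.items(), key=lambda item: item[1]["acc"], reverse=True)
    return "\n".join(
        f"{task:<30} acc {m['acc']:6.2f} (n={m['ntest']})  devacc {m['devacc']:6.2f} (n={m['ndev']})"
        for task, m in rows
    )


# Values copied from the excerpt above. For a real run, load the file instead:
# with open("models/cnn_semeval/probing_task_results.json") as f:
#     results = json.load(f)
results = json.loads("""
{
    "ArgTypeHead": {"acc": 75.82, "devacc": 78.96, "ndev": 670, "ntest": 2283},
    "ArgTypeTail": {"acc": 75.4, "devacc": 78.79, "ndev": 627, "ntest": 2130}
}
""")
print(summarize_results(results))
```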

📚 Citation

If you use REval, please consider citing the following paper:

@inproceedings{alt-etal-2020-probing,
    title={Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction},
    author={Christoph Alt and Aleksandra Gabryszak and Leonhard Hennig},
    year={2020},
    booktitle={Proceedings of ACL},
    url={https://arxiv.org/abs/2004.08134}
}

📘 License

REval is released under the terms of the MIT License.
