Deep Argument Analysis (deepa2)

This project provides deepa2, which

  • πŸ₯š takes NLP data (e.g. NLI, argument mining) as ingredients;
  • πŸŽ‚ bakes DeepA2 datasets conforming to the Deep Argument Analysis Framework;
  • 🍰 serves DeepA2 data as text2text datasets suitable for training language models.

There's a public collection of πŸŽ‚ DeepA2 datasets baked with deepa2 on the HF hub.

The Documentation describes usage options and gives background info on the Deep Argument Analysis Framework.

Quickstart

Integrating deepa2 into Your Training Pipeline

  1. Install deepa2 into your ML project's virtual environment, e.g.:
source my-projects-venv/bin/activate 
python --version  # should be ^3.7
python -m pip install deepa2
  2. Add the deepa2 preprocessor to your training pipeline. Your training script might, for example, look like this:
#!/bin/bash

# configure and activate environment
...

# download deepa2 datasets (<<< πŸŽ‚) and
# prepare them for text2text training (>>> 🍰)
deepa2 serve \
    --path some-deepa2-dataset \
    --export_format csv \
    --export_path t2t

# run default training script,
# e.g., with πŸ€— Transformers,
# on the exported 🍰 files
python .../run_summarization.py \
    --train_file t2t/train.csv \
    --text_column "text" \
    --summary_column "target" \
    --...

# clean-up
rm -r t2t
  3. That's it. (Optionally, sanity-check the exported 🍰 files as sketched below.)
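
Before launching training, it doesn't hurt to take a quick look at the exported 🍰 files. A minimal sketch, assuming the serve step above has written t2t/train.csv with the "text" and "target" columns passed to run_summarization.py:

# optional sanity check of the exported text2text data
head -n 2 t2t/train.csv   # header plus first example
wc -l t2t/train.csv       # rough count of training examples (plus header line)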

Create DeepA2 datasets with deepa2 from existing NLP data

Install poetry.

Clone the repository:

git clone https://github.com/debatelab/deepa2-datasets.git

Install this package from within the repo's root folder:

poetry install

Bake a DeepA2 dataset, e.g.:

# πŸ₯š --name selects the source dataset,
# πŸŽ‚ --export-path receives the baked DeepA2 data
poetry run deepa2 bake \
  --name esnli \
  --debug-size 100 \
  --export-path ./data/processed
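
The baked πŸŽ‚ files end up under the export path and can then be served as text2text data, as in the Quickstart above. A sketch under two assumptions: that the builder writes a dataset-specific sub-directory (here ./data/processed/esnli) and that --path accepts a local directory as well as a hub ID; check the documentation for the actual layout:

# inspect the baked output
ls ./data/processed

# serve the freshly baked dataset for text2text training
poetry run deepa2 serve \
  --path ./data/processed/esnli \
  --export_format csv \
  --export_path t2t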

Contribute a DeepA2Builder for another Dataset

We welcome contributions to this repository, especially scripts that port existing datasets to the DeepA2 Framework. Within this repo, a code module that transforms data into the DeepA2 format contains

  1. a Builder class that describes how DeepA2 examples are constructed and that implements the abstract builder.Builder interface (e.g., builder.entailmentbank_builder.EnBankBuilder);
  2. a DataLoader that provides a method for loading the raw data as a πŸ€— Dataset object (e.g., builder.entailmentbank_builder.EnBankLoader) -- you may use deepa2.DataLoader as is if the data is already available in a format compatible with πŸ€— Dataset;
  3. dataclasses that describe the features of the raw data and the preprocessed data and that extend the dummy classes deepa2.RawExample and deepa2.PreprocessedExample;
  4. a collection of unit tests that check the concrete Builder's methods (e.g., tests/test_enbank.py; see the sketch after this list);
  5. documentation of the pipeline (as, for example, in docs/esnli.md).
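
Once these pieces are in place, the new builder's tests can be run from the repo's root folder. A minimal sketch, assuming the suite runs with pytest via poetry and using the hypothetical file name tests/test_mydataset.py for the new test module:

# run only the unit tests for the new builder
poetry run pytest -v tests/test_mydataset.py

# run the full test suite before opening a pull request
poetry run pytest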

Consider opening a new issue to propose building such a pipeline collaboratively.

Citation

This repository builds on and extends the DeepA2 Framework originally presented in:

@article{betz2021deepa2,
      title={DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models}, 
      author={Gregor Betz and Kyle Richardson},
      year={2021},
      eprint={2110.01509},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}