
A project for classifying tweet emotions (joy, sadness, fear, anger) using Fully Connected Neural Networks (FCNN), Recurrent Neural Networks (RNN), and fine-tuned Transformer models (BERT). Includes data preprocessing, model training, and performance comparison.


NLP-Emotion-Classification

Project Overview

This repository contains the implementation of a text classification project for identifying emotions (e.g., joy, sadness, fear, anger) in tweet texts. The project explores three deep learning approaches:

  1. Fully Connected Neural Network (FCNN)
  2. Recurrent Neural Network (RNN) with LSTM/GRU
  3. Fine-tuned Transformer Model (e.g., BERT from HuggingFace)

The dataset consists of labeled tweet texts split into train.txt, test.txt, and validation.txt. The goal is to compare the performance of these models and analyze their effectiveness for emotion classification.

This project was developed as part of an NLP Exam by Alessandro Mencarelli, Mauricio Rodriguez, and Mario Zuna.


Repository Structure

NLP-Emotion-Classification/
│
├── data/
│   ├── train.txt
│   ├── test.txt
│   └── validation.txt
│
├── notebooks/
│   └── Emotion_Classification_Notebook.ipynb
│
├── models/
│   ├── fcnn_model.h5
│   ├── rnn_model.h5
│   └── transformer_model/
│
├── README.md
└── requirements.txt

Notebook

The main notebook, Emotion_Classification_Notebook.ipynb, contains:

  • Data Preprocessing: Tokenization, padding, and encoding of tweet texts.
  • Model Implementation:
    • Fully Connected Neural Network (FCNN)
    • Recurrent Neural Network (RNN) with LSTM/GRU
    • Fine-tuned Transformer Model (BERT)
  • Results Comparison: Analysis of model performance and insights.
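
The preprocessing step above can be sketched as follows. This is a minimal pure-Python illustration of tokenization, integer encoding, and zero-padding; the notebook itself most likely uses Keras' `Tokenizer` and `pad_sequences`, and the sample tweets and vocabulary here are hypothetical:

```python
# Minimal sketch of tokenization, integer encoding, and padding.
# The notebook likely uses keras.preprocessing utilities instead;
# this shows the idea with the standard library only.

def build_vocab(texts):
    """Map each unique word to an integer id (0 is reserved for padding)."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab

def encode_and_pad(texts, vocab, max_len):
    """Encode texts as id sequences, truncated or zero-padded to max_len."""
    encoded = []
    for text in texts:
        ids = [vocab.get(w, 0) for w in text.lower().split()][:max_len]
        encoded.append(ids + [0] * (max_len - len(ids)))
    return encoded

tweets = ["i feel so happy today", "i am scared and sad"]  # hypothetical samples
vocab = build_vocab(tweets)
padded = encode_and_pad(tweets, vocab, max_len=6)
```

Every sequence comes out the same length (here 6), which is what the FCNN and RNN layers require as input.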

Requirements

To run the notebook, install the required dependencies:

pip install -r requirements.txt

How to Use

  1. Clone the repository:

    git clone https://github.com/menca-lsn/NLP-Emotion-Classification.git
    cd NLP-Emotion-Classification
  2. Install dependencies:

    pip install -r requirements.txt
  3. Open the notebook:

    jupyter notebook notebooks/Emotion_Classification_Notebook.ipynb
  4. Follow the instructions in the notebook to preprocess the data, train the models, and evaluate their performance.


Dataset

The dataset consists of tweet texts labeled with emotions (e.g., joy, sadness, fear, anger). It is split into:

  • Train set: Used for training the models.
  • Validation set: Used during training to tune hyperparameters and detect overfitting.
  • Test set: Used for the final, held-out evaluation of each model.
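
Assuming each line of the `.txt` files stores a tweet and its label separated by a semicolon (a common convention for this kind of emotion dataset; verify against your copy of `data/train.txt`), loading a split might look like:

```python
# Hypothetical loader, assuming one "text;label" pair per line.
# The exact file format is an assumption -- check data/train.txt.

def load_split(lines):
    """Split each 'text;label' line into parallel lists of texts and labels."""
    texts, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        text, _, label = line.rpartition(";")  # label follows the last ';'
        texts.append(text)
        labels.append(label)
    return texts, labels

sample = ["i didnt feel humiliated;sadness", "im grabbing a minute to post;joy"]
texts, labels = load_split(sample)
```

In the notebook you would call this with `open("data/train.txt")` (and likewise for the other splits) instead of the in-memory sample.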

Models

  1. Fully Connected Neural Network (FCNN):

    • A simple feedforward neural network for baseline performance.
  2. Recurrent Neural Network (RNN):

    • Uses LSTM/GRU layers to capture sequential dependencies in text data.
  3. Transformer Model (BERT):

    • Fine-tunes a pretrained BERT model from HuggingFace for state-of-the-art performance.

Results

  • FCNN: Achieved 0.82 accuracy.
  • RNN: Achieved 0.89 accuracy.
  • Transformer (BERT): Achieved 0.92 accuracy, leveraging pretrained language representations for the best performance.

Happy coding! 🚀
