Label Unbalance in High Frequency Trading

Report: Label Unbalance in High-frequency Trading

Method

This project studies high-frequency trading prediction under label imbalance using machine learning methods. We provide four backbone networks: MLP, LSTM, BERT, and Mamba. To reduce the effect of label imbalance, we use two simple techniques: resampling and class weighting.

Dataset

Due to copyright restrictions, we do not provide the original data. We do, however, provide the data structure and class distribution used in our experiments. To run the code on your own dataset, replace the dataset.py file with your own data and change the necessary parameters in train_classification.py.

How to Run

To train a model, use the following command:

python train_classification.py --input_dim <input dimension> --output_dim <number of classes> --model <one of 'mlp', 'lstm', 'bert', 'mamba'>

To change the model size or other training-related parameters, see the get_args function.

We provide the following methods for addressing label imbalance:

--class_weight: Specifies the weight of each class in the loss function.
--data_balance: Randomly removes 7/8 of the samples in class '0' to balance the dataset.
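
The undersampling behind --data_balance can be sketched roughly as follows. This is an illustrative version only, not the code in this repository: the function name, its arguments, and the fixed seed are all hypothetical.

```python
import random


def undersample_class(samples, labels, target_class=0, keep_fraction=1 / 8, seed=0):
    """Randomly keep `keep_fraction` of `target_class`; keep all other classes."""
    rng = random.Random(seed)  # local RNG so the result is reproducible
    target_idx = [i for i, y in enumerate(labels) if y == target_class]
    n_keep = round(len(target_idx) * keep_fraction)
    kept = set(rng.sample(target_idx, n_keep))
    # Preserve the original ordering of the surviving samples.
    keep = [i for i, y in enumerate(labels) if y != target_class or i in kept]
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy data: 80 samples of class 0 and 10 each of classes 1 and 2.
toy_labels = [0] * 80 + [1] * 10 + [2] * 10
toy_samples = list(range(len(toy_labels)))
xs, ys = undersample_class(toy_samples, toy_labels)
print({c: ys.count(c) for c in sorted(set(ys))})  # {0: 10, 1: 10, 2: 10}
```

For time-series data, dropping samples changes the spacing between consecutive observations, so a sketch like this is best applied after input windows have been built rather than to the raw sequence.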

Reference

@misc{zhao2025labelunbalancehighfrequencytrading,
      title={Label Unbalance in High-frequency Trading}, 
      author={Zijian Zhao and Xuming Zhang and Jiayu Wen and Mingwen Liu and Xiaoteng Ma},
      year={2025},
      eprint={2503.09988},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.09988}, 
}
