End-to-end Speaker Verification (w/ Spoof Aware) - PyTorch Implementation

A PyTorch implementation of speaker verification built on deep learning architectures and language models, with support for speaker recognition on the Vietnamese dataset by Dean Nguyen.

Architecture

ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification (Desplanques et al., 2020)
Analysis of Score Normalization in Multilingual Speaker Recognition (Matejka et al., 2017)
Speaker Recognition from raw waveform with SincNet (Ravanelli & Bengio, 2018)
Generalized End-to-End (GE2E) Loss for Speaker Verification (Wan et al., 2018)
Additive Angular Margin (AAM) Softmax Loss for Speaker Verification (Deng et al., 2018)
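
To make the AAM-Softmax entry above more concrete, here is a minimal, dependency-free sketch of the idea: cosine logits are scaled, and the true speaker's logit is computed from cos(theta + margin) rather than cos(theta). The function names, the margin of 0.2, and the scale of 30 are illustrative assumptions, not values taken from this repository. The sketch also omits the handling for theta + margin exceeding pi that practical implementations usually include.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def aam_softmax_loss(embedding, class_weights, label, margin=0.2, scale=30.0):
    """Cross-entropy over cosine logits, with an angular margin on the true class.

    All names and default values are hypothetical, chosen for illustration only.
    """
    cosines = [cosine(embedding, w) for w in class_weights]
    # Clamp before acos to guard against rounding slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, cosines[label])))
    logits = [scale * c for c in cosines]
    # The target logit uses cos(theta + margin) instead of cos(theta).
    logits[label] = scale * math.cos(theta + margin)
    # Numerically stable log-sum-exp for the softmax denominator.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[label]


if __name__ == "__main__":
    emb = [1.0, 0.5]
    weights = [[1.0, 0.0], [0.0, 1.0]]  # one weight vector per speaker
    print(aam_softmax_loss(emb, weights, label=0, margin=0.0))
    print(aam_softmax_loss(emb, weights, label=0, margin=0.2))
```

With margin set to 0 this reduces to ordinary softmax cross-entropy on scaled cosines; a positive margin raises the loss for the same embedding, which pushes training toward tighter speaker clusters.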

Augmentation

MUSAN: A Music, Speech, and Noise Corpus (Snyder et al., 2015)
Room Impulse Response and Noise Database (Ko et al., 2017)
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition (Park et al., 2019)
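
As a rough illustration of additive-noise augmentation with a corpus such as MUSAN, the sketch below mixes a noise clip into a speech clip at a chosen signal-to-noise ratio. It uses plain Python lists so it runs without extra packages. The function names and the SNR value are assumptions made for the example, and the repository's own augmentation code may work differently.

```python
import math


def mean_power(samples):
    """Average power (mean of squared samples) of a waveform."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise ratio equals snr_db."""
    # Repeat the noise if it is shorter than the speech, then crop to length.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[: len(speech)]
    noise_power = mean_power(noise)
    if noise_power == 0.0:
        raise ValueError("noise clip is silent; cannot scale it to a target SNR")
    # Solve 10 * log10(P_speech / (gain^2 * P_noise)) = snr_db for gain.
    gain = math.sqrt(mean_power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    speech = [math.sin(0.1 * i) for i in range(1000)]   # stand-in for an utterance
    noise = [0.3 * (-1) ** i for i in range(250)]       # stand-in for a noise clip
    noisy = mix_at_snr(speech, noise, snr_db=10.0)
    print(len(noisy))
```

Drawing snr_db at random from a range for each training utterance is a common way to vary the difficulty of the augmented data.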

Setup

conda create --name venv python=3.8.10
conda activate venv
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install -r requirements.txt

Running

Change settings in setting/setting.yaml and run:

python cores/train.py

References

NOTE: This project was developed quite a while ago, so some of the repositories it draws on may not be credited here. Please open an issue so I can add them.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Responsibility

This implementation is provided as-is, without any warranties or guarantees. The authors are not responsible for any misuse or damage caused by this software. Users are responsible for:

Ensuring proper data privacy and security when using this software
Complying with all applicable laws and regulations
Obtaining necessary permissions for any data used
Properly citing and acknowledging the original authors of the referenced papers
Understanding and accepting the limitations of the models and algorithms

The implementation is based on academic research papers and should be used for research purposes only. Commercial use may require additional permissions and compliance with relevant regulations.

Citation

If you use this code in your research, please cite:

@misc{deanng_2025,
    author = {Dean Nguyen},
    title = {End-to-end Speaker Verification - PyTorch Implementation},
    year = {2025},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/ducnt18121997/Viet-SASV}}
}
