[CVPR2025] Official implementation of the paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing"
Our Python version is 3.8.18 and our CUDA version is 11.8. Other compatible versions may also work.
Both training and inference are implemented with PyTorch on a GeForce RTX 4090 GPU.
conda create -n dubbing python=3.8.18
conda activate dubbing
pip install -r requirements.txt
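After installation, a quick check can confirm that the environment matches the versions stated above. The following is a minimal sketch, not part of the repository: the helper name `describe_environment` is hypothetical, and PyTorch is treated as optional so the script also runs before `requirements.txt` has been installed.

```python
import importlib.util
import sys


def describe_environment():
    """Return a dict summarising the Python / PyTorch / CUDA setup."""
    info = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "torch": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    # Only import torch if it is installed, so the check never crashes.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        # CUDA version the PyTorch build targets; None for CPU-only builds.
        info["cuda_build"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` on a machine with a GPU, the installed PyTorch build and the local CUDA driver are probably mismatched.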
python train_first.py -p Configs/config_v2c_stage1.yml # V2C-Animation benchmark
python train_first.py -p Configs/config_grid_stage1.yml # GRID benchmark
python train_second.py -p Configs/config_v2c.yml # V2C-Animation benchmark
python train_second_grid.py -p Configs/config_grid.yml # GRID benchmark
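The two training stages can also be chained from a small driver script. This is a sketch under the assumption that the second stage is started only after the first stage has finished. The script and config names are copied from the commands above, while `training_commands` and `run_training` are hypothetical helpers that do not exist in the repository.

```python
import subprocess

# Commands copied from the training instructions above, grouped by benchmark.
STAGES = {
    "v2c": [
        ["python", "train_first.py", "-p", "Configs/config_v2c_stage1.yml"],
        ["python", "train_second.py", "-p", "Configs/config_v2c.yml"],
    ],
    "grid": [
        ["python", "train_first.py", "-p", "Configs/config_grid_stage1.yml"],
        ["python", "train_second_grid.py", "-p", "Configs/config_grid.yml"],
    ],
}


def training_commands(benchmark):
    """Return the stage-1 and stage-2 commands for a benchmark, in order."""
    if benchmark not in STAGES:
        raise ValueError(f"unknown benchmark {benchmark!r}; expected one of {sorted(STAGES)}")
    return [list(cmd) for cmd in STAGES[benchmark]]


def run_training(benchmark, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in training_commands(benchmark):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a stage exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_training("v2c", dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to verify paths before committing GPU time.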
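We provide pre-trained first-stage and second-stage checkpoints for the V2C-Animation and GRID benchmarks. They are listed in that order: the first two entries are first-stage checkpoints and the last two are second-stage checkpoints.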
- V2C-Animation benchmark: Baidu Drive (b5wy), Google Drive
- GRID benchmark: Baidu Drive (wj25), Google Drive
- V2C-Animation benchmark: Baidu Drive (3k4h), Google Drive
- GRID benchmark: Baidu Drive (23vd), Google Drive
There are three generation settings in the V2C-Animation benchmark:
python inference_v2c.py -n 'YOUR_EXP_NAME' --epoch 'YOUR_EPOCH' --setting 1
python inference_v2c.py -n 'YOUR_EXP_NAME' --epoch 'YOUR_EPOCH' --setting 2
python inference_v2c.py -n 'YOUR_EXP_NAME' --epoch 'YOUR_EPOCH' --setting 3
There are two generation settings in the GRID benchmark:
python inference_grid.py -n 'YOUR_EXP_NAME' --epoch 'YOUR_EPOCH' --setting 1
python inference_grid.py -n 'YOUR_EXP_NAME' --epoch 'YOUR_EPOCH' --setting 2
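To evaluate one checkpoint under every setting, the commands above can be generated rather than typed one by one. The sketch below only builds the command lines. The script names, flags, and setting counts come from the commands above, whereas `inference_commands`, the experiment name `my_exp`, and the epoch `100` are illustrative placeholders.

```python
import shlex

# Number of generation settings per inference script, as listed above.
SETTINGS = {
    "inference_v2c.py": (1, 2, 3),
    "inference_grid.py": (1, 2),
}


def inference_commands(script, exp_name, epoch):
    """Build one inference command per generation setting of the given script."""
    if script not in SETTINGS:
        raise ValueError(f"unknown script {script!r}; expected one of {sorted(SETTINGS)}")
    return [
        ["python", script, "-n", exp_name, "--epoch", str(epoch), "--setting", str(setting)]
        for setting in SETTINGS[script]
    ]


if __name__ == "__main__":
    # Placeholder experiment name and epoch; replace with your own values.
    for script in SETTINGS:
        for cmd in inference_commands(script, "my_exp", 100):
            # shlex.join quotes arguments safely for copy-pasting into a shell.
            print(shlex.join(cmd))
```

Each returned list can be passed directly to `subprocess.run(cmd, check=True)` if the commands should be executed instead of printed.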
- GRID (BaiduDrive (code: GRID) / GoogleDrive)
- V2C-Animation dataset (chenqi-Denoise2) (BaiduDrive (code: k9mb) / GoogleDrive)
We would like to thank the authors of previous related projects for generously sharing their code and insights: StyleTTS, StyleTTS2, StyleDubber, PL-BERT, and HiFi-GAN.
If you find our work useful, please consider citing:
@misc{zhang2025produbber,
title={Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing},
author={Zhedong Zhang and Liang Li and Chenggang Yan and Chunshan Liu and Anton van den Hengel and Yuankai Qi},
year={2025},
eprint={2503.12042},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2503.12042},
}