Out-of-memory during fine-tuning #59
Repo author mentioned in issue #48 that a
Please check the colab demo: https://github.com/yl4579/StyleTTS2/blob/main/Colab/StyleTTS2_Finetune_Demo.ipynb. You can finetune with only a batch size of 2, but try not to reduce
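For context, the batch size the demo refers to is the batch_size field in the fine-tuning config. A minimal sketch of applying only that change programmatically (the file path and workflow here are assumed from the repo layout, not taken from the demo):

```python
# Minimal sketch, not from the Colab demo: lower only batch_size in the
# fine-tuning config and leave the remaining fields untouched.
import yaml

config_path = "Configs/config_ft.yml"  # path assumed from the repo layout
with open(config_path) as f:
    cfg = yaml.safe_load(f)

cfg["batch_size"] = 2  # the value the demo reportedly fine-tunes with

with open(config_path, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
```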
@yl4579 Thank you for contributing such great work. BTW, why does StyleTTS-2 require so much GPU memory? I tried to finetune it on an A800 (80GB) and the only change I made was to set the batch size to 4, which requires nearly 68GB by epoch 15. At the beginning of fine-tuning, it seems to only update Is this normal?
@cnlinxi It doesn't update the parameters of WavLM, but it does use its gradient to train the generator. This is unfortunately one of the limitations of using large speech language models; future work can probably resolve it. You can also skip the joint training part, but that will significantly worsen the quality, as we discussed earlier in this thread.
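To illustrate this point about memory, here is an editorial sketch with toy modules (not the repo's actual training loop): freezing the SLM's parameters does not avoid storing its activations, because the adversarial loss is still backpropagated through the frozen SLM into the generator.

```python
import torch
import torch.nn as nn

# Toy stand-ins: a "generator" and a frozen SLM-style discriminator.
class TinySLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(80, 512), nn.ReLU(), nn.Linear(512, 1))
    def forward(self, x):
        return self.net(x)

generator = nn.Linear(128, 80)        # stand-in for the TTS generator
slm = TinySLM()
for p in slm.parameters():
    p.requires_grad = False           # frozen: its weights never update

z = torch.randn(4, 128)
fake_mel = generator(z)
adv_loss = -slm(fake_mel).mean()      # loss is computed through the frozen SLM
adv_loss.backward()                   # SLM activations must be kept for this backward pass

print(any(p.grad is not None for p in slm.parameters()))  # False: SLM is not updated
print(generator.weight.grad is not None)                  # True: generator still gets gradients
```

With a real WavLM-sized model, those stored activations are what drive the VRAM usage up even though its weights are frozen.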
I tried using the bitsandbytes 8-bit optimizer, but I may have done something wrong (I don't know much about it); I ended up with the same VRAM usage and slower speed.
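For reference, the usual way to swap in the bitsandbytes 8-bit optimizer is roughly the following (a sketch, not the repo's optimizer setup). Note that it only shrinks optimizer state, not the activation memory that dominates the joint-training stage, which would be consistent with seeing no VRAM savings.

```python
import torch.nn as nn
import bitsandbytes as bnb

# Sketch only: replace torch.optim.AdamW with its 8-bit counterpart.
# This reduces optimizer-state memory (exp_avg / exp_avg_sq), not activations.
model = nn.Linear(512, 512)
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4, betas=(0.9, 0.99))
```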
I'm trying to run fine-tuning on a dataset of roughly 1.5 hours of audio with an average clip length of ~7.5 seconds. My hardware is four GeForce RTX 3090 GPUs with 24 GB of VRAM each. Unfortunately, training always crashes with an out-of-memory (OOM) error after the first couple of steps. I've checked both the README and the discussion in issue #10, but none of the suggested changes seem to help. Even values such as
max_len: 50
batch_percentage: 0.125
do not work. I'm running the fine-tuning command suggested in the README, where config_ft.yml is the same as before but updated with the above values and my own dataset. Any other suggestions on what I could do to make the training run without hitting OOM?
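One way to narrow this down (a hypothetical diagnostic, not part of the repo) is to log peak VRAM per step and check whether the OOM coincides with the point where gradients start flowing through WavLM:

```python
import torch

def log_peak_memory(step, device=None):
    """Print peak allocated VRAM since the last reset, then reset the counter."""
    peak_gb = torch.cuda.max_memory_allocated(device) / 1024**3
    print(f"step {step}: peak {peak_gb:.2f} GB allocated")
    torch.cuda.reset_peak_memory_stats(device)

# Hypothetical usage inside the training loop:
# for step, batch in enumerate(loader):
#     ...forward / backward / optimizer step...
#     log_peak_memory(step)
```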