Training using SDP (and with DP by ratio?) #79
Comments
Can you post the results of (1) training from scratch without SDP and then continuing training with SDP, and (2) training with both SDP & DP by ratio?
It’s still training at the moment. I will update as soon as there are some notable results.
Hi isdanni, have you tried WavLMDiscriminator as in Bert-VITS2? WavLM is pre-trained on English, and I wonder whether it can be used effectively for other languages.
No, I haven't. I'm training on EN only, so I don't have much experience with a multilingual setup. If you've tried it, please post here, would love to hear the result :)
Well, I tried it on my language and it produced very poor wavs. Now I'm trying a pre-trained phoneme as an input phoneme encoder.
Good to know! Tbh I'm not sure if WavLMDiscriminator would provide any notable improvements on top of the existing discriminator. I'll update here if I get a chance to try it.
This is a follow-up to the previous discussion threads on the stochastic duration predictor (SDP) in #11 and #68 (comment), with reference to Bert-VITS2.
Regarding training with SDP, I have some feedback:
A few months ago, my experiments with use_sdp at earlier steps (100K ~ 500K) gave below-average results compared to models trained without use_sdp: the audio did not sound natural and certain pronunciations were not clear. I now plan to continue training from a more well-trained checkpoint with SDP enabled (as mentioned in the thread above), and would be curious to hear from anyone who has done similar experiments.
I am curious whether adding sdp_ratio and training both SDP and DP simultaneously would improve results. I'm not sure how many code changes this needs, but I would love to open a PR if this sounds good to you!
About training both SDP & DP together and comparing the results to save time (necessary of adversarial duration predictor #11 (comment)): if we train from scratch using this method, my assumption is that it will not sound as good as two-stage training. DurationPredictor works very well in my experience, but is there any improvement that can be made to both DP models?
==========================
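For context on the sdp_ratio idea above, here is a minimal sketch of what blending the two predictors could look like. It assumes the Bert-VITS2 style of linearly interpolating the log-durations from SDP and DP, then converting to frame counts with ceil(exp(logw) * length_scale) as VITS does at inference. The function names (blend_log_durations, to_frame_counts) are hypothetical and not part of this repo, and plain Python lists stand in for tensors so the snippet runs without PyTorch.

```python
import math
from typing import List


def blend_log_durations(
    logw_sdp: List[float], logw_dp: List[float], sdp_ratio: float
) -> List[float]:
    """Interpolate per-phoneme log-durations from the two predictors.

    sdp_ratio = 1.0 uses only the stochastic predictor (SDP),
    sdp_ratio = 0.0 uses only the deterministic DurationPredictor (DP).
    Hypothetical helper for illustration, not this repo's API.
    """
    if not 0.0 <= sdp_ratio <= 1.0:
        raise ValueError("sdp_ratio must be in [0, 1]")
    if len(logw_sdp) != len(logw_dp):
        raise ValueError("both predictors must cover the same phonemes")
    return [
        sdp_ratio * s + (1.0 - sdp_ratio) * d
        for s, d in zip(logw_sdp, logw_dp)
    ]


def to_frame_counts(logw: List[float], length_scale: float = 1.0) -> List[int]:
    """Convert log-durations to integer frame counts, rounding up."""
    return [math.ceil(math.exp(v) * length_scale) for v in logw]


if __name__ == "__main__":
    # Made-up log-durations for a three-phoneme input.
    sdp_out = [0.2, 1.1, 0.7]
    dp_out = [0.4, 0.9, 0.8]
    blended = blend_log_durations(sdp_out, dp_out, sdp_ratio=0.2)
    print(to_frame_counts(blended))
```

With a blend like this, both predictors would still need their own loss terms during training; the ratio only decides how their outputs are mixed when durations are produced.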
A summary of my experience using use_sdp so far (will update later when I have more results):