Does it make sense to try image loss in stage 2? #122
Comments
Also, I find that sometimes using gumbel-softmax with hard=False gives better training results, but is hard=False a bad setting to use? If the codebook is limited, would mixing tokens yield better performance? I lack the background to answer this myself and couldn't find useful research on it. Thanks for your excellent work again!!
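For reference, here is a minimal standalone PyTorch illustration of the difference between the two settings (not tied to this repo's code):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 1024)  # e.g. transformer logits over a 1024-entry codebook

# hard=False: output is a dense probability mixture over all codes,
# so the downstream decoder sees a blend of codebook entries.
soft = F.gumbel_softmax(logits, tau=1.0, hard=False)  # rows sum to 1, many nonzeros

# hard=True: output is a one-hot sample (exactly one code selected),
# but gradients flow through the soft relaxation (straight-through estimator).
hard = F.gumbel_softmax(logits, tau=1.0, hard=True)   # exactly one 1 per row
```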
@AlexzQQQ I think that although mixing token probabilities is advantageous in theory, since each code represents a distinct feature, it is worth exploring whether the pretrained VQ-VAE can fully utilize the rich latent representation produced by the mixed token probabilities, or whether it will simply collapse eventually in the 2nd stage of training.
@jack111331 Thanks for your reply. I will try what you asked about in several weeks, as work keeps me busy. In my opinion, the mixed tokens will be useful if the loss is not only a cross-entropy loss.
In stage 2 (the transformer training period), I tried using gumbel-softmax to turn the latent predictions into images so I could apply several image-space losses (perceptual loss, L1, adversarial), aiming at tasks that cross-entropy loss might not suit well. None of them seemed to work. I wonder if my thinking was wrong. Thanks for your excellent work!!
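To make the setup concrete, here is a minimal PyTorch sketch of the pipeline being described: relax the transformer's code predictions with gumbel-softmax, mix the codebook entries by those probabilities, and decode to an image so image-space losses become differentiable. Shapes, the `codebook` tensor, and the `decoder` call are all hypothetical placeholders, not this repo's actual API:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: stage-2 transformer logits over a codebook of size K,
# for a 16x16 grid of latent positions (HW = 256).
B, HW, K, D = 2, 256, 1024, 256          # batch, positions, codebook size, code dim
logits = torch.randn(B, HW, K, requires_grad=True)
codebook = torch.randn(K, D)             # placeholder for the pretrained VQ-VAE codebook

# Straight-through gumbel-softmax: hard one-hot in the forward pass,
# soft gradients in the backward pass (hard=False would keep the dense mixture).
soft_onehot = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)  # (B, HW, K)

# Mix codebook entries by the (relaxed) token probabilities.
z = soft_onehot @ codebook               # (B, HW, D)
z = z.transpose(1, 2).reshape(B, D, 16, 16)  # back to a spatial latent grid

# `decoder` stands in for the frozen pretrained VQ-VAE decoder; the decoded
# image can then receive L1 / perceptual / adversarial losses in stage 2.
# recon = decoder(z)
# image_loss = F.l1_loss(recon, target)
```

Whether gradients flowing back through the frozen decoder and the relaxed code selection are informative enough to help the transformer is exactly the open question in this thread.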