
Some Questions and Comments #152

Open
tom99763 opened this issue Nov 7, 2022 · 1 comment

Comments


tom99763 commented Nov 7, 2022

  1. Have you considered using a vector-quantized AE (VQVAE) instead of the CNN feature map in future work? I think the results could be surprising, given its feature compression and sampleable latent space for the image-to-image translation task.

  2. It seems that the pixel correlation between input and output strongly affects the translation result early in training (for example, in multimodal translation or animal-to-human translation). Instead of predicting everything at once, a two-stage model (contour first, then texture) may improve the result.

Thank you

@tom99763 tom99763 changed the title Some Questions Some Questions and Comments Nov 7, 2022
@taesungp
Owner

Hello, thanks for the suggestions.

  1. I think incorporating VQVAE can be a good direction, particularly for saving compute.
  2. It may, especially if we go to higher resolution. But two-stage approaches are also more cumbersome to train.
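
For readers unfamiliar with the vector-quantization step that point 1 refers to, here is a minimal sketch of the nearest-codebook lookup at the core of a VQVAE. It is illustrative only and not code from this repository: the names `quantize`, `squared_distance`, and `codebook` are hypothetical, the codebook is hand-written rather than learned, and training details (straight-through gradients, commitment loss) are left out.

```python
# Illustrative sketch: nearest-codebook lookup, the quantization step of a VQVAE.
# All names here are hypothetical and not part of this repository.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features, codebook):
    """Replace each feature vector with its nearest codebook entry.

    Returns the chosen indices (the compressed, discrete representation)
    and the corresponding quantized vectors.
    """
    indices = []
    quantized = []
    for vec in features:
        idx = min(range(len(codebook)),
                  key=lambda k: squared_distance(vec, codebook[k]))
        indices.append(idx)
        quantized.append(codebook[idx])
    return indices, quantized


# Toy example: three 2-D feature vectors and a three-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
features = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.7]]
indices, quantized = quantize(features, codebook)
print(indices)  # [0, 1, 2]
```

Each feature vector is stored as a single integer index, which is where the compression comes from; a prior fitted over those indices is what would make the representation sampleable.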


2 participants