
Multiscale Generation for Beginners

ProGamerGov edited this page Nov 10, 2019 · 9 revisions

Multiscale Generation

When starting off, you want to maximize quality by using the L-BFGS optimizer and a non-pruned VGG model. Once the image size becomes large enough that you run out of memory, switch to the Adam optimizer. When that also runs out of memory, switch to the channel-pruned or NIN models with Adam.

If you want to repeat the multiscale resolution script a few times to enhance output image quality, go back to step 1 once L-BFGS and your non-pruned VGG model run out of memory. This helps maximize output image quality.
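The fallback order above can be sketched as a sequence of commands. This is a dry run that only prints the planned commands; the `-optimizer` and `-model_file` flags exist in neural-style-pt, but the model file names and the image sizes shown here are illustrative assumptions, since the size at which you run out of memory depends on your GPU.

```shell
# Dry-run sketch of the optimizer/model fallback ladder.
# Stage 1: L-BFGS + non-pruned VGG (best quality).
# Stage 2: Adam + VGG once L-BFGS runs out of memory.
# Stage 3: Adam + NIN once Adam + VGG runs out of memory.
# Model file names and sizes are assumptions for illustration.
fallback_plan() {
  echo "python3 neural_style.py -optimizer lbfgs -model_file models/vgg19-d01eb7cb.pth -image_size 1024 -output_image out_a.png"
  echo "python3 neural_style.py -optimizer adam -model_file models/vgg19-d01eb7cb.pth -init image -init_image out_a.png -image_size 1536 -output_image out_b.png"
  echo "python3 neural_style.py -optimizer adam -model_file models/nin_imagenet.pth -init image -init_image out_b.png -image_size 2048 -output_image out_c.png"
}

fallback_plan
```

Remove the `echo` wrappers (or pipe the output to `sh`) to actually run the stages.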


For a simple example, this is what you are essentially doing with multires:

python3 neural_style.py -output_image out1.png -image_size 512

python3 neural_style.py -output_image out2.png -init image -init_image out1.png -image_size 720

python3 neural_style.py -output_image out3.png -init image -init_image out2.png -image_size 1024

python3 neural_style.py -output_image out4.png -init image -init_image out3.png -image_size 1536

Basically, the closer the image size is to the size of the images used to train the model, the more change will occur in the output image. If you start at the maximum possible image size, the result will look like a weak filter. So you start closer to the size of the images used to train the model (e.g. 224 or 512), then slowly increase the image size so that details can form properly. By the time you reach your maximum image size, things are only changing at a smaller scale (relative to the rest of the image), and you may have to zoom in to see the difference. You can also experiment with how many steps you use and what image size each step has.

  • You can find the other models that show up in others' multires scripts for neural-style, converted to PyTorch for neural-style-pt, here.

So, multires is essentially running a style transfer script repeatedly, incrementally growing the output image to the desired size.
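The four commands above can also be written as a loop over the size ladder. This is a dry-run sketch that prints the commands instead of executing them; the script name, flags, and sizes match the example above.

```shell
# Print the multires command sequence for a given list of sizes.
# The first step has no init image; each later step is initialized
# from the previous step's output.
multires_plan() {
  prev=""
  i=1
  for size in "$@"; do
    out="out${i}.png"
    if [ -z "$prev" ]; then
      echo "python3 neural_style.py -output_image $out -image_size $size"
    else
      echo "python3 neural_style.py -output_image $out -init image -init_image $prev -image_size $size"
    fi
    prev="$out"
    i=$((i + 1))
  done
}

multires_plan 512 720 1024 1536
```

Piping the output to `sh` (or replacing `echo` with direct execution) runs the actual sequence.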


Sometimes, I like to run my multires scripts multiple times like this, because I find it makes the results look better:

python3 neural_style.py -output_image out1.png -image_size 512

python3 neural_style.py -output_image out2.png -init image -init_image out1.png -image_size 720

python3 neural_style.py -output_image out3.png -init image -init_image out2.png -image_size 1024

python3 neural_style.py -output_image out4.png -init image -init_image out3.png -image_size 1536

Then followed by:

python3 neural_style.py -output_image out1.png -init image -init_image out4.png -image_size 512

python3 neural_style.py -output_image out2.png -init image -init_image out1.png -image_size 720

python3 neural_style.py -output_image out3.png -init image -init_image out2.png -image_size 1024

python3 neural_style.py -output_image out4.png -init image -init_image out3.png -image_size 1536

The second, third, fourth, etc. runs use the previous run's output image as the initialization image for step 1. The most times I've run an output image through a multires script is seven. This seems to help make smaller details look better and more closely resemble the smaller details from the style image.
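The repeated-pass pattern above can be sketched as an outer loop over passes. This is a dry run that prints the planned commands; the size ladder matches the commands above, and after the first pass, step 1 is initialized from the previous pass's final output (out4.png).

```shell
# Print N full multires passes. Pass 1 starts from scratch; every
# later pass initializes step 1 from the previous pass's out4.png.
multires_passes() {
  passes="$1"
  p=1
  while [ "$p" -le "$passes" ]; do
    if [ "$p" -eq 1 ]; then
      echo "python3 neural_style.py -output_image out1.png -image_size 512"
    else
      echo "python3 neural_style.py -output_image out1.png -init image -init_image out4.png -image_size 512"
    fi
    echo "python3 neural_style.py -output_image out2.png -init image -init_image out1.png -image_size 720"
    echo "python3 neural_style.py -output_image out3.png -init image -init_image out2.png -image_size 1024"
    echo "python3 neural_style.py -output_image out4.png -init image -init_image out3.png -image_size 1536"
    p=$((p + 1))
  done
}

multires_passes 2
```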



It was also discovered that matching input image histograms improves style transfer results. You can find histogram matching scripts here: https://github.com/ProGamerGov/Neural-Tools

You can do histogram matching after every style transfer step like this:

python3 neural_style.py -output_image out1.png -image_size 512

python linear-color-transfer.py --target_image out1.png --source_image style_image.png --output_image out1_hist.png

python3 neural_style.py -output_image out2.png -init image -init_image out1_hist.png -image_size 720

python linear-color-transfer.py --target_image out2.png --source_image style_image.png --output_image out2_hist.png

python3 neural_style.py -output_image out3.png -init image -init_image out2_hist.png -image_size 1024

python linear-color-transfer.py --target_image out3.png --source_image style_image.png --output_image out3_hist.png

python3 neural_style.py -output_image out4.png -init image -init_image out3_hist.png -image_size 1536
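The interleaved sequence above can also be sketched as a loop. This dry run prints the commands; the script names, flags, and the style_image.png filename come from the example above (note that, unlike the listing above, this sketch also histogram-matches the final output).

```shell
# Print a multires sequence with histogram matching after each
# style transfer step; each step is initialized from the previous
# step's histogram-matched output.
hist_multires_plan() {
  prev=""
  i=1
  for size in 512 720 1024 1536; do
    out="out${i}.png"
    if [ -z "$prev" ]; then
      echo "python3 neural_style.py -output_image $out -image_size $size"
    else
      echo "python3 neural_style.py -output_image $out -init image -init_image $prev -image_size $size"
    fi
    echo "python linear-color-transfer.py --target_image $out --source_image style_image.png --output_image out${i}_hist.png"
    prev="out${i}_hist.png"
    i=$((i + 1))
  done
}

hist_multires_plan
```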

And you can also do histogram matching on your input images like this:

python linear-color-transfer.py --target_image content_image.jpg --source_image style_image.png --output_image content_hist.png

python3 neural_style.py -content_image content_hist.png -style_image style_image.png -output_image out1.png -image_size 512

python3 neural_style.py -content_image content_hist.png -style_image style_image.png -output_image out2.png -init image -init_image out1.png -image_size 720

python3 neural_style.py -content_image content_hist.png -style_image style_image.png -output_image out3.png -init image -init_image out2.png -image_size 1024

python3 neural_style.py -content_image content_hist.png -style_image style_image.png -output_image out4.png -init image -init_image out3.png -image_size 1536

You can also combine the two ways of histogram matching, like this:

python linear-color-transfer.py --target_image content_image.jpg --source_image style_image.png --output_image content_hist.png

python3 neural_style.py -content_image content_hist.png -style_image style_image.png -output_image out1.png -image_size 512

python linear-color-transfer.py --target_image out1.png --source_image style_image.png --output_image out1_hist.png

python3 neural_style.py -content_image content_hist.png -style_image style_image.png -output_image out2.png -init image -init_image out1_hist.png -image_size 720

python linear-color-transfer.py --target_image out2.png --source_image style_image.png --output_image out2_hist.png

python3 neural_style.py -content_image content_hist.png -style_image style_image.png -output_image out3.png -init image -init_image out2_hist.png -image_size 1024