Config file and performance reproduce #8
Hi, the config file is adapted from the original config file of the CLIP-RN50 transformer model (https://github.com/clip-vil/CLIP-ViL/blob/master/CLIP-ViL-Direct/caption/configs/phrase1/transformer.yml). I only edited it with a larger batch size and fp16 for faster training.
Thanks for your reply. Would you please provide the training log for that run?
Back then I didn't use wandb, so I don't have log files for that run, sorry.
Sorry for another question about the training settings reported in your paper:
But I notice that the
I just remembered that I actually ran the original CLIP-ViL training script for the MLE model.
With a single GPU?
Yes
OK, I will try soon. Thank you again.
For multi-GPU training, I guess you could get similar performance with fewer warmup steps, such as 1000.
Yes, I have tried 1250 warmup steps, and the first phase reaches CIDEr 109.2. But the second phase (CIDEr RL with fixed lr 2.5e-6) is worse (CIDEr 121.6 vs. 124.9 in the paper).
Here I attach the output.log for the CIDEr run. I used the same configuration (8 V100s, batch size 25 per GPU) as the current config file.
I re-trained the MLE phase (8 V100s) using your released config file configs/phase1/clipRN50_mle.yml, but the performance is lower than reported in the paper (CIDEr 106.5 vs. 110.3). Does the config file correspond to the experiment reported in the paper? The warmup step is set to 20000 in the config file; is that too large? The learning rate kept rising for the entire training phase (warmup only, without ever decreasing).
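For context, transformer configs in this line of work commonly use a Noam-style warmup schedule (linear increase for `warmup` steps, then inverse-square-root decay); assuming that schedule here (the function name and step counts below are illustrative, not from this thread), a minimal sketch shows why a 20000-step warmup can leave the learning rate rising for the whole MLE phase when training finishes in fewer than 20000 steps:

```python
def noam_lr(step, d_model=512, factor=1.0, warmup=20000):
    """Noam schedule: LR rises linearly for `warmup` steps,
    then decays proportionally to step ** -0.5."""
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

# Hypothetical numbers: with an effective batch of 200 (8 GPUs x 25)
# on ~113k COCO images, one epoch is roughly 566 steps, so even many
# epochs may total well under 20000 steps.
total_steps = 8500  # illustrative total for the MLE phase
assert noam_lr(total_steps) > noam_lr(1000)   # still climbing at the end
assert noam_lr(30000) < noam_lr(20000)        # decay only starts past warmup
```

If the total step count never reaches the warmup horizon, the decay branch of `min(...)` is never selected, which matches the observation that the LR only increased during training.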