Config file and performance reproduce #8

Open
232525 opened this issue Aug 31, 2022 · 11 comments

232525 commented Aug 31, 2022

I re-trained the MLE phase (on 8 V100s) using your released config file configs/phase1/clipRN50_mle.yml, but the performance is lower than reported in the paper (CIDEr 106.5 vs. 110.3). Does this config file correspond to the experiment reported in the paper?
[screenshot: evaluation scores of the re-trained MLE model]

The warmup step count is set to 20000 in the config file; is that too large? The learning rate keeps rising for the entire training run (it is still warming up and never starts decreasing).
[screenshot: learning rate curve, still increasing at the end of training]
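
As a rough sanity check (the numbers below are my own assumptions: a ~113k-image Karpathy training split and the 8-GPU, batch-25-per-GPU setting), the whole run seems to contain fewer optimizer steps than the 20000 warmup steps, which would explain why the learning rate rises for the entire training:

```python
# Back-of-the-envelope step count (all numbers below are assumptions, not
# values read from the released config).
train_images = 113287            # assumed Karpathy-split training set size
batch_per_gpu = 25               # assumed per-GPU batch size
num_gpus = 8
max_epochs = 25
warmup_steps = 20000             # value mentioned above

steps_per_epoch = train_images // (batch_per_gpu * num_gpus)   # ~566
total_steps = steps_per_epoch * max_epochs                      # ~14,150
print(total_steps < warmup_steps)  # True -> the run ends before warmup finishes
```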

j-min (Owner) commented Aug 31, 2022

Hi, the config file is adapted from the original config file of the CLIP-RN50 transformer model (https://github.com/clip-vil/CLIP-ViL/blob/master/CLIP-ViL-Direct/caption/configs/phrase1/transformer.yml). I only edited it to use a larger batch size and fp16 for faster training.
Since I didn't pay attention to the warmup parameters, I didn't notice that the learning rate never fully warmed up.
I'm not entirely sure about the lower score at the moment. For your purpose, maybe you can rerun the training with fewer warmup steps.
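
In case it helps, here is a minimal sketch of the Noam schedule (assuming that is what the warmup parameter controls here, as in the upstream self-critical-style configs, and with an illustrative d_model), showing why a warmup longer than the whole run makes the learning rate rise monotonically:

```python
def noam_lr(step, d_model=512, warmup=20000, factor=1.0):
    # Noam schedule ("Attention Is All You Need"):
    # lr = factor * d_model**-0.5 * min(step**-0.5, step * warmup**-1.5)
    # The warmup branch (rising LR) dominates until `step` reaches `warmup`,
    # after which the decaying branch takes over.
    step = max(step, 1)
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

# If the whole run is only ~14k steps, min() never switches to the decaying
# branch, so the learning rate rises until the very last step.
print(noam_lr(14000) < noam_lr(20000))  # True: still increasing at step 14000
```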

232525 (Author) commented Sep 1, 2022

Thanks for your reply. Could you please share the wandb output.log file from your training run?

j-min (Owner) commented Sep 1, 2022

Back then I didn't use wandb, so I don't have log files for that run, sorry.

232525 (Author) commented Sep 1, 2022

Sorry for another question. The training settings reported in your paper say:

> We train our model with MLE objective for 15 epochs and further train with different rewards for 25 epochs (total 40 epochs), which takes within 1 day with 8 V100 GPUs.

But I notice that max_epoch is set to 25 in your first-phase config file.

j-min (Owner) commented Sep 1, 2022

I just remembered that I actually ran the original CLIP-ViL training script to train the MLE model.
Could you please try running with the same batch size of 10 for 25 epochs, following https://github.com/clip-vil/CLIP-ViL/blob/master/CLIP-ViL-Direct/caption/configs/phrase1/transformer.yml?
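
For context, under that original single-GPU, batch-size-10 setting the 20000 warmup steps should cover only a small fraction of training (again assuming a ~113k-image training split):

```python
# Rough step count for the original single-GPU setting (training-set size is
# an assumed Karpathy-split number, not read from the repo).
steps_per_epoch = 113287 // 10       # ~11,328 steps per epoch at batch size 10
total_steps = steps_per_epoch * 25   # ~283,200 steps over 25 epochs
print(20000 / total_steps)           # ~0.07 -> warmup is only ~7% of the run
```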

232525 (Author) commented Sep 1, 2022

> I just remembered that I actually ran the original CLIP-ViL training script to train the MLE model. Could you please try running with the same batch size of 10 for 25 epochs, following https://github.com/clip-vil/CLIP-ViL/blob/master/CLIP-ViL-Direct/caption/configs/phrase1/transformer.yml?

With a single GPU?

j-min (Owner) commented Sep 1, 2022

Yes

232525 (Author) commented Sep 1, 2022

> Yes

OK, I will try it soon. Thank you again.

j-min (Owner) commented Sep 1, 2022

For multi-GPU training, I guess you could get similar performance with fewer warmup steps, such as 1000 steps.
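
One way to arrive at a number in that ballpark (a common linear-scaling rule of thumb, not something taken from the repo) is to shrink the warmup in proportion to the larger effective batch size:

```python
# Linear-scaling rule of thumb for warmup (an assumption, not from the repo):
# keep the number of samples seen during warmup roughly constant.
orig_warmup, orig_batch = 20000, 10   # original CLIP-ViL config
new_batch = 25 * 8                    # 8 GPUs x 25 images per GPU
scaled_warmup = orig_warmup * orig_batch // new_batch
print(scaled_warmup)                  # 1000
```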

232525 (Author) commented Sep 1, 2022

> For multi-GPU training, I guess you could get similar performance with fewer warmup steps, such as 1000 steps.

Yes, I have tried 1250 warmup steps, and the first phase reaches CIDEr 109.2. But the second phase (CIDEr RL with a fixed lr of 2.5e-6) is still worse (CIDEr 121.6 vs. 124.9 in the paper).

j-min (Owner) commented Sep 1, 2022

Here I attach the output.log for the CIDEr run. I used the same configuration (8 V100s, batch size 25 per GPU) as the current config file.

cider_output.log
