Paper's VGG-19 accuracy question #8

Open
simo23 opened this issue Sep 20, 2017 · 21 comments

simo23 commented Sep 20, 2017

Hi, first of all thanks for your great work!

In your paper you cite the VGG-19 [27] model and state that it achieves 77.8% accuracy on the CUB-200-2011 dataset. Can you please give some more info about this? Are you referring to the model trained only on ImageNet, to a model you fine-tuned yourself, or to one fine-tuned by someone else? Is it the Caffe model?

And if you did train it, can you share some of the details, such as batch size, learning rate, number of epochs, and data augmentation?

Thanks,
Andrea

simo23 commented Sep 30, 2017 via email

@chenbinghui1

@simo23 Hi,
To your questions:

  1. I use VGG-19 pretrained on ImageNet.
  2. I fine-tune the VGG-19 model on CUB with 448x448 inputs.
  3. When fine-tuning with 448 inputs, I change the last pooling layer to stride 4 and kernel size 4.
  4. If you don't use FC6 and FC7, i.e. pool5(global_AVE)+softmax(200), you may get 74+% acc. Otherwise, you can get 78% acc (see the sketch after this comment).
  5. When testing on the val set, just use the central crop without mirroring, since that is the default in Caffe.
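
For reference, here is a minimal sketch of the two head variants described above, written against tf.keras as an assumed stand-in for the original Caffe setup (layer names such as fc8_new are illustrative, and in this sketch FC6/FC7 start from random weights unless you copy the pretrained FC weights in yourself):

```python
import tensorflow as tf

NUM_CLASSES = 200  # CUB-200-2011

# VGG-19 conv layers pretrained on ImageNet, applied to 448x448 inputs.
vgg = tf.keras.applications.VGG19(
    include_top=False, weights="imagenet", input_shape=(448, 448, 3))

# Last conv feature map: 28x28x512 for 448x448 inputs (i.e. before the
# original 2x2 pool5).
conv5 = vgg.get_layer("block5_conv4").output

# Variant 1: pool5 as global average pooling + 200-way softmax (~74% acc).
gap = tf.keras.layers.GlobalAveragePooling2D()(conv5)
variant1 = tf.keras.Model(
    vgg.input, tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(gap))

# Variant 2: pool5 with kernel 4 / stride 4 so the 28x28 map becomes 7x7
# (the spatial size FC6 expects), then FC6+FC7+new FC8 (~78% acc).
x = tf.keras.layers.MaxPool2D(pool_size=4, strides=4)(conv5)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(4096, activation="relu", name="fc6")(x)
x = tf.keras.layers.Dense(4096, activation="relu", name="fc7")(x)
variant2 = tf.keras.Model(
    vgg.input,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax", name="fc8_new")(x))
```

The kernel 4 / stride 4 pooling is what makes variant 2 work: it maps the 28x28 conv5 feature map back to 7x7, which is the spatial size FC6 was originally trained on at 224x224 resolution.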


simo23 commented Oct 9, 2017

OK, thank you very much for the answer. I get 78% now.

@youhebuke

@simo23 @chenbinghui1
Hi, I got about 74.5% acc using pool5(global_ave, kernel size 28, stride 28)+FC(512x200)+softmax, just as chenbinghui1 said.
But I can't get 78% acc using pool5(kernel size 4, stride 4)+FC6+FC7+FC8new(4096x200)+softmax; I only got about 65% acc. I wonder where the problem is. Could you help me?
I really need your help, thank you.


simo23 commented Oct 16, 2017

Hi, @youhebuke! The relevant details of my training are:

  • Last pooling layer modified to stride=4, kernel size=4, but still MAX pooling, not AVG
  • New layer initialized with biases=0 and weights drawn from a random Gaussian with std dev=0.01
  • Random 448 crop with random flip at training time
  • Central 448 crop at test time
  • Train the new FC layer with learning rate 1e-3 and all the other layers with learning rate 1e-4
  • Batch size = 32
  • L2 regularization on all weights (not biases) with decay=5e-4, as in standard VGG
  • Preprocess both train and test images by subtracting the VGG RGB mean values. Be careful to subtract the right value from the right channel: check the function that loads the images from file and make sure whether the loaded image is in RGB or BGR order (see the preprocessing sketch after this comment).

Let me know if this helps!
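
To make the preprocessing bullets concrete, here is a hedged sketch using TensorFlow ops. The thread does not state how images are resized before cropping (see the question from @whyou5945 below), so resize_shorter_side is a hypothetical helper; the mean values are the standard VGG/Caffe ones.

```python
import tensorflow as tf

# ImageNet/VGG channel means in RGB order. Caffe stores them in BGR order
# (103.939, 116.779, 123.68), so check the channel order of your image loader.
VGG_RGB_MEANS = tf.constant([123.68, 116.779, 103.939])

def resize_shorter_side(image, target=448):
    # Hypothetical helper: the exact resize policy before cropping is not
    # stated in the thread; resizing the shorter side to 448 is one guess.
    shape = tf.cast(tf.shape(image)[:2], tf.float32)
    scale = target / tf.reduce_min(shape)
    new_size = tf.cast(tf.math.ceil(shape * scale), tf.int32)
    return tf.image.resize(image, new_size)

def preprocess_train(image):
    image = resize_shorter_side(tf.cast(image, tf.float32))
    image = tf.image.random_crop(image, size=(448, 448, 3))  # random 448 crop
    image = tf.image.random_flip_left_right(image)           # random flip
    return image - VGG_RGB_MEANS                             # mean subtraction

def preprocess_test(image):
    image = resize_shorter_side(tf.cast(image, tf.float32))
    image = tf.image.resize_with_crop_or_pad(image, 448, 448)  # central 448 crop
    return image - VGG_RGB_MEANS
```

The per-layer learning rates and the 5e-4 weight decay are not shown here; in Caffe they would be set through lr_mult and decay_mult in the prototxt.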

@super-wcg

@simo23 Hi, did you train the RA-CNN? How did you define the loss?


simo23 commented Oct 17, 2017

Hi @super-wcg, sorry, I did not train the RA-CNN.

@chenfeima

@simo23 Hello, I only get 75+% accuracy. Can you share your train.prototxt?

@chenfeima

@youhebuke Did you solve the problem? I am also using pool5(kernel size 4, stride 4)+FC6+FC7+FC8new(4096x200)+softmax, but I only get 75+% accuracy.


simo23 commented Nov 25, 2017

Hi @chenfeima, I do not use Caffe so I cannot share the prototxt, but the details are already written in an earlier answer. Maybe you just need to wait a little longer?

@chenfeima

@simo23 Thank you! Have you implemented the RA-CNN? What about your rank loss and the training strategy?


simo23 commented Dec 1, 2017

Hi @chenfeima, sorry, I did not try to reproduce the RA-CNN. By the way, there is now a more interesting work by the same team: Multiattention.

@chenfeima

@simo23 That one is more difficult. I want to implement RA-CNN first. Do you have the rank loss?


simo23 commented Dec 1, 2017

@chenfeima No, I did not implement it.


RTMDFG commented Mar 2, 2018

@chenfeima Did you implement the RA-CNN?

@whyou5945

@simo23 @chenbinghui1 @youhebuke Thanks for your discussion revealing the details of training VGG-19 on the CUB bird dataset.
You mentioned a "random 448 crop" in the training process; do you mean resizing the shorter side to 448 and then cropping 448 randomly?

@caoquanjie

@simo23 Hi, could you help me? I only got about 65% acc using pool5(kernel size 4, stride 4)+FC6+FC7+FC8(4096x200)+softmax. I followed your training process as described above
and implemented it with TensorFlow. I don't know where the problem is, and I really need your help. Thank you.


simo23 commented Nov 20, 2018

Hi @caoquanjie,

there could be a million issues related to your training, so I am not sure what is going on. One thing that may have been missed, and that surely has a huge impact, is the initialization. Do you start the training from scratch or from a model pre-trained on ImageNet?

@caoquanjie

@simo23 Thank you for your reply; I just solved this problem yesterday. I start the training process from a model pre-trained on ImageNet. First, I fine-tune only fc8 with a learning rate of 1e-3 for 5000 steps, then train all variables (including the convolutional layers) with a learning rate of 1e-3 for 10000 steps. Finally, I use a learning rate of 1e-4 to train for 10000 more steps in the same way as before. Maybe the choice of optimizer was the problem: I switched to SGD and then got 77.4% accuracy. Anyway, thank you for your reply.
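
A rough sketch of this staged schedule, assuming a tf.keras model with a new 200-way layer named fc8_new (as in the earlier sketch) and a batched, repeated tf.data dataset; the step counts, learning rates, and choice of SGD follow the comment, while the momentum value and loss are assumptions:

```python
import tensorflow as tf

def compile_sgd(model, lr):
    # SGD is what reached 77.4% in the comment above; momentum 0.9 is assumed.
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=lr, momentum=0.9),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"])

def staged_finetune(model, train_ds):
    # Stage 1: train only the new fc8 layer, lr 1e-3, ~5000 steps.
    for layer in model.layers:
        layer.trainable = (layer.name == "fc8_new")
    compile_sgd(model, 1e-3)
    model.fit(train_ds, steps_per_epoch=5000, epochs=1)

    # Stage 2: unfreeze everything (conv layers included), lr 1e-3, ~10000 steps.
    for layer in model.layers:
        layer.trainable = True
    compile_sgd(model, 1e-3)
    model.fit(train_ds, steps_per_epoch=5000, epochs=2)

    # Stage 3: all layers, lr 1e-4, another ~10000 steps.
    compile_sgd(model, 1e-4)
    model.fit(train_ds, steps_per_epoch=5000, epochs=2)
```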

@MubarkLa

> Hi, @youhebuke! The relevant details of my training are: […]

Hi, @simo23

May I ask whether you used any dropout layers in the VGG-19 when fine-tuning on the bird dataset? Thank you.

@hamedbehzadi

> @simo23 Thank you for your reply; I just solved this problem yesterday. […]

Hi @caoquanjie
Can I ask you for some details? Did you reach this accuracy only by using different learning rates in a multi-stage training procedure? Did you also modify the architecture, for example changing the pooling layer as discussed by the others?

Thank you in advance for your attention.
