Paper's VGG-19 accuracy question #8

Open
simo23 opened this issue Sep 20, 2017 · 21 comments

simo23 commented Sep 20, 2017

Hi, first of all thanks for your great work!

In your paper you cite the VGG-19 [27] model and state that it achieves 77.8% accuracy on the CUB-200-2011 dataset. Can you please give some more info about this? Are you referring to the model trained only on ImageNet, to a model you fine-tuned yourself, or to one fine-tuned by someone else? Is it the Caffe model?

And if you did train it, can you share some of the details, such as batch size, learning rate, number of epochs, and data augmentation?

Thanks,
Andrea

simo23 commented Sep 30, 2017 via email

@chenbinghui1

@simo23 Hi,
To your questions:

  1. I use VGG-19 pretrained on ImageNet.
  2. I fine-tune the VGG-19 model on CUB with 448x448 inputs.
  3. When fine-tuning with 448 inputs, I change the last pooling layer to stride 4 and kernel size 4.
  4. If you don't use FC6 and FC7, i.e. pool5(global_AVE)+softmax(200), you may get 74+% acc. Otherwise, you can get 78% acc (see the sketch after this comment).
  5. When testing on the val set, just use the central crop without mirroring, since that is the default in Caffe.
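
For reference, here is a minimal sketch of the two head variants described above, written against tf.keras as an assumed stand-in for the original Caffe setup (layer names such as fc8_new are illustrative, and in this sketch FC6/FC7 start from random weights unless you copy the pretrained FC weights in yourself):

```python
import tensorflow as tf

NUM_CLASSES = 200  # CUB-200-2011

# VGG-19 conv layers pretrained on ImageNet, applied to 448x448 inputs.
vgg = tf.keras.applications.VGG19(
    include_top=False, weights="imagenet", input_shape=(448, 448, 3))

# Last conv feature map: 28x28x512 for 448x448 inputs (i.e. before the
# original 2x2 pool5).
conv5 = vgg.get_layer("block5_conv4").output

# Variant 1: pool5 as global average pooling + 200-way softmax (~74% acc).
gap = tf.keras.layers.GlobalAveragePooling2D()(conv5)
variant1 = tf.keras.Model(
    vgg.input, tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(gap))

# Variant 2: pool5 with kernel 4 / stride 4 so the 28x28 map becomes 7x7
# (the spatial size FC6 expects), then FC6+FC7+new FC8 (~78% acc).
x = tf.keras.layers.MaxPool2D(pool_size=4, strides=4)(conv5)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(4096, activation="relu", name="fc6")(x)
x = tf.keras.layers.Dense(4096, activation="relu", name="fc7")(x)
variant2 = tf.keras.Model(
    vgg.input,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax", name="fc8_new")(x))
```

The kernel 4 / stride 4 pooling is what makes variant 2 work: it maps the 28x28 conv5 feature map back to 7x7, which is the spatial size FC6 was originally trained on at 224x224 resolution.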


simo23 commented Oct 9, 2017

OK, thank you very much for the answer. I get 78% now.

@youhebuke

@simo23 @chenbinghui1
Hi, I got about 74.5% acc using pool5(global_ave, kernel size 28, stride 28)+FC(512x200)+softmax, just as chenbinghui1 said.
But I can't get 78% acc using pool5(kernel size 4, stride 4)+FC6+FC7+FC8new(4096x200)+softmax; I only got about 65% acc. I wonder where the problem is. Could you help me?
I really need your help, thank you.


simo23 commented Oct 16, 2017

Hi, @youhebuke! The relevant details of my training are:

  • Last pooling layer modified to stride=4, kernel size=4, but still MAX pooling, not AVG
  • New layer initialized with biases=0 and weights drawn from a random Gaussian with std dev=0.01
  • Random 448 crop with random flip at training time
  • Central 448 crop at test time
  • Train the new FC layer with learning rate 1e-3 and all the other layers with learning rate 1e-4
  • Batch size = 32
  • L2 regularization on all weights (not biases) with decay=5e-4, as in standard VGG
  • Preprocess both train and test images by subtracting the VGG RGB mean values. Be careful to subtract the right value from the right channel: check the function that loads the images from file and make sure whether the loaded image is in RGB or BGR order (see the preprocessing sketch after this comment).

Let me know if this helps!
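
To make the preprocessing bullets concrete, here is a hedged sketch using TensorFlow ops. The thread does not state how images are resized before cropping (see the question from @whyou5945 below), so resize_shorter_side is a hypothetical helper; the mean values are the standard VGG/Caffe ones.

```python
import tensorflow as tf

# ImageNet/VGG channel means in RGB order. Caffe stores them in BGR order
# (103.939, 116.779, 123.68), so check the channel order of your image loader.
VGG_RGB_MEANS = tf.constant([123.68, 116.779, 103.939])

def resize_shorter_side(image, target=448):
    # Hypothetical helper: the exact resize policy before cropping is not
    # stated in the thread; resizing the shorter side to 448 is one guess.
    shape = tf.cast(tf.shape(image)[:2], tf.float32)
    scale = target / tf.reduce_min(shape)
    new_size = tf.cast(tf.math.ceil(shape * scale), tf.int32)
    return tf.image.resize(image, new_size)

def preprocess_train(image):
    image = resize_shorter_side(tf.cast(image, tf.float32))
    image = tf.image.random_crop(image, size=(448, 448, 3))  # random 448 crop
    image = tf.image.random_flip_left_right(image)           # random flip
    return image - VGG_RGB_MEANS                             # mean subtraction

def preprocess_test(image):
    image = resize_shorter_side(tf.cast(image, tf.float32))
    image = tf.image.resize_with_crop_or_pad(image, 448, 448)  # central 448 crop
    return image - VGG_RGB_MEANS
```

The per-layer learning rates and the 5e-4 weight decay are not shown here; in Caffe they would be set through lr_mult and decay_mult in the prototxt.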

@super-wcg

@simo23 Hi, did you train the RA-CNN? How did you define the loss?


simo23 commented Oct 17, 2017

Hi @super-wcg, sorry, I did not train the RA-CNN.

@chenfeima

@simo23 Hello, I only get 75+% accuracy. Can you share your train.prototxt?

@chenfeima

@youhebuke Did you solve the problem? I am also using pool5(kernel size 4, stride 4)+FC6+FC7+FC8new(4096x200)+softmax, but I only get 75+% accuracy.


simo23 commented Nov 25, 2017

Hi @chenfeima, I do not use Caffe so I cannot share the prototxt, but the details are already written in an earlier answer. Maybe you just need to wait a little longer?

@chenfeima

@simo23 Thank you! Have you implemented the RA-CNN? What about your rank loss and the training strategy?


simo23 commented Dec 1, 2017

Hi @chenfeima, sorry, I did not try to reproduce the RA-CNN. By the way, there is now a more interesting work by the same team: Multiattention.

@chenfeima

@simo23 That one is more difficult. I want to implement RA-CNN first. Do you have the rank loss?


simo23 commented Dec 1, 2017

@chenfeima No, I did not implement it.


RTMDFG commented Mar 2, 2018

@chenfeima Did you implement the RA-CNN?

@whyou5945

@simo23 @chenbinghui1 @youhebuke Thanks for your discussion revealing the details of training VGG-19 on the CUB bird dataset.
You mentioned a "random 448 crop" in the training process; do you mean resizing the shorter side to 448 and then cropping 448 randomly?

@caoquanjie

@simo23 Hi, could you help me? I only got about 65% acc using pool5(kernel size 4, stride 4)+FC6+FC7+FC8(4096x200)+softmax. I followed your training process as described above
and implemented it with TensorFlow. I don't know where the problem is, and I really need your help. Thank you.


simo23 commented Nov 20, 2018

Hi @caoquanjie,

there could be a million issues related to your training, so I am not sure what is going on. One thing that may have been missed, and that surely has a huge impact, is the initialization. Do you start the training from scratch or from a model pre-trained on ImageNet?

@caoquanjie

@simo23 Thank you for your reply; I just solved this problem yesterday. I start the training process from a model pre-trained on ImageNet. First, I fine-tune only fc8 with a learning rate of 1e-3 for 5000 steps, then train all variables (including the convolutional layers) with a learning rate of 1e-3 for 10000 steps. Finally, I use a learning rate of 1e-4 to train for 10000 more steps in the same way as before. Maybe the choice of optimizer was the problem: I switched to SGD and then got 77.4% accuracy. Anyway, thank you for your reply.
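
A rough sketch of this staged schedule, assuming a tf.keras model with a new 200-way layer named fc8_new (as in the earlier sketch) and a batched, repeated tf.data dataset; the step counts, learning rates, and choice of SGD follow the comment, while the momentum value and loss are assumptions:

```python
import tensorflow as tf

def compile_sgd(model, lr):
    # SGD is what reached 77.4% in the comment above; momentum 0.9 is assumed.
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=lr, momentum=0.9),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"])

def staged_finetune(model, train_ds):
    # Stage 1: train only the new fc8 layer, lr 1e-3, ~5000 steps.
    for layer in model.layers:
        layer.trainable = (layer.name == "fc8_new")
    compile_sgd(model, 1e-3)
    model.fit(train_ds, steps_per_epoch=5000, epochs=1)

    # Stage 2: unfreeze everything (conv layers included), lr 1e-3, ~10000 steps.
    for layer in model.layers:
        layer.trainable = True
    compile_sgd(model, 1e-3)
    model.fit(train_ds, steps_per_epoch=5000, epochs=2)

    # Stage 3: all layers, lr 1e-4, another ~10000 steps.
    compile_sgd(model, 1e-4)
    model.fit(train_ds, steps_per_epoch=5000, epochs=2)
```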

@MubarkLa

> Hi, @youhebuke! The relevant details of my training are: […]

Hi, @simo23

May I ask whether you used any dropout layers in the VGG-19 when fine-tuning on the bird dataset? Thank you.

@hamedbehzadi

> @simo23 Thank you for your reply; I just solved this problem yesterday. […]

Hi @caoquanjie
Can I ask you for some details? Did you reach this accuracy only by using different learning rates in a multi-stage training procedure? Did you also modify the architecture, for example changing the pooling layer as discussed by the others?

Thank you in advance for your attention.
