
Add {valid, full} option for Pooling layer #3392

Closed
wants to merge 2 commits

Conversation


@Darwin2011 commented Sep 28, 2016

MXNet's pooling layer's output shape doesn't match Caffe's, although MXNet's convolution layer's output shape is the same as Caffe's. This issue causes GoogLeNet from the Caffe converter to fail.

The following links show the pooling output size formulas.
For Caffe:
https://github.com/BVLC/caffe/blob/master/src/caffe/layers/pooling_layer.cpp#L90
For MXNet:
https://github.com/dmlc/mxnet/blob/master/src/operator/pooling-inl.h#L202
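In short, Caffe rounds the division up (ceil) while MXNet rounds it down (floor). Here is a minimal sketch of the two conventions along one spatial axis; the helper name and the example numbers are illustrative, not taken from either code base:

```python
import math

def pool_output_size(size, kernel, stride, pad, convention="valid"):
    # "valid" (MXNet): floor the division; "full" (Caffe): ceil it.
    span = size + 2 * pad - kernel
    if convention == "valid":
        return span // stride + 1
    return int(math.ceil(span / float(stride))) + 1

# GoogLeNet's first pooling stage: 112-wide input, 3x3 kernel, stride 2
print(pool_output_size(112, 3, 2, 0, "valid"))  # 55 (MXNet)
print(pool_output_size(112, 3, 2, 0, "full"))   # 56 (Caffe)
```

This off-by-one output size is what breaks the converted GoogLeNet symbol.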

And please see #2999. @winstywang @ChenFengAndy

So I added an option for the pooling layer with two values, {valid, same}: valid matches MXNet's current pooling output, and same is what Caffe uses.

  1. The default pooling option is valid, the same as current MXNet.
  2. If the model is from Caffe, the same option for the pooling layer will be turned on.

Please give some comments. Thanks a lot.

1) Default pooling option is valid.
2) If the model is from Caffe, the same option for the pooling layer will be on.
@sxjscience
Member

Why call it same?

@Darwin2011
Author

@sxjscience
Maybe "full" is one better name? I will change it to "full". That makes more sense.

@sxjscience
Member

@Darwin2011 Could we manually adjust the padding to achieve a similar result?

@Darwin2011
Author

@sxjscience Could you please explain a little bit more? Thanks.

@sxjscience
Member

@Darwin2011 I'm not sure whether this will break the C++ implementations.

@sxjscience
Member

@antinucleon @piiswrong @winstywang What do you think about this? Some old caffemodels like VGG_CNN_M require this type of compatible pooling. Another solution is to fine-tune these networks with the revised structure (pad 1 in the pooling layer) and add them to our own model zoo.

@Darwin2011
Author

@sxjscience
The Caffe model converter also cannot translate BVLC GoogLeNet on my side.

@sxjscience
Member

@Darwin2011 Yes, this is a problem. One way to deal with it is to add pad=(1,1) to the lower-level pooling layers, as is done here (https://github.com/dmlc/mxnet/blob/master/example/image-classification/symbol_inception-bn-28-small.py#L19-L20). However, we may need to fine-tune the network again since the pooling function has changed.
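A minimal sketch of that workaround with the Symbol API (the 112-wide input in the comments is illustrative):

```python
import mxnet as mx

data = mx.sym.Variable('data')
# With a 112-wide input: floor((112 + 2*1 - 3)/2) + 1 = 56, which matches
# Caffe's ceil((112 - 3)/2) + 1 = 56 for the same layer without padding.
pool = mx.sym.Pooling(data=data, kernel=(3, 3), stride=(2, 2),
                      pad=(1, 1), pool_type='max')
```

The output shape then matches, but the pooling windows are aligned differently, which is why fine-tuning may still be needed.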
BTW, have you tested whether the pooling layer works well in the "same" mode?

@Darwin2011
Author

@sxjscience
I haven't tested that. What I have tested is that with "same" mode on, the converted Caffe model works well and MXNet gets the same accuracy as Caffe. Let me test it with CIFAR-10 training.

Thanks.

@piiswrong
Contributor

piiswrong commented Sep 30, 2016

I think this should be added, although the name "same" sounds cryptic. How about "compatible"?

BTW, put kValid first in the enum.

@@ -25,13 +25,15 @@ namespace pool_enum {
enum PoolingOpInputs {kData};
enum PoolingOpOutputs {kOut};
enum PoolingOpType {kMaxPooling, kAvgPooling, kSumPooling};
enum PoolingOpPadConventionType {kSame, kValid};
Contributor

put kValid first

@@ -47,6 +49,11 @@ struct PoolingParam : public dmlc::Parameter<PoolingParam> {
.add_enum("avg", pool_enum::kAvgPooling)
.add_enum("sum", pool_enum::kSumPooling)
.describe("Pooling type to be applied.");

DMLC_DECLARE_FIELD(pad_convention).set_default(pool_enum::kValid)
.add_enum("same", pool_enum::kSame)
Contributor

Change "same" to "full" or "compatible", add documentation describing the difference (round up or down; discard the extra region or pad with zeros), and say to use "full" for compatibility with Caffe.

@sbodenstein
Contributor

sbodenstein commented Sep 30, 2016

@piiswrong @Darwin2011 @sxjscience: These names are really bad. In the literature, 'same', 'valid' and 'full' have precise meanings for pooling and convolutions:

  • same: the input is padded so that the input and output tensors have the same width and height.
  • valid: no padding.
  • full: the input is padded so that every position where the kernel overlaps the input is used, making the output larger than the input.

TensorFlow (https://www.tensorflow.org/versions/r0.10/api_docs/python/nn.html#convolution) and NumPy (http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html) use these conventions, for example. Reusing these names here will cause confusion (especially if MXNet decides to add these very useful options for convolution and pooling).

One option: have a "caffe_compatibility" (bool) option.

Here is a question that we should answer first: what is the actual computational difference in the pooling layer? Is the MXNet pooling layer equivalent to Caffe with some extra padding for certain input dimensions? If so, we can fix this at the Caffe importer level rather than hacking it into the MXNet pooling definition.

@piiswrong
Contributor

The difference is basically rounding down vs. rounding up in the division, so "valid" makes sense. We just need to name the other one better.


add some description
@Darwin2011
Author

@sbodenstein Thanks for your advice.
Please see soumith/convnet-benchmarks#82.
The GoogLeNet symbol fails because of the different pooling output size conventions. That's why I added one more option to the pooling layer.

@Darwin2011
Author

Thanks a lot for your review, @piiswrong. Renaming is done and I added some descriptions.

@Darwin2011 Darwin2011 changed the title Add {valid, same} option for Pooling layer Add {valid, full} option for Pooling layer Oct 1, 2016
@sxjscience
Member

@Darwin2011 I think we can merge this in if the test passes.

@piiswrong
Contributor

please update
