
Why are the BatchNorm mean and var not updated during training? #31

Open · RichardChangCA opened this issue Jan 31, 2022 · 1 comment
@RichardChangCA

Hello, thanks for your source code.

Could I ask why the BatchNorm mean and variance are not updated during training? I understood affine=False to mean that the batch normalization parameters are not updated.

Thanks

@Callidior

affine=False means that the normalization step of batch norm will not be followed by a learnable linear scaling and offset of the form a * x + b. The affine argument only relates to these learnable affine parameters, not to the mean and standard deviation used for normalization, which are still estimated from the data and updated during training.
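A minimal PyTorch sketch (not code from this repository) illustrating the distinction: with affine=False the layer has no learnable scale/offset, but its running statistics still move toward the batch statistics during training.

```python
import torch
import torch.nn as nn

# BatchNorm without learnable affine parameters (gamma, beta); running
# statistics are still tracked, since track_running_stats defaults to True.
bn = nn.BatchNorm1d(4, affine=False)

print(bn.weight, bn.bias)       # None, None -- no learnable scale/offset
print(bn.running_mean)          # all zeros before any forward pass

bn.train()
x = torch.randn(32, 4) * 3.0 + 5.0  # batch with non-zero mean, non-unit std
_ = bn(x)

# The running mean and variance have been updated toward the batch
# statistics, even though affine=False -- they are updated via a running
# average during the forward pass, not via gradients.
print(bn.running_mean)          # no longer all zeros
print(bn.running_var)
```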

For Deep SVDD, it is crucial to disable these affine transformations, as stated in section 3.3 of the paper:

Put differently, Proposition 2 implies that networks with bias terms can easily learn any constant function, which is independent of the input x ∈ X. It follows that bias terms should not be used in neural networks with Deep SVDD since the network can learn the constant function mapping directly to the hypersphere center, leading to hypersphere collapse.

Intuitively, if your network contains a bias term, the last layer could just learn to set all weights to zero and the bias to the center c, mapping everything to the center without even taking the input data into account.
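A sketch of that degenerate solution (the layer shape and the center c are made up for illustration): a final linear layer with a bias term can output c for every input by zeroing its weights, which is exactly the collapse that setting bias=False rules out.

```python
import torch
import torch.nn as nn

c = torch.tensor([1.0, -2.0, 0.5])  # hypothetical hypersphere center

# A final layer WITH a bias term can realize the constant function f(x) = c:
layer = nn.Linear(8, 3, bias=True)
with torch.no_grad():
    layer.weight.zero_()   # ignore the input entirely
    layer.bias.copy_(c)    # always output the center

x = torch.randn(16, 8)
out = layer(x)
print(torch.allclose(out, c.expand(16, 3)))  # True -> hypersphere collapse

# With bias=False this trick is impossible: zero weights give zero output,
# so the network cannot trivially map every input onto c.
```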
