Edit feature guide > batch norm folding. #3746

Open · wants to merge 1 commit into base: develop
35 changes: 20 additions & 15 deletions Docs/featureguide/bnf.rst
@@ -9,21 +9,22 @@ Batch norm folding
Context
=======

Batch norm folding is a technique widely used in deep learning inference runtimes, including the |qnn|_.
Batch normalization layers are typically folded into the weights and biases of adjacent convolution layers whenever possible to eliminate unnecessary computations.
To accurately simulate inference in these runtimes, it is generally advisable to perform batch norm folding on the floating-point model before applying quantization.
Doing so not only results in a speedup in inferences per second by avoiding unnecessary computations but also often improves the accuracy of the quantized model by removing redundant computations and requantization.
We aim to simulate this on-target behavior by performing batch norm folding here.
Batch norm folding (BNF) is a technique widely used in deep learning inference runtimes, including |qnn|_.
In BNF, batch normalization layers are folded into the weights and biases of adjacent convolution layers where possible to eliminate unnecessary computations.

To accurately simulate inference in these runtimes, perform BNF on the floating-point model before applying quantization. Doing so not only improves throughput (inferences per second) by eliminating unnecessary operations but also often improves the accuracy of the quantized model by removing redundant computation and requantization. AIMET enables you to apply BNF to the pre-quantized model as a precursor to simulating this on-target behavior in the quantization simulation (QuantSim) model.
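The arithmetic behind the fold is simple: for a conv output channel computing ``y = w·x + b`` followed by batch norm with parameters ``(gamma, beta, mean, var, eps)``, the folded channel uses ``w' = w·gamma/sqrt(var + eps)`` and ``b' = (b − mean)·gamma/sqrt(var + eps) + beta``. A minimal pure-Python sketch of this identity (illustrative only; this is not the AIMET API, and the numbers are made up):

```python
import math

def fold_bn_params(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into one conv output channel's weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta

# Original pipeline: conv output channel followed by batch norm
w, b = 0.8, 0.1
gamma, beta, mean, var = 1.5, -0.2, 0.05, 0.25

x = 2.0
y_conv = w * x + b
y_bn = (y_conv - mean) / math.sqrt(var + 1e-5) * gamma + beta

# Folded: a single affine op reproduces conv + BN exactly
w_f, b_f = fold_bn_params(w, b, gamma, beta, mean, var)
y_fold = w_f * x + b_f

assert math.isclose(y_bn, y_fold, rel_tol=1e-9)
```

Because the fold is an exact algebraic identity, the floating-point model's outputs are unchanged; only the redundant BN operation (and its extra requantization step on target) disappears.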

Workflow
========

Code example
------------
Procedure
---------

Step 1
~~~~~~

Load the model.

.. tab-set::
:sync-group: platform

@@ -32,7 +33,7 @@ Step 1

.. container:: tab-heading

Load the model for batch norm folding. In this code example, we will use MobileNetV2
This example uses the MobileNetV2 model.

.. literalinclude:: ../snippets/torch/apply_bnf.py
:language: python
@@ -58,7 +59,7 @@ Step 1

.. container:: tab-heading

Load the model for batch norm folding. In this code example, we will use MobileNetV2
This example uses the MobileNetV2 model.

.. literalinclude:: ../snippets/tensorflow/apply_bnf.py
:language: python
@@ -91,7 +92,7 @@ Step 1

.. container:: tab-heading

Load the model for batch norm folding. In this code example, we will convert PyTorch MobileNetV2 to ONNX and use it in the subsequent code
This example converts the PyTorch MobileNetV2 model to ONNX and uses the ONNX model in the subsequent steps.

.. literalinclude:: ../snippets/onnx/apply_bnf.py
:language: python
@@ -115,6 +116,8 @@ Step 1
Step 2
~~~~~~

Prepare the model, if required by the model framework.

.. tab-set::
:sync-group: platform

@@ -130,7 +133,7 @@ Step 2

.. container:: tab-heading

AIMET provides TensorFlow `prepare_model` API, which performs preprocessing on the user model if necessary
AIMET provides the TensorFlow `prepare_model` API, which pre-processes the user model if necessary.

.. literalinclude:: ../snippets/tensorflow/apply_bnf.py
:language: python
@@ -163,7 +166,7 @@ Step 2

.. container:: tab-heading

It's recommended to simplify the ONNX model before applying AIMET functionalities
We recommend that you simplify the ONNX model as follows.

.. literalinclude:: ../snippets/onnx/apply_bnf.py
:language: python
@@ -194,6 +197,8 @@ Step 2
Step 3
~~~~~~

Perform the batch norm folding.
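The AIMET fold APIs apply the scale-and-shift identity per output channel across the whole weight tensor. The sketch below illustrates that per-channel transformation in pure Python on a toy two-channel layer (a stand-in for a 1x1 convolution); all names and values here are invented for illustration and are not part of the AIMET API:

```python
import math

def fold_bn_into_layer(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN parameters into a weight matrix W (out x in) and bias b."""
    W_f, b_f = [], []
    for i, row in enumerate(W):
        scale = gamma[i] / math.sqrt(var[i] + eps)
        W_f.append([w * scale for w in row])
        b_f.append((b[i] - mean[i]) * scale + beta[i])
    return W_f, b_f

def affine(W, b, x):
    """Apply y = W @ x + b using plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

# Toy 2-channel layer and BN statistics (made-up values)
W = [[0.5, -1.0], [2.0, 0.3]]
b = [0.1, -0.4]
gamma, beta = [1.2, 0.7], [0.0, 0.5]
mean, var = [0.3, -0.1], [0.9, 0.04]

x = [1.5, -2.0]
y = affine(W, b, x)
y_bn = [(yi - m) / math.sqrt(v + 1e-5) * g + bt
        for yi, m, v, g, bt in zip(y, mean, var, gamma, beta)]

# Folded layer alone reproduces conv + BN
W_f, b_f = fold_bn_into_layer(W, b, gamma, beta, mean, var)
y_fold = affine(W_f, b_f, x)

assert all(math.isclose(a, c, rel_tol=1e-9) for a, c in zip(y_bn, y_fold))
```

In practice you call the framework-specific AIMET API shown in the snippets below rather than writing this by hand; the API also handles pattern matching (which BN layers are adjacent to which conv layers) and removes the folded BN modules from the graph.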

.. tab-set::
:sync-group: platform

@@ -202,7 +207,7 @@ Step 3

.. container:: tab-heading

Execute AIMET batch norm folding API
Execute the AIMET BNF API.

.. literalinclude:: ../snippets/torch/apply_bnf.py
:language: python
@@ -235,7 +240,7 @@ Step 3

.. container:: tab-heading

Execute AIMET batch norm folding API
Execute the AIMET BNF API.

.. literalinclude:: ../snippets/tensorflow/apply_bnf.py
:language: python
@@ -268,7 +273,7 @@ Step 3

.. container:: tab-heading

Execute AIMET batch norm folding API
Execute the AIMET BNF API.

.. literalinclude:: ../snippets/onnx/apply_bnf.py
:language: python