Edit feature guide > cross-layer equalization. #3749

Open: wants to merge 1 commit into base: develop
43 changes: 28 additions & 15 deletions Docs/featureguide/cle.rst
@@ -6,30 +6,41 @@ Cross-layer equalization

Context
=======
Quantization of floating-point models into lower bitwidths introduces quantization noise on the weights and activations, which often reduces model performance. To minimize quantization noise, AIMET recommends a :ref:`quantization workflow <opt-guide-quantization-workflow>` that includes a variety of post-training quantization (PTQ) techniques. You can learn more about these techniques in `Data-Free Quantization Through Weight Equalization and Bias Correction <https://arxiv.org/pdf/1906.04721>`_.
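The effect of lowering bitwidth can be illustrated with a small, self-contained sketch (plain NumPy, not an AIMET API): simulated quantization rounds each weight onto a uniform grid, and the rounding error, which is the quantization noise, grows as the bitwidth shrinks.

```python
import numpy as np

def quantize_dequantize(x, bits=8):
    """Simulate symmetric per-tensor quantization by rounding onto a uniform grid."""
    # The largest magnitude maps to the largest representable integer level.
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal(1000)

# Quantization noise is the gap between original and quantized values;
# the mean absolute error shrinks as the bitwidth grows.
noise_4bit = np.abs(weights - quantize_dequantize(weights, bits=4)).mean()
noise_8bit = np.abs(weights - quantize_dequantize(weights, bits=8)).mean()
```

At 4 bits the grid is much coarser than at 8 bits, so `noise_4bit` is substantially larger than `noise_8bit`; PTQ techniques such as CLE aim to reduce the impact of this noise without retraining.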

AIMET provides a cross-layer equalization (CLE) tool that applies the following PTQ techniques:

Batch Norm Folding
    This feature folds batch norm layers into adjacent convolutional and linear layers. For more on BNF, see :ref:`Batch norm folding <featureguide-bnf>`.

Cross Layer Scaling
    In some models, the parameter ranges of different channels in a layer vary widely, as shown in the first chart of the following figure.

    Cross-layer scaling attempts to equalize the distribution of weights per channel of consecutive layers. This gives different channels a similar range, so the same quantization parameters can be used for weights across all channels, as shown in the second chart.

.. figure:: ../images/cross_layer_scaling.png

High Bias Fold
    Cross-layer scaling can produce high bias parameter values in some layers. This technique folds part of a layer's bias into the parameters of the subsequent layer. It requires batch norm parameters to operate on and is not applied otherwise.
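The cross-layer scaling step above can be sketched in a few lines of NumPy. This is an illustrative re-implementation of the idea, not AIMET's API: assuming a ReLU-like (positively homogeneous) activation between two layers, dividing output channel i of the first layer by s_i = sqrt(r1_i / r2_i) and multiplying the matching input channel of the second layer by the same factor equalizes the per-channel ranges without changing the network output.

```python
import numpy as np

def cross_layer_scale(w1, b1, w2):
    """Equalize per-channel weight ranges of two consecutive layers.

    w1: (out1, in) weights of the first layer, b1: (out1,) its bias,
    w2: (out2, out1) weights of the second layer.
    """
    r1 = np.abs(w1).max(axis=1)   # range of each output channel of layer 1
    r2 = np.abs(w2).max(axis=0)   # range of each input channel of layer 2
    s = np.sqrt(r1 / r2)          # per-channel scale factors s_i = sqrt(r1_i / r2_i)
    # Dividing layer 1 by s and multiplying layer 2 by s leaves the composed
    # function unchanged when the activation between them is ReLU-like.
    return w1 / s[:, None], b1 / s, w2 * s[None, :]
```

After scaling, channel i of both layers spans the same range sqrt(r1_i * r2_i), so one set of quantization parameters fits all channels much better than before.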

Workflow
========

Setup
~~~~~~

Load the model.

.. tab-set::
:sync-group: platform

.. tab-item:: PyTorch
:sync: torch


.. container:: tab-heading

This code example uses MobileNetV2.

.. literalinclude:: ../snippets/torch/apply_cle.py
:language: python
:start-after: [setup]
:end-before: [step_1]
@@ -39,9 +50,9 @@ Setup

.. container:: tab-heading

This code example uses MobileNetV2.

We recommend applying the TensorFlow `prepare_model` API before applying other AIMET functionality. After preparation, the model contains consecutive convolutions, which can be optimized through cross-layer equalization.

.. literalinclude:: ../snippets/tensorflow/apply_cle.py
:language: python
@@ -103,9 +114,9 @@ Setup

.. container:: tab-heading

Load the model for cross-layer equalization. This example converts PyTorch MobileNetV2 to ONNX and uses the ONNX model in the subsequent code.

We recommend simplifying the ONNX model before applying AIMET functionality. After simplification, the model contains consecutive convolutions, which can be optimized through cross-layer equalization.

.. literalinclude:: ../snippets/onnx/apply_cle.py
:language: python
@@ -151,16 +162,18 @@ Setup
[[ 4.35139937e-03]]
[[ 2.57021021e-02]]]]

Execution
~~~~~~~~~

Apply cross-layer equalization.

.. tab-set::
:sync-group: platform

.. tab-item:: PyTorch
:sync: torch

Execute the AIMET cross-layer equalization API function.

.. literalinclude:: ../snippets/torch/apply_cle.py
:language: python
@@ -171,7 +184,7 @@ Execute AIMET cross-layer equalization API

.. container:: tab-heading

Execute the AIMET cross-layer equalization API function.

.. literalinclude:: ../snippets/tensorflow/apply_cle.py
:language: python
@@ -207,7 +220,7 @@ Execute AIMET cross-layer equalization API

.. container:: tab-heading

Execute the AIMET cross-layer equalization API function.

.. literalinclude:: ../snippets/onnx/apply_cle.py
:language: python