Streamline images in AIMET docs (#3727)
* Streamlining images in docs

Signed-off-by: Priyanka Dangi <[email protected]>
quic-pdangi authored Jan 14, 2025
1 parent 72e22e9 commit 57c38ea
Showing 14 changed files with 19 additions and 19 deletions.
4 changes: 2 additions & 2 deletions Docs/featureguide/autoquant.rst
@@ -26,8 +26,8 @@ Workflow

The workflow looks like this:

-.. image:: ../../images/auto_quant_v2_flowchart.png

+.. image:: ../images/auto_quant_1.png
+   :height: 450

Before entering the optimization workflow, AutoQuant prepares by:

1. Checking the validity of the model and converting the model into an AIMET quantization-friendly format (`Prepare Model`).
10 changes: 5 additions & 5 deletions Docs/featureguide/mixed precision/amp.rst
@@ -27,16 +27,16 @@ allowable accuracy drop, is passed to the API.
The function changes the QuantSim Sim model in place with different quantizers having different
bit-widths. This QuantSim model can be either exported or evaluated to get a quantization accuracy.
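What "different quantizers having different bit-widths" means can be pictured with a minimal fake-quantization sketch. This is plain illustrative Python, not the AIMET API; the value range and weights are made up for the example.

```python
# Illustrative sketch of fake quantization: snap floats to a uniform grid of
# 2**bitwidth - 1 steps over [v_min, v_max], then dequantize back to floats.
def fake_quantize(values, bitwidth, v_min, v_max):
    levels = 2 ** bitwidth - 1
    scale = (v_max - v_min) / levels
    out = []
    for v in values:
        clamped = min(max(v, v_min), v_max)     # clamp to the encoding range
        q = round((clamped - v_min) / scale)    # integer grid index
        out.append(v_min + q * scale)           # dequantized value
    return out

weights = [-0.93, -0.21, 0.07, 0.58, 0.99]      # hypothetical layer weights
w8 = fake_quantize(weights, 8, -1.0, 1.0)       # fine grid: small error
w2 = fake_quantize(weights, 2, -1.0, 1.0)       # coarse grid: large error
err8 = max(abs(a - b) for a, b in zip(weights, w8))
err2 = max(abs(a - b) for a, b in zip(weights, w2))
print(err8 < err2)  # → True: lowering bit-width raises quantization error
```

Mixed precision trades this error against memory and latency by choosing a bit-width per quantizer rather than one global setting.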

-.. image:: ../../images/work_flow_amp.png
+.. image:: ../../images/automatic_mixed_precision_1.png
:width: 900px

Mixed Precision Algorithm
=========================

The algorithm involves 4 phases:

-.. image:: ../../images/stages.png
-   :width: 150px
+.. image:: ../../images/automatic_mixed_precision_2.png
+   :width: 700px

1) Find layer groups
--------------------
@@ -45,8 +45,8 @@ The algorithm involves 4 phases:
This helps in reducing search space over which the mixed precision algorithm operates.
It also ensures that we search only over the valid bit-width settings for parameters and activations.
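The search-space reduction from grouping can be quantified with a short sketch (illustrative only; the candidate bit-widths, layer names, and grouping below are invented for the example):

```python
# Hypothetical sketch: assigning one precision candidate per *group* of layers
# shrinks the mixed-precision search space versus one candidate per layer.
from itertools import product

candidates = [(16, 16), (8, 16), (8, 8)]           # (activation bw, param bw)
layers = ["conv1", "conv2", "conv3", "conv4"]      # made-up layer names
groups = [["conv1", "conv2"], ["conv3", "conv4"]]  # layers quantized together

per_layer_space = len(candidates) ** len(layers)   # 3 ** 4 = 81 settings
per_group_space = len(candidates) ** len(groups)   # 3 ** 2 = 9 settings
print(per_layer_space, per_group_space)            # → 81 9

# Enumerating the reduced space: one candidate tuple per group.
settings = list(product(candidates, repeat=len(groups)))
assert len(settings) == per_group_space
```

With realistic networks the exponent difference is far larger, which is why the grouping step precedes the sensitivity analysis.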

-.. image:: ../../images/quantizer_groups.png
-   :width: 900px
+.. image:: ../../images/automatic_mixed_precision_3.png
+   :width: 900px

2) Perform sensitivity analysis (Phase 1)
-----------------------------------------
Binary file added Docs/images/auto_quant_1.png
Binary file added Docs/images/automatic_mixed_precision_1.png
Binary file added Docs/images/automatic_mixed_precision_2.png
Binary file added Docs/images/automatic_mixed_precision_3.png
Binary file added Docs/images/debugging_guidelines_1.png
Binary file added Docs/images/quantization_workflow_1.png
Binary file added Docs/images/quantization_workflow_2.png
Binary file added Docs/images/quantization_workflow_3.png
Binary file added Docs/images/quantization_workflow_4.png
Binary file added Docs/images/quantization_workflow_5.png
8 changes: 4 additions & 4 deletions Docs/userguide/debugging_guidelines.rst
@@ -15,11 +15,11 @@ Debugging workflow

The steps are shown as a flow chart in the following figure and are described in more detail below:

-.. image:: ../images/quantization_debugging_flow_chart.png
-   :height: 800
-   :width: 700
+.. image:: ../images/debugging_guidelines_1.png
+   :height: 500

-1. FP32 confidence check
+1. FP32 confidence checks
------------------------

First, ensure that the floating-point and quantized model behave similarly in the forward pass,
16 changes: 8 additions & 8 deletions Docs/userguide/quantization_workflow.rst
@@ -24,10 +24,10 @@ without requiring actual quantized hardware.

A quantization simulation workflow is illustrated here:

-.. image:: ../images/quant_use_case_1.PNG

-2. Post-training quantization
------------------------------
+.. image:: ../images/quantization_workflow_1.png
+2. Post-training quantization (PTQ):
+------------------------------------

Post-training quantization (PTQ) techniques make a model more quantization-friendly without requiring model retraining
or fine-tuning. PTQ is recommended as a go-to tool in a quantization workflow because:
@@ -37,7 +37,7 @@ or fine-tuning. PTQ is recommended as a go-to tool in a quantization workflow be

The PTQ workflow is illustrated here:

-.. image:: ../images/quant_use_case_3.PNG
+.. image:: ../images/quantization_workflow_2.png
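A core PTQ ingredient is calibration: estimating quantization ranges from a small unlabeled dataset without retraining. The following is a hedged plain-Python sketch of that idea, not the AIMET API; the sample values are invented.

```python
# Sketch of PTQ-style calibration: observe activation statistics once, then
# freeze a uniform 8-bit grid over the observed range.
def calibrate(samples):
    """Derive the quantization range from observed activations."""
    return min(samples), max(samples)

def quantize_act(x, lo, hi, bitwidth=8):
    levels = 2 ** bitwidth - 1
    scale = (hi - lo) / levels
    x = min(max(x, lo), hi)                      # clamp to calibrated range
    return lo + round((x - lo) / scale) * scale  # snap to the grid

calib = [-1.8, -0.4, 0.3, 1.1, 2.2]              # hypothetical activations
lo, hi = calibrate(calib)
scale = (hi - lo) / 255
# In-range values land within half a quantization step of the original...
assert abs(quantize_act(0.3, lo, hi) - 0.3) <= scale / 2 + 1e-9
# ...while out-of-range inputs are clamped, never extrapolated.
assert quantize_act(10.0, lo, hi) <= hi + 1e-9
```

Because only statistics are collected, this step needs no labels and no gradients, which is what makes PTQ cheap relative to QAT.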

3. Quantization-aware training
------------------------------
@@ -55,7 +55,7 @@ but it can provide better accuracy, especially at lower bit-widths.

A typical QAT workflow is illustrated here:

-.. image:: ../images/quant_use_case_2.PNG
+.. image:: ../images/quantization_workflow_3.png
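The mechanism behind QAT can be sketched on a one-parameter toy problem (this is illustrative Python with a made-up grid and loss, not AIMET code): the forward pass uses a quantized copy of the weight, while gradients update the underlying float weight, a straight-through-estimator-style trick.

```python
# Toy QAT sketch: train through a coarse quantizer so the float "shadow"
# weight learns to sit near a representable grid point.
def snap(w, scale=0.25):
    return round(w / scale) * scale      # coarse quantization grid

def train(target, w=0.0, lr=0.1, steps=200):
    for _ in range(steps):
        y = snap(w)                      # forward pass sees quantized weight
        grad = 2.0 * (y - target)        # gradient of (y - target) ** 2
        w -= lr * grad                   # STE: apply gradient to float weight
    return w, snap(w)

w_float, w_quant = train(target=0.6)
# The float weight hovers near the target; the deployed (quantized) weight
# ends on one of the two neighbouring grid points.
assert w_quant in (0.5, 0.75)
```

Real QAT does the same per tensor inside the training loop, which is why it costs more than PTQ but tolerates lower bit-widths.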

Supported precisions for on-target inference
============================================
@@ -108,7 +108,7 @@ lowering the precision.
The figure below illustrates the recommended quantization workflow and the steps required
to deploy the quantized model on the target device.

-.. figure:: ../images/overall_quantization_workflow.png
+.. figure:: ../images/quantization_workflow_4.png

Recommended quantization workflow

@@ -144,7 +144,7 @@ If the off-target quantized accuracy metric is not meeting expectations, you can
techniques to improve the quantized accuracy for the desired precision. The decision between
PTQ and QAT should be based on the quantized accuracy and runtime needs.

-.. image:: ../images/quantization_workflow.png
+.. image:: ../images/quantization_workflow_5.png

Once the off-target quantized accuracy metric is satisfactory, proceed to :ref:`evaluate the
on-target metrics<opt-guide-on-target-inference>` at this precision. If the on-target metrics
