Qairt Quantizer

Note

This tool is still in a Beta release status.

The qairt-converter tool converts non-quantized models into a non-quantized DLC file. Quantization requires a separate step: the qairt-quantizer tool quantizes the model to one of the supported fixed-point formats.

For example, the following command will quantize an Inception v3 DLC file, producing a quantized Inception v3 DLC file.

$ qairt-quantizer --input_dlc inception_v3.dlc --input_list image_file_list.txt \
                  --output_dlc inception_v3_quantized.dlc

To properly calculate the ranges for the quantization parameters, a representative set of input data needs to be fed into qairt-quantizer using the --input_list parameter. The input list specifies paths to raw image files used for quantization. For the format of --input_list, refer to the input_list argument in snpe-net-run for supported input formats (to calculate output activation encoding information for all layers, do not include the line that specifies desired outputs).
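As a sketch, a calibration input list is a plain text file with one raw input path per line (the file names below are hypothetical):

```shell
# Create a hypothetical calibration list; each line is the path to one
# preprocessed raw input tensor file (paths are illustrative).
mkdir -p calib
cat > image_file_list.txt << 'EOF'
calib/img_0001.raw
calib/img_0002.raw
calib/img_0003.raw
EOF
cat image_file_list.txt
```

A larger, more representative set of inputs generally yields better quantization encodings than a handful of files.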

The tool requires the batch dimension of the DLC input file to be set to 1 during model conversion. The batch dimension can be changed to a different value for inference by resizing the network during initialization.

Additional details

  • qairt-quantizer is largely similar to snpe-dlc-quant, with the following differences:

    • External overrides and source model encodings (QAT) cached in the float DLC during the conversion stage are applied by default. Use the command line argument --ignore_encodings to ignore overrides and source model encodings and let the quantizer runtime generate encodings from the calibration dataset provided through --input_list.

    • Float fallback feature: the command line option --float_fallback enables this feature. When it is specified, qairt-quantizer produces a fully quantized or mixed-precision graph by applying encoding overrides or source model encodings, propagating encodings across data-invariant ops, and falling back to float datatype for tensors with missing encodings.

      Note: --float_fallback and --input_list are mutually exclusive options; exactly one of them must be provided to the quantizer.

  • Outputs can be specified for qairt-quantizer by modifying the input_list in the following ways:

    #<output_layer_name>[<space><output_layer_name>]
    %<output_tensor_name>[<space><output_tensor_name>]
    <input_layer_name>:=<input_layer_path>[<space><input_layer_name>:=<input_layer_path>]
    

    Note: Output tensors and layers can be specified individually, but when both are specified, they must appear in the order shown above.
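For illustration, an input list that requests specific outputs might look like the following (the layer, tensor, and file names are hypothetical); the `#` output-layer line precedes the `%` output-tensor line, per the order above:

```shell
# Hypothetical input list requesting specific outputs during quantization.
# Layer names, tensor names, and input paths are illustrative only.
cat > input_list_with_outputs.txt << 'EOF'
#conv2d_1 conv2d_2
%probs_tensor
input:=calib/img_0001.raw
EOF
cat input_list_with_outputs.txt
```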

  • qairt-quantizer also supports quantization using AIMET, in place of the default quantizer, when the --use_aimet_quantizer command line option is provided. To use the AIMET quantizer, create the AIMET-specific environment by running the setup script:

    $ source {SNPE_ROOT}/bin/aimet_env_setup.sh --env_path <path where AIMET venv needs to be created> \
                                                --aimet_sdk_tar <AIMET Torch SDK tarball>
    
  • Advanced AIMET algorithms, AdaRound and AMP, are also supported in qairt-quantizer. Provide a YAML config file through the command line option --config_file and specify the algorithm ("adaround" or "amp") through --algorithms, along with the --use_aimet_quantizer flag.

  • The template of the YAML file for AMP is shown below:

    aimet_quantizer:
        datasets:
            <dataset_name>:
                dataloader_callback: '<path/to/labeled/dataloader/callback/function>'
                dataloader_kwargs: {arg1: val1, arg2: val2}

        amp:
            dataset: <dataset_name>
            candidates: [[[8, 'int'], [16, 'int']], [[16, 'float'], [16, 'float']]]
            allowed_accuracy_drop: 0.02
            eval_callback_for_phase2: '<path/to/evaluator/callback/function>'
    

dataloader_callback sets the path of a callback function that returns a labeled dataloader of type torch.utils.data.DataLoader. The data should be in the source network's input format.

dataloader_kwargs is an optional dictionary through which the user can provide keyword arguments for the callback function defined above.

dataset specifies the name of a dataset defined in the datasets section above.

candidates is a list of lists of the candidate bitwidth and datatype pairs for activations and parameters.

allowed_accuracy_drop specifies the maximum allowed drop in accuracy from the FP32 baseline. The Pareto front curve is plotted only up to the point where the allowed accuracy drop is met.

eval_callback_for_phase2 sets the path of an evaluator function that takes a batch of predicted values as its first argument and a batch of ground truth as its second argument, and returns the calculated metric as a float.
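For concreteness, a filled-in AMP config might look like the following sketch. The dataset name, kwargs, and numeric values are hypothetical, and the callback path placeholders are left as in the template above since their exact format is tool-defined:

```yaml
aimet_quantizer:
    datasets:
        calib_set:
            dataloader_callback: '<path/to/labeled/dataloader/callback/function>'
            dataloader_kwargs: {batch_size: 8}

    amp:
        dataset: calib_set
        candidates: [[[8, 'int'], [8, 'int']], [[16, 'int'], [16, 'int']]]
        allowed_accuracy_drop: 0.01
        eval_callback_for_phase2: '<path/to/evaluator/callback/function>'
```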

  • The template of the YAML file for AdaRound is shown below:

    aimet_quantizer:
        datasets:
            <dataset_name>:
                dataloader_callback: '<path/to/unlabeled/dataloader/callback/function>'
                dataloader_kwargs: {arg1: val1, arg2: val2}
    
        adaround:
            dataset: <dataset_name>
            num_batches: 1
    

dataloader_callback sets the path of a callback function that returns an unlabeled dataloader of type torch.utils.data.DataLoader. The data should be in the source network's input format.

dataloader_kwargs is an optional dictionary through which the user can provide keyword arguments for the callback function defined above.

dataset specifies the name of a dataset defined in the datasets section above.

num_batches specifies the number of batches to be used for AdaRound iterations.
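As a sketch, a filled-in AdaRound config might look like the following. The dataset name, kwargs, and batch count are hypothetical, and the callback path placeholder is left as in the template above:

```yaml
aimet_quantizer:
    datasets:
        calib_set:
            dataloader_callback: '<path/to/unlabeled/dataloader/callback/function>'
            dataloader_kwargs: {batch_size: 8}

    adaround:
        dataset: calib_set
        num_batches: 4
```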

  • AdaRound can also run in a default mode, without a config file, by passing "adaround" to the command line option --algorithms along with the --use_aimet_quantizer flag. This flow uses the data provided through the --input_list option to make rounding decisions.
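A default-mode AdaRound invocation might be composed as below. The command is only assembled and displayed here, not executed; the flags come from the text above, and the file names are hypothetical:

```shell
# Hypothetical default-mode AdaRound command line (assembled, not run here;
# DLC and input-list file names are illustrative).
cmd="qairt-quantizer --input_dlc model.dlc --input_list image_file_list.txt \
--output_dlc model_adaround.dlc --use_aimet_quantizer --algorithms adaround"
echo "$cmd"
```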

    Note:
    1. The AIMET Torch tarball naming convention should be as follows: aimetpro-release-<VERSION (optionally with build ID)>.torch-<cpu/gpu>-.*.tar.gz. For example, aimetpro-release-x.xx.x.torch-xxx-release.tar.gz.

    2. Once the setup script has been run, ensure that the AIMET_ENV_PYTHON environment variable is set to <AIMET virtual environment path>/bin/python.

    3. The minimum supported AIMET version is AIMET-1.33.0.
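The tarball naming convention from note 1 can be sanity-checked with a simple shell pattern (the file name below is hypothetical):

```shell
# Check a hypothetical tarball name against the documented pattern:
# aimetpro-release-<VERSION>.torch-<cpu/gpu>-.*.tar.gz
name="aimetpro-release-1.33.0.torch-gpu-release.tar.gz"
case "$name" in
  aimetpro-release-*.torch-*.tar.gz) result="ok" ;;
  *) result="bad" ;;
esac
echo "$result"
```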