Quantizing a Model

Each of the snpe-framework-to-dlc conversion tools converts a non-quantized model into a non-quantized DLC file. Quantizing requires a separate step: the snpe-dlc-quantize tool quantizes the model into one of the supported fixed-point formats.

For example, the following command will convert an Inception v3 DLC file into a quantized Inception v3 DLC file.

snpe-dlc-quantize --input_dlc inception_v3.dlc --input_list image_file_list.txt
                  --output_dlc inception_v3_quantized.dlc

The image list specifies paths to raw image files used for quantization. See snpe-dlc-quantize for more details.
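The raw files referenced by the image list are typically preprocessed tensors serialized as little-endian float32. As an illustration, the sketch below converts in-memory images into such raw files and writes the accompanying list file. The helper names, the 299x299 input size, and the [-1, 1] normalization (common for Inception v3) are assumptions, not part of the tool itself.

```python
import os
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    # Placeholder preprocessing: scale uint8 pixels to [-1, 1].
    # Replace with the exact preprocessing your model was trained with.
    return (image.astype(np.float32) / 127.5) - 1.0

def write_input_list(images, out_dir: str, list_path: str) -> None:
    # Serialize each preprocessed image as raw float32 and record its
    # path, one per line, in the input list consumed by snpe-dlc-quantize.
    os.makedirs(out_dir, exist_ok=True)
    with open(list_path, "w") as f:
        for i, image in enumerate(images):
            raw_path = os.path.join(out_dir, f"input_{i}.raw")
            preprocess(image).tofile(raw_path)  # little-endian float32
            f.write(raw_path + "\n")
```

The preprocessing applied here must match what the application will apply at inference time; otherwise the calibration ranges will not reflect the data the quantized model actually sees.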

The tool requires the batch dimension of the input DLC file to be set to 1 during model conversion. The batch dimension can be changed to a different value for inference by resizing the network during initialization.

For details on the quantization algorithm, and information on when to use a quantized model, see Quantized vs Non-Quantized Models.

Input data for quantization

To properly calculate the ranges for the quantization parameters, a representative set of input data needs to be used as input into snpe-dlc-quantize.
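To see why representative ranges matter, consider a common asymmetric 8-bit scheme that derives scale and offset from the observed minimum and maximum of the calibration data. This is a generic sketch, not necessarily the exact algorithm snpe-dlc-quantize implements (see Quantized vs Non-Quantized Models for that); the function names are hypothetical.

```python
import numpy as np

def quant_params(observed: np.ndarray, bits: int = 8):
    # Map the observed range [lo, hi] onto the integer range
    # [0, 2^bits - 1]. The range is widened to include 0.0 so that
    # zero is exactly representable.
    lo = min(float(observed.min()), 0.0)
    hi = max(float(observed.max()), 0.0)
    scale = (hi - lo) / (2 ** bits - 1)
    offset = round(-lo / scale) if scale > 0 else 0
    return scale, offset

def quantize(x: np.ndarray, scale: float, offset: int, bits: int = 8):
    # Dequantization is x ≈ scale * (q - offset).
    q = np.round(x / scale) + offset
    return np.clip(q, 0, 2 ** bits - 1).astype(np.uint8)
```

If the calibration inputs do not exercise the full range of activations the model produces in practice, values outside the observed [lo, hi] get clipped at inference time, which is why a representative data set matters.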

Experimentation shows that providing 5-10 input data examples in the input_list for snpe-dlc-quantize is usually sufficient, and is practical for quick experiments. For more robust quantization results, we recommend providing 50-100 examples of representative input data for the given model use case, without using data from the training set. The representative input data set should ideally cover all input data modalities that represent/produce all the output types/classes of the model, preferably with several input data examples per output type/class.
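One way to follow the guidance above is to draw a class-balanced sample when labels are available. The sketch below is an illustrative helper, not part of the SDK; the function name and the (path, label) input format are assumptions.

```python
import random
from collections import defaultdict

def select_calibration_set(examples, per_class=5, seed=0):
    # examples: list of (input_path, class_label) pairs drawn from a
    # held-out set (not the training set). Takes up to `per_class`
    # inputs from every class so the calibration set covers all
    # output types/classes.
    by_class = defaultdict(list)
    for path, label in examples:
        by_class[label].append(path)
    rng = random.Random(seed)  # fixed seed for reproducible selection
    selected = []
    for label, paths in sorted(by_class.items()):
        rng.shuffle(paths)
        selected.extend(paths[:per_class])
    return selected
```

The selected paths can then be written, one per line, into the input_list file passed to snpe-dlc-quantize.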

In Supported Network Layers, we have listed the layers/ops that are guaranteed to quantize successfully. For other layers/ops, no guarantees can be made.