CPU Fixed Point Mode

Introduction

The Qualcomm® Neural Processing SDK supports quantized fixed-point model execution on the CPU runtime. As with quantized models on the DSP runtime, fixed-point execution provides better performance at a small cost in accuracy. To maintain backward compatibility, the Qualcomm® Neural Processing SDK continues to dequantize models from quantized format to 32-bit float format on the CPU runtime unless fixed-point execution mode is explicitly enabled. Multiple models can run in parallel and independently across CPU, GPU, and DSP; only the models configured for CPU fixed-point mode execute in that mode.
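To make the accuracy trade-off concrete, the following is a minimal sketch of a generic 8-bit affine quantization round trip. It is not the SDK's implementation, and the encoding the SDK actually uses may differ; the names QuantParams, makeParams, quantize, and dequantize are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type (not part of the SDK): maps a float range
// [min, max] onto the 256 codes of an unsigned 8-bit integer.
struct QuantParams {
    float scale;   // float distance between two adjacent codes
    float offset;  // float value represented by code 0
};

// Derive the encoding for a given float range.
QuantParams makeParams(float min, float max) {
    assert(max > min);
    return QuantParams{(max - min) / 255.0f, min};
}

// Float -> 8-bit code: round to the nearest step, then clamp to [0, 255].
std::uint8_t quantize(float x, const QuantParams& p) {
    float code = std::round((x - p.offset) / p.scale);
    code = std::min(std::max(code, 0.0f), 255.0f);
    return static_cast<std::uint8_t>(code);
}

// 8-bit code -> float. This is the kind of step that is skipped when a
// runtime keeps working on the quantized values directly.
float dequantize(std::uint8_t q, const QuantParams& p) {
    return p.offset + p.scale * static_cast<float>(q);
}
```

Under these assumptions, a value inside the range comes back from dequantize(quantize(x, p), p) with an error of at most about half a step (p.scale / 2), while values outside the range are clamped to the nearest end of the range. This rounding is one source of the small accuracy cost mentioned above.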

The following sections describe CPU Fixed Point Mode support in the Qualcomm® Neural Processing SDK:

  1. APIs for CPU Fixed Point Mode

    1. C API

    2. C++ API

    3. Java API

  2. Running snpe-net-run command-line tool with quantized DLC

APIs for CPU Fixed Point Mode

This section describes the Qualcomm® Neural Processing SDK CPU Fixed Point Mode APIs. The Qualcomm® Neural Processing SDK provides a C API, a C++ API, and a Java API for enabling this execution mode.

C API

This section describes the Qualcomm® Neural Processing SDK CPU Fixed Point Mode API for C. Include the following header to use the CPU Fixed Point functionality:

#include "SNPE/SNPEBuilder.h"

For a complete example, please refer to the sample application file located at $SNPE_ROOT/examples/SNPE/NativeCpp/SampleCode_CAPI/ITensor/main.cpp .

CPU Fixed Point Mode can be enabled in C with the following function:

Snpe_ErrorCode_t Snpe_SNPEBuilder_SetCpuFixedPointMode(Snpe_SNPEBuilder_Handle_t snpeBuilderHandle, bool cpuFxpMode);

Parameters:

  • snpeBuilderHandle: Handle to access the SNPEBuilder object.

  • cpuFxpMode: Boolean. If set to true, enables fixed-point mode.

C++ API

This section describes the Qualcomm® Neural Processing SDK CPU Fixed Point Mode API for C++. Include the following header to use the CPU Fixed Point functionality:

#include "SNPE/SNPEBuilder.hpp"

For a complete example, please refer to the sample application file located at $SNPE_ROOT/examples/SNPE/NativeCpp/SampleCode_CPP/main.cpp .

CPU Fixed Point Mode can be enabled in C++ with the following method:

zdl::SNPE::SNPEBuilder::setCpuFixedPointMode(bool cpuFxpMode);

The parameter cpuFxpMode is a boolean. If set to true, it enables fixed-point mode.

Java API

This section describes the Qualcomm® Neural Processing SDK CPU Fixed Point Mode API for Java. Import the following packages to use the CPU Fixed Point Mode functionality:

import com.qualcomm.qti.snpe.SNPE;
import com.qualcomm.qti.snpe.NeuralNetwork;

For a complete example, please refer to the sample Android application files located at $SNPE_ROOT/examples/SNPE/android/image-classifiers/app/src/.

CPU Fixed Point Mode can be enabled in Java with the following method:

SNPE.NeuralNetworkBuilder.setCpuFixedPointMode(boolean fxpMode);

The parameter fxpMode is a boolean. If set to true, it enables fixed-point mode. The method returns the current instance of NeuralNetworkBuilder.

Note

  • If the CPU fixed-point mode is enabled, Qualcomm® Neural Processing SDK expects a quantized DLC to be provided and CPU runtime to be selected.

  • Not every layer/op is currently enabled for CPU fixed-point execution. If a layer/op in the model is not supported in CPU fixed-point mode, the user can continue to execute the model in CPU floating-point mode.

  • The DSP runtime can fall back to CPU fixed-point mode provided that fallback and CPU fixed-point execution mode are both enabled. Unlike the case with CPU floating-point mode, there is no dequantization step when transitioning between runtimes, which may improve performance.
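The conditions in the notes above can be sketched as a small decision helper. This is only an illustration under stated assumptions: ModelSetup, Runtime, shouldEnableCpuFixedPoint, applyCpuMode, and StandInBuilder are hypothetical names, and whether every layer/op is supported in fixed-point mode is assumed to be known to the caller. StandInBuilder exists only so the sketch compiles without the SDK; in an application, the builder argument would be the SDK's SNPEBuilder, whose setCpuFixedPointMode(bool) method is described above.

```cpp
// Hypothetical types for illustration; none of these names come from the SDK.
enum class Runtime { Cpu, Gpu, Dsp };

struct ModelSetup {
    bool dlcIsQuantized;               // a quantized DLC is provided
    Runtime runtime;                   // runtime selected for this model
    bool cpuFallbackEnabled;           // fallback to CPU is enabled
    bool allOpsSupportedInFixedPoint;  // assumed to be known by the caller
};

// Returns true when the notes above suggest fixed-point mode can be used.
bool shouldEnableCpuFixedPoint(const ModelSetup& s) {
    if (!s.dlcIsQuantized) return false;               // needs a quantized DLC
    if (!s.allOpsSupportedInFixedPoint) return false;  // stay in floating-point mode
    if (s.runtime == Runtime::Cpu) return true;        // CPU runtime selected
    // DSP may fall back to CPU fixed-point mode when fallback is enabled.
    return s.runtime == Runtime::Dsp && s.cpuFallbackEnabled;
}

// Stand-in used only so this sketch compiles without the SDK.
struct StandInBuilder {
    bool cpuFixedPoint = false;
    StandInBuilder& setCpuFixedPointMode(bool cpuFxpMode) {
        cpuFixedPoint = cpuFxpMode;
        return *this;
    }
};

// Works with any builder type that exposes setCpuFixedPointMode(bool).
template <typename Builder>
void applyCpuMode(Builder& builder, const ModelSetup& s) {
    builder.setCpuFixedPointMode(shouldEnableCpuFixedPoint(s));
}
```

Keeping the decision in one function means a model with a non-quantized DLC or an unsupported layer/op leaves the builder in the default floating-point mode, instead of requesting fixed-point mode and failing later.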

Running snpe-net-run command-line tool with quantized DLC

This section outlines the use of the snpe-net-run command-line tool with a quantized DLC. Usage of snpe-net-run is largely unchanged from its typical usage. The snpe-net-run tool enables CPU fixed-point mode through the command-line option --enable_cpu_fxp.

Example usage is as follows:

snpe-net-run --container <path_to_quantized_dlc> --input_list <path_to_input_list> --enable_cpu_fxp