CPU

This section provides information specific to the QNN CPU backend.

API Specializations

This section contains information related to API specializations for the CPU backend. All QNN CPU backend specializations are available under the <QNN_SDK_ROOT>/include/CPU/ directory.

The current version of the QNN CPU backend API is:

QNN_CPU_API_VERSION_MAJOR 1
QNN_CPU_API_VERSION_MINOR 1
QNN_CPU_API_VERSION_PATCH 0

Supported Operations

QNN CPU supports running quantized 8-bit and 32-bit floating-point networks on currently enabled Qualcomm chipsets and platform targets. The list of operations supported by QNN CPU in quantized and floating-point precision can be found under the Backend Support CPU column in Supported Operations.

Note that even though FP32 and INT8 are listed under separate columns in Supported Operations, they are enabled by the same CPU backend library.

Op Package

The CPU Op Package provides an interface for interacting with OpPackage libraries registered with the CPU backend. More details about the interface can be found in QnnCpuOpPackage.h.

Op Package Writing Guidelines

Detailed information regarding op package writing will be provided in a future release. In the meantime, please refer to the op package example, which can be found in ${QNN_SDK_ROOT}/examples/OpPackage/CPU/.

Debug Callback

Debug Callback is a feature that allows the user to receive the intermediate output of each op as the CPU backend executes it. QNN CPU provides a custom graph configuration option to enable it. A client sets it up as follows:

// Custom CPU graph config that enables the per-op debug callback.
QnnCpuGraph_CustomConfig_t customConfig;
customConfig.option = QNN_CPU_GRAPH_CONFIG_OPTION_OP_DEBUG_CALLBACK;
customConfig.cpuGraphOpDebug.cpuGraphOpDebugCallback = <QnnCpuGraph_OpDebugCallback_t function>;
customConfig.cpuGraphOpDebug.callBackParam = <param to be returned with callback function>;

// Wrap the backend-specific config in a generic graph config.
QnnGraph_Config_t graphConfig;
graphConfig.option       = QNN_GRAPH_CONFIG_OPTION_CUSTOM;
graphConfig.customConfig = &customConfig;

// NULL-terminated list of graph configs, as passed to QnnGraph_create().
const QnnGraph_Config_t* pGraphConfig[] = {&graphConfig, NULL};

Context Configs

QnnContext custom configs (see QnnCpuContext_CustomConfig_t) are supported.

Graph Configs

QnnGraph custom configs (see QnnCpuGraph_CustomConfig_t) are supported.

Qualcomm Matrix Extension (QMX)

To enable QMX kernels in the CPU Backend, a “use_qmx” option is available in both the custom context config and the custom graph config. The “use_qmx” option in the custom context config enables or disables QMX for all graphs within the context. To enable or disable QMX for a specific graph, set the “use_qmx” option in that graph’s custom config. When a model is run with qnn-net-run, the QMX flow is enabled or disabled through QNN CPU Backend extensions.

QNN CPU Backend Extensions

The QNN backend extension feature facilitates the use of backend-specific APIs, namely custom configurations. More documentation on backend extensions can be found under qnn-net-run. The scope of QNN backend extensions is limited to qnn-net-run. The QNN CPU Backend Extension is an interface that provides custom options to the CPU Backend. To use backend extension parameters with qnn-net-run, pass the --config_file argument:

$ qnn-net-run --model <qnn_model_name.so> \
              --backend <path_to_backend_library>/libQnnCpu.so \
              --output_dir <output_dir_for_result> \
              --input_list <path_to_input_list.txt> \
              --config_file <path_to_config_file.json>

The config_file.json passed using the --config_file option should have the following structure:

{
    "backend_extensions" :
    {
        "shared_library_path" : "path to libQnnCpuNetRunExtensions.so",
        "config_file_path" : "path to backend_config.json"
    }
}
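
As a rough illustration of how the two pieces above fit together, the following Python sketch builds the config_file.json contents and assembles the matching qnn-net-run argument list. The function names and all file paths are hypothetical placeholders, not part of the SDK. Only the JSON keys and the command-line flags come from this section. The command is assembled but not executed.

```python
import json
from pathlib import Path


def build_net_run_config(extensions_lib, backend_config):
    """Return the config_file.json contents as a dict (keys as documented above)."""
    return {
        "backend_extensions": {
            "shared_library_path": str(extensions_lib),
            "config_file_path": str(backend_config),
        }
    }


def net_run_command(model, backend_lib, output_dir, input_list, config_file):
    """Assemble the qnn-net-run argument list; nothing is executed here."""
    return [
        "qnn-net-run",
        "--model", str(model),
        "--backend", str(backend_lib),
        "--output_dir", str(output_dir),
        "--input_list", str(input_list),
        "--config_file", str(config_file),
    ]


if __name__ == "__main__":
    # All paths below are placeholders (assumptions), not SDK defaults.
    config = build_net_run_config(
        "libQnnCpuNetRunExtensions.so", "backend_config.json"
    )
    config_path = Path("config_file.json")
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    cmd = net_run_command(
        "libqnn_model.so", "libQnnCpu.so", "output", "input_list.txt", config_path
    )
    print(" ".join(cmd))
```

Returning the argument list rather than launching the tool keeps the sketch usable on a host machine where qnn-net-run is not installed. The list can be handed to a process launcher on the target if desired.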

The CPU Backend Extensions custom configuration file (that is, backend_config.json) includes context-level and graph-level configurations. Currently, this configuration supports enabling or disabling the QMX execution flow.

Configuration     Option        Type
Context config    use_qmx       boolean
Graph config      graph_name    string
Graph config      use_qmx       boolean

The “use_qmx” option in the graph configuration takes precedence over the same option in the context configuration. If “use_qmx” is not specified in the graph configuration, the CPU Backend uses the value from the context configuration to determine whether to enable or disable the QMX flow. The schema for the CPU Backend Extensions custom configuration file is shown below:

{
  "type": "object",
  "properties": {
      "graphs": {
          "type": "array",
          "items": {
              "type": "object",
              "properties": {
                  "graph_name": {"type": "array", "items": {"type": "string" }},
                  "use_qmx": {"type": "boolean"}
              }
          }
      },
      "context": {
          "type": "array",
          "items": {
              "type": "object",
              "properties": {
                  "use_qmx": { "type": "boolean" }
              }
          }
      }
  }
}

An example config can be found at this path: ${QAIRT_SDK_ROOT}/examples/QNN/BackendExtensions/CPU/qnn_cpu_example_config.json.
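
To make the precedence rule concrete, here is a minimal Python sketch with a hypothetical backend_config.json instance and a helper that resolves which “use_qmx” value applies to a given graph. The graph names, the constant, and the helper are illustrative assumptions. The helper mirrors the rule as described in this section and is not the backend's implementation. Following the schema, "graph_name" is written as a list of strings. The helper returns None when neither level sets the option, because this section does not state the default.

```python
import json

# Hypothetical instance shaped after the schema above; graph names are placeholders.
BACKEND_CONFIG = {
    "graphs": [
        {"graph_name": ["graph_a"], "use_qmx": False},
    ],
    "context": [
        {"use_qmx": True},
    ],
}


def effective_use_qmx(config, graph_name):
    """Return the use_qmx value that applies to graph_name, or None if unset.

    Graph-level settings are checked first, then context-level settings,
    mirroring the documented precedence.
    """
    for entry in config.get("graphs", []):
        if graph_name in entry.get("graph_name", []) and "use_qmx" in entry:
            return entry["use_qmx"]
    for entry in config.get("context", []):
        if "use_qmx" in entry:
            return entry["use_qmx"]
    return None  # default behaviour is not specified in this section


if __name__ == "__main__":
    # Serialized form that could be saved as backend_config.json.
    print(json.dumps(BACKEND_CONFIG, indent=2))
    # graph_a has its own setting, so the graph-level value applies.
    print("graph_a:", effective_use_qmx(BACKEND_CONFIG, "graph_a"))
    # graph_b has no graph-level entry, so the context value applies.
    print("graph_b:", effective_use_qmx(BACKEND_CONFIG, "graph_b"))
```

With this instance, graph_a runs without QMX even though the context enables it, while any graph without its own entry inherits the context setting.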

Note

CPU Backend extension support is enabled for Android targets.