Tutorial: Converting and executing a CNN model with custom operations

The following tutorial demonstrates end-to-end usage of the QNN tools and the QNN API in the context of user-created custom operations. By custom operations, we mean operations that are defined in an operation definition configuration file in tandem with a user-defined QNN op package library for execution.

This process begins with a trained source framework model containing such operations, along with a config file containing their definitions. One of the available QNN converters translates the model into a series of QNN API calls. Additionally, the same config file is used to create an op package library skeleton via qnn-op-package-generator. The completed skeleton and the QNN model source files are then compiled into shared libraries for a specific target, which can execute on a particular backend.

The tutorial will use Inception V3 as the source framework model and the qnn-net-run executable as the example application. The execution will show usage on the CPU, DSP and HTP backends on both host (for CPU and HTP) and device.

The sections of the tutorial are as follows:

  1. Tutorial Setup

  2. Custom Op XML Config Creation

  3. Creating a QNN Custom Op Package

  4. Model Conversion

  5. Model Build

  6. Model Execution

Note

If developing on Windows, the following sections must be executed in the WSL (x86) environment: Tutorial Setup, Creating a QNN Custom Op Package, and Model Conversion. The Model Build and Model Execution sections, however, should be executed on Windows natively. Please refer to Integration Workflow on Windows to see more details of the workflow.

Tutorial Setup

The tutorial assumes general setup instructions have been followed at Setup.

Additionally, this tutorial requires the Inception V3 TensorFlow model file and sample images. Acquiring these is handled by the provided setup script, setup_inceptionv3.py. The script is located at:

${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py

Usage is as follows:

usage: setup_inceptionv3.py [-h] -a ASSETS_DIR [-c] [-cu] [-d]
                            [-g {cpu,dsp,htp}] [-q]

Prepares the inception_v3 assets for tutorial examples.

required arguments:
  -a ASSETS_DIR, --assets_dir ASSETS_DIR
                    directory containing the inception_v3 assets

optional arguments:
  -c, --convert_model   Convert and compile model once acquired.
  -cu, --custom         Convert the model using Relu as a custom operation.
                        Only available if the -c or --convert_model option is
                        chosen
  -d, --download        Download inception_v3 assets to inception_v3 example
                        directory
  -g {cpu,dsp,htp}, --generate_packages {cpu,dsp,htp}
                        Generate and compile custom op packages for HTP, CPU
                        and DSP
  -q, --quantize_model  Quantize the model during conversion. Only available
                        if the -c or --convert_model option is chosen

To run the script use:

$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d

This will populate the model file at:

${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb

And the raw images at:

${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
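As a quick sanity check (a convenience sketch, not part of the SDK tooling), the presence of the downloaded assets can be verified from the shell before proceeding. The check_asset helper name below is our own:

```shell
# Report whether each expected asset exists; a "missing" line means the
# download step above should be re-run.
check_asset() {
    if [ -e "$1" ]; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

check_asset "${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb"
check_asset "${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped"
```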

Note

If developing on Windows, please run the above steps on WSL (x86).

Custom Op XML Config Creation

A custom operation is defined in QNN as an XML file containing a description of its inputs, outputs and attributes according to an XML schema outlined at XML OpDef Schema Breakdown.

Instructions on writing configs can be found at XML OpDef Schema Breakdown with examples shown at Example XML Op Def Configs.

In the SDK, users can find working examples at:

${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator

Given a custom op config file defining an operation and a model containing it, any of the available QNN converters will produce the model artifacts. Similarly, the config file can be provided to the qnn-op-package-generator tool to produce a QNN op package skeleton.

In this tutorial, the following XML config files can be used with the Inception V3 model obtained above:

${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageHtp.xml
${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageCpu.xml
${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageDsp.xml

Creating a QNN Custom Op Package

Creating a custom op package comprises package generation, implementation, and compilation into shared libraries. Although users can create their own packages manually, we highly recommend this workflow to avoid unexpected errors. We will cover the steps in the following sections.

Generating a Custom Op Package Skeleton

For the following section, it is assumed that setup instructions have been run and the qnn-op-package-generator tool can be used. In this tutorial, packages can be generated for the CPU, DSP and HTP backends respectively.

Note

Packages for CPU backends can be generated on a Windows host. Details are provided below.

To generate the CPU package on Linux, run:

$  ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator \
   -p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageCpu.xml     \
   -o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU # this can be any path

The above command will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage

To generate the CPU package on Windows, open WSL (x86) and run:

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator \
  -p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageCpu.xml \
  -o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU \
  --gen_cmakelists

The above command will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage

To generate the DSP package:

$  ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator \
   -p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageDsp.xml     \
   -o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP # this can be any path

The above command will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage

To generate the HTP package:

$    ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator  \
     -p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageHtp.xml \
     -o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP # this can be any path

The above command will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage

Compiling QNN Custom Op Packages

The artifacts produced in the previous section will be directories containing partially complete skeleton source files and a makefile for generating QNN op package shared libraries. These skeleton files provide all the hooks that users can utilize to implement their custom operations.

For this tutorial, the generated examples should be replaced with an already completed example from the SDK. Using the completed source code, the generated packages can be compiled for the relevant targets.

Optionally, each of the CPU, DSP, and HTP packages can be generated and compiled using the provided setup script through the -g argument.

$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -g <cpu or dsp or htp>

Note

Environment variables must be correctly set on the command line to replicate the compilation steps. Please see the manual compilation instructions below for further clarification on backend-specific requirements.

Compiling for CPU on Linux

First, the generated example should be replaced with an already completed example from the SDK using the following commands:

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/CPU/Relu.cpp ./src/ops

The CPU op package can be compiled using the following commands:

$ export CXX=<path-to-clang++>/clang++
$ export ANDROID_NDK_ROOT=<path-to-ndk-build>
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage && make cpu

And the following artifacts are produced:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/x86_64-linux-clang/libReluOpPackage.so

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/aarch64-android/libReluOpPackage.so

Compiling for CPU on Windows

First, the generated example should be replaced with an already completed example from the SDK using the following commands:

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/CPU/Relu.cpp ./src/ops

Now, open Developer PowerShell for VS 2022.

The CPU op package can be compiled using the following commands:

Note

When compiling the CPU op package for Windows, the user may select either the x64 or arm64 architecture using the following flag when invoking cmake: -A [x64 | arm64].

To compile the CPU op package for Windows host, use the following command:

$ cd ${QNN_SDK_ROOT}\examples\Models\InceptionV3\InceptionV3OpPackage\CPU\ReluOpPackage
$ cmake -S . -B build -A x64
$ cd build
$ cmake --build . --config release

This will produce the ${QNN_SDK_ROOT}\examples\Models\InceptionV3\InceptionV3OpPackage\CPU\ReluOpPackage\build\Release\ReluOpPackage.dll for execution on Windows Host.

To compile the CPU op package for Windows device the steps are the same as above, except the -A x64 argument in the second line becomes -A arm64.

Compiling for DSP on Linux

First, the generated example should be replaced with an already completed example from the SDK using the following commands:

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/DSP/Relu.cpp ./src/ops
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/DSP/DspOps.hpp ./include

The DSP op package can be compiled using the following commands:

$ export X86_CXX=<path-to-clang++>
$ export HEXAGON_SDK_ROOT=<path-to-hexagon-sdk>
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage && make all

Please see Compiler Toolchains for the HEXAGON_SDK_ROOT and X86_CXX versions required by this hardware.

This would produce the following artifacts for DSP:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage/build/DSP/libQnnReluOpPackage.so

Compiling for HTP on Linux

First, the generated example should be replaced with an already completed example from the SDK using the following commands:

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/HTP/Relu.cpp ./src/ops

The HTP op package can be compiled using the following commands:

$ export X86_CXX=<path-to-clang++>
$ export HEXAGON_SDK_ROOT=<path-to-hexagon-sdk>
$ export QNN_INCLUDE=${QNN_SDK_ROOT}/include/QNN
$ export ANDROID_NDK_ROOT=<path-to-ndk-build>
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage && make all

Please see Compiler Toolchains for the HEXAGON_SDK_ROOT and X86_CXX versions required by this hardware.

This would produce the following artifacts for HTP:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/aarch64-android/libQnnReluOpPackage.so

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon-v68/libQnnReluOpPackage.so

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so

Model Conversion

After the model assets have been acquired, the model can be converted into a series of QNN API invocations and subsequently built for use by an application. The Relu operation in Inception V3 can be converted into a QNN model as a custom operation by providing the config as an input to any of the available QNN converters. This is done using the --op_package_config or -opc option.

Note

A quantized model is needed for use on the HTP and DSP backends. See Model Quantization to generate a quantized model.

To convert the Inception V3 model use the qnn-tensorflow-converter:

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-tensorflow-converter \
  --input_network ${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb \
  --input_dim input 1,299,299,3 \
  --out_node InceptionV3/Predictions/Reshape_1 \
  --output_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.cpp \
  --op_package_config ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/config/ReluOpPackageCpu.xml

This will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.cpp

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.bin

The artifacts include a .cpp file containing the sequence of API calls, and a .bin file containing the static data associated with the model.

All Relu op nodes within the model should have the same package name as the name provided in the XML config. This is essential for tying the operation to the shared library produced in the previous sections. Users should note the contrast with other operations bearing the package name qti.aisw, which is the default in-built op package provided with the SDK. Additionally, although ReluOpPackageCpu.xml is used here, any of the other previously listed config files can be substituted for conversion.
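The package-name binding described above can be spot-checked by grepping the generated model source for the two package names. The snippet below runs the same checks against a small mock stand-in, since the real Inception_v3.cpp is large; the addNode lines are illustrative placeholders, not the converter's actual output.

```shell
# Build a tiny stand-in for the generated model source.
mock_cpp=$(mktemp)
cat > "$mock_cpp" <<'EOF'
addNode(... "ReluOpPackage" ... "Relu" ...)   // bound to the custom op package
addNode(... "qti.aisw" ... "Conv2d" ...)      // default in-built op package
addNode(... "ReluOpPackage" ... "Relu" ...)
EOF

# Count how many node lines are bound to each package name.
custom_nodes=$(grep -c 'ReluOpPackage' "$mock_cpp")
default_nodes=$(grep -c 'qti.aisw' "$mock_cpp")
echo "custom-package nodes:  $custom_nodes"
echo "default-package nodes: $default_nodes"
rm -f "$mock_cpp"
```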

Note

If developing on Windows, please run the above in the WSL (x86) environment.

Model Quantization

To use a quantized model instead of a floating-point model, we need to provide the CPU op package created above to the converter using the --op_package_lib or -opl option. Without this option, the default qti.aisw package will be used.
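As seen in the command below, the value passed to --op_package_lib packs the library path and its interface-provider symbol into one colon-separated string. A minimal sketch of how such a value splits into its two fields (the value itself is taken from this tutorial's CPU package):

```shell
# Format: <path-to-op-package-library>:<interface-provider-symbol>
op_package_lib="libs/x86_64-linux-clang/libReluOpPackage.so:ReluOpPackageInterfaceProvider"

lib_path="${op_package_lib%:*}"     # text before the last ':'
provider="${op_package_lib##*:}"    # text after the last ':'
echo "library:  $lib_path"
echo "provider: $provider"
```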

The command below converts and quantizes the model using the custom op config and library:

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-tensorflow-converter \
  --input_network ${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb \
  --input_dim input 1,299,299,3 \
  --out_node InceptionV3/Predictions/Reshape_1 \
  --output_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.cpp \
  --op_package_config ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/config/ReluOpPackageCpu.xml  \
  --input_list ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped/raw_list.txt \
  --op_package_lib ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/x86_64-linux-clang/libReluOpPackage.so:ReluOpPackageInterfaceProvider

This will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.cpp

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.bin

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized_quantization_encodings.json

Note

When quantizing a model during conversion, the input list must contain absolute paths to the input data.
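One way to satisfy this requirement is to rewrite a relative-path input list with realpath. The sketch below works in a scratch directory with dummy .raw files; in practice, substitute the real data/cropped folder and its raw_list.txt:

```shell
# Create a scratch directory with a relative-path input list and dummy inputs.
workdir=$(mktemp -d)
printf 'img_0.raw\nimg_1.raw\n' > "$workdir/raw_list.txt"
touch "$workdir/img_0.raw" "$workdir/img_1.raw"

# Resolve each entry to an absolute path, writing a new list the
# quantizer can consume.
(
  cd "$workdir"
  while IFS= read -r f; do
    realpath "$f"
  done < raw_list.txt > raw_list_abs.txt
)
cat "$workdir/raw_list_abs.txt"
```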

Model Build

Once the model is converted, it is built into a shared library with qnn-model-lib-generator:

Model Build on Linux Host

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator \
  -c ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.cpp \
  -b ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.bin \
  -o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs # This can be any path

This will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/libInception_v3.so

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3.so

Note

By default libraries are built for all targets. To compile for a specific target, use the -t <target> option with qnn-model-lib-generator. Choices of <target> are aarch64-android and x86_64-linux-clang.

Optionally, the above steps can be completed with the provided setup script. To convert and build the Inception V3 model using the script, run:

$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d -c -cu

This will produce the same artifacts as above.

To build the quantized model, the steps are the same as above:

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator \
  -c ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.cpp \
  -b ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.bin \
  -o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs # This can be any path

This will produce the following artifacts:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/libInception_v3_quantized.so

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so

Optionally, the above steps can be completed with the provided setup script. To convert, quantize, and build the Inception V3 model using the script, run:

$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d -c -q --custom

Note

When both the quantize and custom options are selected, the CPU op package will be generated and compiled even if --generate_packages is not provided, since the library is needed for accurate quantization.

This will produce the same artifacts as above.

Model Build on Windows Host

To build the model files into a DLL library on a Windows host, we will use Developer PowerShell for VS 2022.

Make sure you set up the ${QNN_SDK_ROOT} environment variable per Environment Setup for Windows.

For Windows native/x86_64 PC developers

$ py -3 ${QNN_SDK_ROOT}\bin\x86_64-windows-msvc\qnn-model-lib-generator `
    -c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.cpp `
    -b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.bin `
    -o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib  # this can be any path

For Windows on Snapdragon developers

$ py -3 ${QNN_SDK_ROOT}\bin\aarch64-windows-msvc\qnn-model-lib-generator `
    -c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.cpp `
    -b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.bin `
    -o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib  # this can be any path

This will produce one of the following artifacts, depending on which command was used:

  • ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\x64\Inception_v3.dll

  • ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\ARM64\Inception_v3.dll

Note

The model DLL library can be built for x64 or ARM64 platforms by specifying the desired platform.

To build the quantized model files into a DLL library on a Windows host, we will again use Developer PowerShell for VS 2022.

Make sure you set up the ${QNN_SDK_ROOT} environment variable per Environment Setup for Windows.

For Windows native/x86_64 PC developers

$ py -3 ${QNN_SDK_ROOT}\bin\x86_64-windows-msvc\qnn-model-lib-generator `
    -c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.cpp `
    -b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.bin `
    -o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib

For Windows on Snapdragon developers

$ py -3 ${QNN_SDK_ROOT}\bin\aarch64-windows-msvc\qnn-model-lib-generator `
    -c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.cpp `
    -b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.bin `
    -o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib

This will produce one of the following artifacts, depending on which command was used:

  • ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\x64\Inception_v3_quantized.dll

  • ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\ARM64\Inception_v3_quantized.dll

CPU Backend Execution

Execution on Linux Host

With the model library compiled, the model can be executed using qnn-net-run with the following:

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
              --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnCpu.so \
              --model model_libs/x86_64-linux-clang/libInception_v3.so \
              --input_list data/cropped/raw_list.txt \
              --op_packages InceptionV3OpPackage/CPU/ReluOpPackage/libs/x86_64-linux-clang/libReluOpPackage.so:ReluOpPackageInterfaceProvider

This will produce the results at:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output

To view the results use:

$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
                                                                               -o output/ \
                                                                               -l data/imagenet_slim_labels.txt

Execution on Android

Running the CPU Backend on an Android target is largely similar to running on the Linux x86 target.

First, create a directory for the example on device:

# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"

Now push the necessary libraries to device:

$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnCpu.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/aarch64-android/libReluOpPackage.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/*.so /data/local/tmp/inception_v3

Now push the input data and input lists to device:

$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3

Push the qnn-net-run tool:

$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3

Now set up the environment on device:

$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3

Finally, use qnn-net-run with the following:

$ ./qnn-net-run --backend libQnnCpu.so --model libInception_v3.so --input_list target_raw_list.txt \
                --op_packages libReluOpPackage.so:ReluOpPackageInterfaceProvider

Outputs from the run will be located at the default ./output directory. Exit the device and view the results:

$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
                                                                               -o output_android/ \
                                                                               -l data/imagenet_slim_labels.txt

Execution on Windows Host

Please ensure the Model Build on Windows Host and Compiling for CPU on Windows sections have been completed before proceeding.

First, create the following folder on the Windows host: ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.

Now, copy the necessary libraries and input data to: ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.

- ${QNN_SDK_ROOT}/lib/x86_64-windows-msvc/QnnCpu.dll
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_lib/x64/Inception_v3.dll (generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/build/Release/ReluOpPackage.dll (x64 version as generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt

With the model library compiled, qnn-net-run.exe can perform inference with the model using the following commands in Developer PowerShell for VS 2022:

$ cd ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package

$ ${QNN_SDK_ROOT}\bin\x86_64-windows-msvc\qnn-net-run.exe --backend .\QnnCpu.dll --model .\Inception_v3.dll --input_list .\target_raw_list.txt --op_packages .\ReluOpPackage.dll:ReluOpPackageInterfaceProvider

After the inference, we can check the classification result. By default, outputs will be located in the ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package\output directory.

Copy the following files and directories to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.

- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/imagenet_slim_labels.txt
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py

To view the result:

$ py -3 .\show_inceptionv3_classifications.py `
     -i .\cropped\raw_list.txt `
     -o .\output `
     -l .\imagenet_slim_labels.txt

Execution on Windows Device

Please ensure the Model Build on Windows Host and Compiling for CPU on Windows sections have been completed before proceeding.

Copy the following files from the development host to the Windows device’s testing folder:

- ${QNN_SDK_ROOT}/bin/aarch64-windows-msvc/qnn-net-run.exe
- ${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnCpu.dll
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_lib/ARM64/Inception_v3.dll (generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/build/Release/ReluOpPackage.dll (arm64 version as generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt

Finally, connect to the Windows device and use qnn-net-run.exe with the following command in PowerShell:

$ .\qnn-net-run.exe --backend .\QnnCpu.dll `
                    --model .\Inception_v3.dll `
                    --input_list .\target_raw_list.txt `
                    --op_packages .\ReluOpPackage.dll:ReluOpPackageInterfaceProvider

This will produce the results at .\output.

To view the result:

First, create a directory ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package on the development host.

Copy the output folder from the Windows device back to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.

Also copy the following files and directories to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.

- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/imagenet_slim_labels.txt
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py

On the development host, open Developer PowerShell for VS 2022 to view the result:

$ cd ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package
$ py -3 .\show_inceptionv3_classifications.py `
     -i .\cropped\raw_list.txt `
     -o .\output `
     -l .\imagenet_slim_labels.txt

DSP Backend Execution

Execution on Android

Running the DSP backend on an Android target is largely similar to running the CPU and HTP backends on an Android target. The remainder of this section will assume the target has a v66 DSP.

Similar to the HTP backend, the DSP backend also requires a quantized model. To generate a quantized model, see Model Quantization.

First, create a directory for the example on device:

# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"

Now push the necessary libraries to device:

$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnDsp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v66/unsigned/libQnnDspV66Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnDspV66Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/libInception_v3_quantized.so  /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage/build/DSP/libQnnReluOpPackage.so /data/local/tmp/inception_v3

Now push the input data and input lists to device:

$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3

Push the qnn-net-run tool:

$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3

Now set up the environment on device:

$ adb shell
$ cd /data/local/tmp/inception_v3
$ export VENDOR_LIB=/vendor/lib/ # /vendor/lib64/ if aarch64
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3:/vendor/dsp/cdsp:$VENDOR_LIB
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3;/vendor/dsp/cdsp;/vendor/lib/rfsa/adsp;/system/lib/rfsa/adsp;/dsp"

Finally, use qnn-net-run with the following:

$ ./qnn-net-run --backend libQnnDsp.so --model libInception_v3_quantized.so --input_list target_raw_list.txt --output_dir output_android \
                --op_packages libQnnReluOpPackage.so

Outputs from the run will be located at the ./output_android directory. Exit the device and view the results:

$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
                                                                               -o output_android/ \
                                                                               -l data/imagenet_slim_labels.txt

HTP Backend Execution

Execution on Linux Host

The HTP backend can be exercised on Linux Host through the use of the HTP Emulation backend. With the model library compiled, the model can be executed using qnn-net-run with the following:

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
              --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
              --model ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so \
              --input_list data/cropped/raw_list.txt \
              --op_packages ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider

Note

In order to use the HTP Emulation backend, a quantized model is required. For more information on quantization see Model Quantization.

This will produce the results at:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output

To view the results use:

$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
                                                                               -o output/ \
                                                                               -l data/imagenet_slim_labels.txt

Note

Running the HTP emulation backend on a Windows host is not supported.

Running HTP Backend on Android using offline prepared graph

To run the graph on the HTP backend, the aarch64 version of the op package is needed in addition to the hexagon-v68 version. Running the HTP backend with a serialized context on an Android target is largely similar to running the CPU and DSP backends on an Android target.

Running the model on device using the HTP backend can be done by generating a serialized context. This serialized context can be initialized more efficiently by HTP than the original libInception_v3_quantized.so model library. To generate the context, run:

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-context-binary-generator \
              --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
              --model ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so \
              --op_packages ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider \
              --binary_file Inception_v3_quantized.serialized

This creates the context at:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin
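Before pushing artifacts to the device, it can be worth confirming that the context-generation step actually produced a non-empty binary. The helper below is a hypothetical convenience, not an SDK tool; the path checked is the one created by the step above.

```shell
# Hypothetical guard: report whether a generated artifact exists and is
# non-empty before it is pushed to the device.
check_artifact() {
  if [ -s "$1" ]; then echo "ready: $1"; else echo "missing: $1"; fi
}
check_artifact ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin
```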

First, create a directory for the example on device:

# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"

Now push the necessary libraries to device:

$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV68Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/* /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon_v68/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Htp.so
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/aarch64-android/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Cpu.so
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin /data/local/tmp/inception_v3

Now push the input data and input lists to device:

$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3

Push the qnn-net-run tool:

$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3

Now set up the environment on device:

$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3"

Finally, use qnn-net-run with the following:

$ ./qnn-net-run --backend libQnnHtp.so --input_list target_raw_list.txt --retrieve_context Inception_v3_quantized.serialized.bin \
                --op_packages libQnnReluOpPackage_Cpu.so:ReluOpPackageInterfaceProvider:CPU,libQnnReluOpPackage_Htp.so:ReluOpPackageInterfaceProvider:HTP

In this case, two target variants of the op package are passed to qnn-net-run: the first, libQnnReluOpPackage_Cpu.so, is the ARM aarch64 build, while the second, libQnnReluOpPackage_Htp.so, is the hexagon-v68 build.

“:CPU” and “:HTP” are the optional target parameters specifying the target platforms on which the backend must register the op packages. The “CPU” target indicates that the op package is compiled for CPU (ARM aarch64). The “HTP” target indicates that the op package is compiled for HTP on-device.
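As the command above shows, the --op_packages value is a comma-separated list of entries, each of the form <library>:<interfaceProvider>[:<target>]. Purely as an illustration, a hypothetical helper that assembles that argument might look like:

```shell
# Hypothetical helper: build one <library>:<provider>:<target> entry of the
# comma-separated --op_packages argument used above.
op_pkg() { printf '%s:%s:%s' "$1" "$2" "$3"; }
OP_PACKAGES="$(op_pkg libQnnReluOpPackage_Cpu.so ReluOpPackageInterfaceProvider CPU),\
$(op_pkg libQnnReluOpPackage_Htp.so ReluOpPackageInterfaceProvider HTP)"
echo "$OP_PACKAGES"
```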

Outputs from the run will be located in the default ./output directory. Exit the device and view the results:

$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
                                                                               -o output_android/ \
                                                                               -l data/imagenet_slim_labels.txt

Running HTP Backend on Android using on-device prepared graph

Running the HTP backend with ARM (CPU) prepare on an Android target is largely similar to running the HTP backend with an offline-prepared graph.

First, create a directory for the example on device:

# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"

Now push the necessary libraries to device:

$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV68Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/* /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/aarch64-android/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Cpu.so
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon_v68/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Htp.so

Now push the input data and input lists to device:

$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3

Push the qnn-net-run tool:

$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3

Now set up the environment on device:

$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3"

Finally, use qnn-net-run with the following:

$ ./qnn-net-run --backend libQnnHtp.so --input_list target_raw_list.txt \
                 --model libInception_v3_quantized.so \
                 --op_packages libQnnReluOpPackage_Cpu.so:ReluOpPackageInterfaceProvider:CPU,libQnnReluOpPackage_Htp.so:ReluOpPackageInterfaceProvider:HTP

In this case, two target variants of the op package are passed to qnn-net-run: the first, libQnnReluOpPackage_Cpu.so, is the ARM aarch64 build, while the second, libQnnReluOpPackage_Htp.so, is the hexagon-v68 build.

“:CPU” and “:HTP” are the optional target parameters specifying the target platforms on which the backend must register the op packages. The “CPU” target indicates that the op package is compiled for CPU (ARM aarch64). The “HTP” target indicates that the op package is compiled for HTP on-device.

Outputs from the run will be located in the default ./output directory. Exit the device and view the results:

$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
                                                                               -o output_android/ \
                                                                               -l data/imagenet_slim_labels.txt

Running HTP Backend on Windows device using offline prepared graph

A model can be run using the HTP backend on Windows by generating a serialized context. This serialized context can be initialized more efficiently by HTP compared to the original model library. To generate the context, run qnn-context-binary-generator in WSL:

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-context-binary-generator \
              --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
              --model ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so \
              --op_packages ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider \
              --output_dir ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output \
              --binary_file Inception_v3_quantized.serialized

This creates the context at:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin

Copy the following files from the development host to the Windows device’s testing folder:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon-v68/libQnnReluOpPackage.so

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt

  • ${QNN_SDK_ROOT}/bin/aarch64-windows-msvc/qnn-net-run.exe

  • ${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnHtp.dll

  • ${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnHtpV68Stub.dll

  • ${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so

Finally, connect to the Windows device and run qnn-net-run with the following command in PowerShell:

$ .\qnn-net-run.exe --backend QnnHtp.dll `
                     --input_list target_raw_list.txt `
                     --retrieve_context Inception_v3_quantized.serialized.bin `
                     --op_packages libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider

This will produce the results at:

  • .\output

To view the result:

First, create a directory ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package on the Windows host.

Copy the output folder from the Windows device back to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.

Also copy the following files and directories from the SDK to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package:

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/imagenet_slim_labels.txt

  • ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py

On the Windows host, open "Developer PowerShell for VS 2022" to view the result:

$ cd ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package
$ py -3 .\show_inceptionv3_classifications.py -i .\cropped\raw_list.txt `
                                                -o output `
                                                -l .\imagenet_slim_labels.txt

Note

Running an on-device prepared graph with the HTP backend is not supported on Windows.