Tutorial: Converting and executing a CNN model with custom operations¶
The following tutorial demonstrates the end-to-end usage of QNN tools and the QNN API in the context of user-created custom operations. By custom operations, we mean operations that are defined in an operation definition configuration file in tandem with a user-defined QNN op package library for execution.
This process begins with a trained source framework model containing such operations, along with a config file containing their definitions. The model is converted into a series of QNN API calls using one of the available QNN converters. Additionally, the same config file is used to create an op package library skeleton via qnn-op-package-generator. The completed skeleton and the QNN model source files are then compiled into shared libraries for a specific target, which can be executed on a particular backend.
The tutorial will use Inception V3 as the source framework model and the qnn-net-run executable as
the example application. The execution will show usage on the CPU, DSP and HTP backends on both host (for CPU
and HTP) and device.
The sections of the tutorial are as follows:
Note
If developing on Windows, the following sections must be executed in the WSL (x86) environment: Tutorial Setup, Creating a QNN Custom Op Package, and Model Conversion. The Model Build and Model Execution sections, however, should be executed on Windows natively. Please refer to Integration Workflow on Windows to see more details of the workflow.
Tutorial Setup¶
The tutorial assumes general setup instructions have been followed at Setup.
Additionally, this tutorial requires the Inception V3 TensorFlow model file and sample images. These are handled by the provided setup script setup_inceptionv3.py. The script is located at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py
Usage is as follows:
usage: setup_inceptionv3.py [-h] -a ASSETS_DIR [-c] [-cu] [-d]
[-g {cpu,dsp,htp}] [-q]
Prepares the inception_v3 assets for tutorial examples.
required arguments:
-a ASSETS_DIR, --assets_dir ASSETS_DIR
directory containing the inception_v3 assets
optional arguments:
-c, --convert_model Convert and compile model once acquired.
-cu, --custom Convert the model using Relu as a custom operation.
Only available if the -c or --convert_model option is
chosen
-d, --download Download inception_v3 assets to inception_v3 example
directory
-g {cpu,dsp,htp}, --generate_packages {cpu,dsp,htp}
Generate and compile custom op packages for HTP, CPU
and DSP
-q, --quantize_model Quantize the model during conversion. Only available
if the -c or --convert_model option is chosen
To run the script use:
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d
This will populate the model file at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb
And the raw images at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
Note
If developing on Windows, please run the above steps on WSL (x86).
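Each file under data/cropped is a flat binary tensor that qnn-net-run reads directly, and raw_list.txt simply lists those files. As an illustration only, the sketch below writes such a file from scratch, assuming a little-endian float32 NHWC layout of 1x299x299x3 (the shape passed to the converter later in this tutorial); verify the exact preprocessing against the setup script before relying on it:

```python
# Sketch: write a raw input tensor in the format qnn-net-run consumes.
# Assumption (verify against scripts/setup_inceptionv3.py): little-endian
# float32 values in NHWC order, shape 1 x 299 x 299 x 3.
import os
import struct

HEIGHT, WIDTH, CHANNELS = 299, 299, 3

def write_raw_tensor(path, pixels):
    """Serialize a flat list of floats as a little-endian float32 .raw file."""
    assert len(pixels) == HEIGHT * WIDTH * CHANNELS
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(pixels)}f", *pixels))

# A dummy all-gray image, normalized to [0, 1].
write_raw_tensor("dummy.raw", [0.5] * (HEIGHT * WIDTH * CHANNELS))

# Each element is 4 bytes, so the file holds 299 * 299 * 3 * 4 bytes.
print(os.path.getsize("dummy.raw"))  # 1072812
```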
Custom Op XML Config Creation¶
A custom operation is defined in QNN as an XML file describing its inputs, outputs, and attributes according to the schema outlined at XML OpDef Schema Breakdown. Instructions on writing configs can also be found there, with examples shown at Example XML Op Def Configs.
In the SDK, users can find working examples at:
${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator
Given a custom op config file defining an operation and a model containing it, any of the available
QNN converters will produce the model artifacts. Similarly, the config file can be provided to
the qnn-op-package-generator tool to produce a QNN op package skeleton.
In this tutorial, the following XML config files can be used with the Inceptionv3 model obtained above:
${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageHtp.xml
${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageCpu.xml
${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageDsp.xml
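For orientation, the general shape of such a config is sketched below. Every element and attribute name here is a placeholder, not taken from the actual schema; the authoritative structure is the XML OpDef schema and the shipped ReluOpPackage*.xml files listed above:

```xml
<!-- Schematic only: tag and attribute names are illustrative placeholders.
     Consult the shipped ReluOpPackage*.xml files for the real schema. -->
<OpDefCollection PackageName="ReluOpPackage">
  <OpDefList>
    <OpDef>
      <Name>Relu</Name>
      <Description>Elementwise rectified linear unit.</Description>
      <!-- input/output tensor descriptions go here -->
    </OpDef>
  </OpDefList>
</OpDefCollection>
```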
Creating a QNN Custom Op Package¶
Creating a custom op package consists of package generation, implementation, and compilation into shared libraries. Although users can create their own packages manually, we highly recommend this workflow to avoid unexpected errors. We will cover these steps in the following sections.
Generating a Custom Op Package Skeleton¶
For the following section, it is assumed that setup instructions have been run and the
qnn-op-package-generator tool can be used. In this tutorial, packages can be generated for the
CPU, DSP and HTP backends respectively.
Note
Packages for the CPU backend can be generated on a Windows host. Details are provided below.
To generate the CPU package on Linux, run:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator \
-p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageCpu.xml \
-o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU # this can be any path
The above command will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage
To generate the CPU package on Windows, open WSL (x86) and run:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator \
-p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageCpu.xml \
-o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU \
--gen_cmakelists
The above command will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage
To generate the DSP package:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator \
-p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageDsp.xml \
-o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP # this can be any path
The above command will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage
To generate the HTP package:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-op-package-generator \
-p ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/ReluOpPackageHtp.xml \
-o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP # this can be any path
The above command will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage
Compiling QNN Custom Op Packages¶
The artifacts produced in the previous section will be directories containing partially complete skeleton source files and a makefile for generating QNN op package shared libraries. These skeleton files provide all the hooks that users can utilize to implement their custom operations.
For this tutorial, the generated examples should be replaced with an already completed example from the SDK. Using the completed source code, the generated packages can be compiled for the relevant targets.
Optionally, each of the CPU, DSP and HTP packages can be generated and compiled using the provided setup script through the -g argument.
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -g <cpu or dsp or htp>
Note
Environment variables must be correctly set on the command line to replicate the compilation steps. Please see the manual compilation instructions below for further clarification on backend-specific requirements.
Compiling for CPU on Linux¶
First, the generated example should be replaced with an already completed example from the SDK using the following command:
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/CPU/Relu.cpp ./src/ops
The CPU op package can be compiled using the following commands:
$ export CXX=<path-to-clang++>/clang++
$ export ANDROID_NDK_ROOT=<path-to-ndk-build>
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage && make cpu
And the following artifacts are produced:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/x86_64-linux-clang/libReluOpPackage.so
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/aarch64-android/libReluOpPackage.so
Compiling for CPU on Windows¶
First, the generated example should be replaced with an already completed example from the SDK using the following command:
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/CPU/Relu.cpp ./src/ops
Now, open Developer PowerShell for VS 2022.
The CPU op package can be compiled using the following commands:
Note
When compiling the CPU op package for Windows, the user may select either the x64 or arm64 architecture using the following flag when invoking cmake: -A [x64 | arm64].
To compile the CPU op package for Windows host, use the following command:
$ cd ${QNN_SDK_ROOT}\examples\Models\InceptionV3\InceptionV3OpPackage\CPU\ReluOpPackage
$ cmake -S . -B build -A x64
$ cd build
$ cmake --build . --config release
This will produce ${QNN_SDK_ROOT}\examples\Models\InceptionV3\InceptionV3OpPackage\CPU\ReluOpPackage\build\Release\ReluOpPackage.dll for execution on a Windows host.
To compile the CPU op package for Windows device the steps are the same as above, except the -A x64 argument in the second line becomes -A arm64.
Compiling for DSP on Linux¶
First, the generated example should be replaced with an already completed example from the SDK using the following command:
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/DSP/Relu.cpp ./src/ops
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/DSP/DspOps.hpp ./include
The DSP op package can be compiled using the following commands:
$ export X86_CXX=<path-to-clang++>
$ export HEXAGON_SDK_ROOT=<path-to-hexagon-sdk>
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage && make all
Please see Compiler Toolchains for the correct version for HEXAGON_SDK_ROOT and X86_CXX required by this hardware.
This would produce the following artifacts for DSP:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage/build/DSP/libQnnReluOpPackage.so
Compiling for HTP on Linux¶
First, the generated example should be replaced with an already completed example from the SDK using the following command:
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage
$ cp ${QNN_SDK_ROOT}/examples/QNN/OpPackageGenerator/generated/HTP/Relu.cpp ./src/ops
The HTP op package can be compiled using the following commands:
$ export X86_CXX=<path-to-clang++>
$ export HEXAGON_SDK_ROOT=<path-to-hexagon-sdk>
$ export QNN_INCLUDE=${QNN_SDK_ROOT}/include/QNN
$ export ANDROID_NDK_ROOT=<path-to-ndk-build>
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage && make all
Please see Compiler Toolchains for the correct version for HEXAGON_SDK_ROOT and X86_CXX required by this hardware.
This would produce the following artifacts for HTP:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/aarch64-android/libQnnReluOpPackage.so
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon-v68/libQnnReluOpPackage.so
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so
Model Conversion¶
After the model assets have been acquired, the model can be converted into a series of QNN API calls and subsequently built for use by an application. The Relu operation in Inception V3 can be converted into a QNN model as a custom operation by providing the config as an input to any of the available QNN converters. This is done using the --op_package_config (-opc) option.
Note
A quantized model is needed for use on the HTP and DSP backends. See Model Quantization to generate a quantized model.
To convert the Inception V3 model use the qnn-tensorflow-converter:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-tensorflow-converter \
--input_network ${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb \
--input_dim input 1,299,299,3 \
--out_node InceptionV3/Predictions/Reshape_1 \
--output_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.cpp \
--op_package_config ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/config/ReluOpPackageCpu.xml
This will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.cpp
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.bin
The artifacts include a .cpp file containing the sequence of API calls, and a .bin file containing the static data associated with the model.
All Relu op nodes within the model should have the same package name as the name provided in the XML config. This is essential for tying the operation to the shared library produced in the previous sections. Users should note the contrast with other operations bearing the package name qti.aisw, which is the default built-in op package provided with the SDK. Additionally, although ReluOpPackageCpu.xml is used here, any of the other previously listed config files can be substituted for conversion.
Note
If developing on Windows, please run the above in the WSL (x86) environment.
Model Quantization¶
To use a quantized model instead of a floating-point model, we need to provide the CPU op package created above to the converter using the --op_package_lib (-opl) option. Without this option, the default qti.aisw package will be used.
The command below converts and quantizes the model using the custom op config and library:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-tensorflow-converter \
--input_network ${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb \
--input_dim input 1,299,299,3 \
--out_node InceptionV3/Predictions/Reshape_1 \
--output_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.cpp \
--op_package_config ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/config/ReluOpPackageCpu.xml \
--input_list ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped/raw_list.txt \
--op_package_lib ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/x86_64-linux-clang/libReluOpPackage.so:ReluOpPackageInterfaceProvider
This will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.cpp
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.bin
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized_quantization_encodings.json
Note
When quantizing a model during conversion, the input list must contain absolute paths to the input data.
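The sketch below shows one way to regenerate an input list whose entries are absolute paths. The one-path-per-line layout is an assumption based on the example's raw_list.txt; adjust it if your list also names tensor inputs:

```python
# Sketch: rebuild an input list with absolute paths, as required when
# quantizing during conversion. Layout assumption: one input file path
# per line, matching the raw_list.txt shipped with the example.
from pathlib import Path

def make_absolute_input_list(data_dir, out_file):
    """Write every .raw file under data_dir to out_file as absolute paths."""
    data_dir = Path(data_dir).resolve()
    raws = sorted(data_dir.glob("*.raw"))
    Path(out_file).write_text("\n".join(str(p) for p in raws) + "\n")
    return len(raws)

# Example usage against a scratch directory with two dummy inputs.
scratch = Path("scratch_inputs")
scratch.mkdir(exist_ok=True)
(scratch / "img_0.raw").write_bytes(b"\x00" * 4)
(scratch / "img_1.raw").write_bytes(b"\x00" * 4)
count = make_absolute_input_list(scratch, "abs_raw_list.txt")
print(count)  # 2
```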
Model Build¶
Once the model is converted it is built into a shared library with qnn-model-lib-generator:
Model Build on Linux Host¶
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator \
-c ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.cpp \
-b ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.bin \
-o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs # This can be any path
This will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/libInception_v3.so
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3.so
Note
By default libraries are built for all targets. To compile for a specific target, use the -t <target> option with qnn-model-lib-generator. Choices of <target> are aarch64-android and x86_64-linux-clang.
Optionally, the above steps can be completed with the provided setup script. To convert and build the Inception V3 model using the script, run:
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d -c -cu
This will produce the same artifacts as above.
To build the quantized model, the steps are the same as above:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator \
-c ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.cpp \
-b ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.bin \
-o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs # This can be any path
This will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/libInception_v3_quantized.so
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so
Optionally, the above steps can be completed with the provided setup script. To convert, quantize, and build the Inception V3 model using the script, run:
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d -c -q --custom
Note
When both the quantize and custom options are selected, the CPU op package will be generated and compiled even if --generate_packages is not provided, since the library is needed for accurate quantization.
This will produce the same artifacts as above.
Model Build on Windows Host¶
To build the model files into a DLL library on a Windows host, we will use Developer PowerShell for VS 2022.
Make sure the ${QNN_SDK_ROOT} environment variable is set up as described in Environment Setup for Windows.
For Windows native/x86_64 PC developers
$ py -3 ${QNN_SDK_ROOT}\bin\x86_64-windows-msvc\qnn-model-lib-generator `
-c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.cpp `
-b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.bin `
-o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib # this can be any path
For Windows on Snapdragon developers
$ py -3 ${QNN_SDK_ROOT}\bin\aarch64-windows-msvc\qnn-model-lib-generator `
-c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.cpp `
-b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3.bin `
-o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib # this can be any path
This will produce one of the following artifacts, depending on the platform:
${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\x64\Inception_v3.dll
${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\ARM64\Inception_v3.dll
Note
The model DLL library can be built for x64 or ARM64 platforms by specifying the desired platform.
To build the quantized model files into a DLL library on a Windows host, we will again use Developer PowerShell for VS 2022.
Make sure the ${QNN_SDK_ROOT} environment variable is set up as described in Environment Setup for Windows.
For Windows native/x86_64 PC developers
$ py -3 ${QNN_SDK_ROOT}\bin\x86_64-windows-msvc\qnn-model-lib-generator `
-c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.cpp `
-b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.bin `
-o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib
For Windows on Snapdragon developers
$ py -3 ${QNN_SDK_ROOT}\bin\aarch64-windows-msvc\qnn-model-lib-generator `
-c ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.cpp `
-b ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model\Inception_v3_quantized.bin `
-o ${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib
This will produce one of the following artifacts, depending on the platform:
${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\x64\Inception_v3_quantized.dll
${QNN_SDK_ROOT}\examples\Models\InceptionV3\model_lib\ARM64\Inception_v3_quantized.dll
CPU Backend Execution¶
Execution on Linux Host¶
With the model library compiled, the model can be executed using qnn-net-run with the following:
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
--backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnCpu.so \
--model model_libs/x86_64-linux-clang/libInception_v3.so \
--input_list data/cropped/raw_list.txt \
--op_packages InceptionV3OpPackage/CPU/ReluOpPackage/libs/x86_64-linux-clang/libReluOpPackage.so:ReluOpPackageInterfaceProvider
This will produce the results at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/output
To view the results use:
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
-o output/ \
-l data/imagenet_slim_labels.txt
Execution on Android¶
Running the CPU Backend on an Android target is largely similar to running on the Linux x86 target.
First, create a directory for the example on device:
# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"
Now push the necessary libraries to device:
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnCpu.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/libs/aarch64-android/libReluOpPackage.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/*.so /data/local/tmp/inception_v3
Now push the input data and input lists to device:
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool:
$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Now set up the environment on device:
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
Finally, use qnn-net-run with the following:
$ ./qnn-net-run --backend libQnnCpu.so --model libInception_v3.so --input_list target_raw_list.txt \
--op_packages libReluOpPackage.so:ReluOpPackageInterfaceProvider
Outputs from the run will be located in the default ./output directory. Exit the device and view the results:
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
-o output_android/ \
-l data/imagenet_slim_labels.txt
Execution on Windows Host¶
Please ensure the Model Build on Windows Host and Compiling for CPU on Windows sections have been completed before proceeding.
First, create the following folder on the Windows host: ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
Now, copy the necessary libraries and input data to: ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
- ${QNN_SDK_ROOT}/lib/x86_64-windows-msvc/QnnCpu.dll
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_lib/x64/Inception_v3.dll (generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/build/Release/ReluOpPackage.dll (x64 version as generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt
With the model library compiled, qnn-net-run.exe can perform inference with the model using the following commands in Developer PowerShell for VS 2022:
$ cd ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package
$ ${QNN_SDK_ROOT}\bin\x86_64-windows-msvc\qnn-net-run.exe --backend .\QnnCpu.dll --model .\Inception_v3.dll --input_list .\target_raw_list.txt --op_packages .\ReluOpPackage.dll:ReluOpPackageInterfaceProvider
After the inference, we can check the classification result. By default, outputs will be located in the ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package\output directory.
Copy the following files and directories to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/imagenet_slim_labels.txt
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py
To view the result:
$ py -3 .\show_inceptionv3_classifications.py `
-i .\cropped\raw_list.txt `
-o .\output `
-l .\imagenet_slim_labels.txt
Execution on Windows Device¶
Please ensure the Model Build on Windows Host and Compiling for CPU on Windows sections have been completed before proceeding.
Copy the following files from the development host to the Windows device’s testing folder:
- ${QNN_SDK_ROOT}/bin/aarch64-windows-msvc/qnn-net-run.exe
- ${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnCpu.dll
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_lib/ARM64/Inception_v3.dll (generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/CPU/ReluOpPackage/build/Release/ReluOpPackage.dll (arm64 version as generated above)
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt
Finally, connect to the Windows device and use qnn-net-run.exe with the following command in PowerShell:
$ .\qnn-net-run.exe --backend .\QnnCpu.dll `
--model .\Inception_v3.dll `
--input_list .\target_raw_list.txt `
--op_packages .\ReluOpPackage.dll:ReluOpPackageInterfaceProvider
This will produce the results at .\output.
To view the result:
First, create a directory ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package on the development host.
Copy the output folder from the Windows device back to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
Also copy the following files and directories to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/imagenet_slim_labels.txt
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py
On the development host, open Developer PowerShell for VS 2022 to view the result:
$ cd ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package
$ py -3 .\show_inceptionv3_classifications.py `
-i .\cropped\raw_list.txt `
-o .\output `
-l .\imagenet_slim_labels.txt
DSP Backend Execution¶
Execution on Android¶
Running the DSP backend on an Android target is largely similar to running the CPU and HTP backends on an Android target. The remainder of this section assumes the target has a v66 DSP.
Similar to the HTP backend, the DSP backend requires a quantized model. To generate a quantized model, see Model Quantization.
First, create a directory for the example on device:
# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"
Now push the necessary libraries to device:
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnDsp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v66/unsigned/libQnnDspV66Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnDspV66Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/libInception_v3_quantized.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/DSP/ReluOpPackage/build/DSP/libQnnReluOpPackage.so /data/local/tmp/inception_v3
Now push the input data and input lists to device:
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool:
$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Now set up the environment on device:
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export VENDOR_LIB=/vendor/lib/ # /vendor/lib64/ if aarch64
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3:/vendor/dsp/cdsp:$VENDOR_LIB
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3;/vendor/dsp/cdsp;/vendor/lib/rfsa/adsp;/system/lib/rfsa/adsp;/dsp"
Finally, use qnn-net-run with the following:
$ ./qnn-net-run --backend libQnnDsp.so --model libInception_v3_quantized.so --input_list target_raw_list.txt --output_dir output_android \
--op_packages libQnnReluOpPackage.so
Outputs from the run will be located in the ./output_android directory. Exit the device and view the results:
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
-o output_android/ \
-l data/imagenet_slim_labels.txt
HTP Backend Execution¶
Execution on Linux Host¶
The HTP backend can be exercised on a Linux host through the use of the HTP Emulation backend. With
the model library compiled, the model can be executed using qnn-net-run with the following:
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
--backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
--model ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so \
--input_list data/cropped/raw_list.txt \
--op_packages ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider
Note
In order to use the HTP Emulation backend, a quantized model is required. For more information on quantization see Model Quantization.
This will produce the results at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/output
To view the results use:
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
-o output/ \
-l data/imagenet_slim_labels.txt
Note
Running HTP emulation backend on Windows host is not supported.
Running HTP Backend on Android using offline prepared graph¶
To run the graph on the HTP backend, the aarch64 version of the op package is needed in addition to the hexagon-v68 version. Running the HTP backend with a serialized context on an Android target is largely similar to running the CPU and DSP backends on an Android target.
Running the model on device using the HTP backend can be done by generating a serialized context. This serialized context can be initialized more efficiently by HTP than the original libInception_v3_quantized.so model library. To generate the context, run:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-context-binary-generator \
--backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
--model ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so \
--op_packages ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider \
--binary_file Inception_v3_quantized.serialized
This creates the context at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin
First, create a directory for the example on device:
# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"
Now push the necessary libraries to device:
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV68Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/* /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon-v68/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Htp.so
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/aarch64-android/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Cpu.so
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin /data/local/tmp/inception_v3
Now push the input data and input lists to device:
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool:
$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Now set up the environment on device:
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3"
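The two variables serve different loaders: LD_LIBRARY_PATH is read by the Android (aarch64) dynamic linker to resolve CPU-side libraries such as libQnnHtp.so and the stub, while ADSP_LIBRARY_PATH tells the DSP-side loader where to find Hexagon libraries such as the skel and the HTP op package. A minimal sketch pointing both at the push directory:

```shell
# Both loader search paths point at the directory the libraries were pushed to.
WORKDIR=/data/local/tmp/inception_v3
export LD_LIBRARY_PATH="$WORKDIR"    # aarch64 (CPU-side) libraries, e.g. libQnnHtp.so
export ADSP_LIBRARY_PATH="$WORKDIR"  # Hexagon (DSP-side) libraries, e.g. libQnnHtpV68Skel.so
echo "$LD_LIBRARY_PATH"
```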
Finally, use qnn-net-run with the following:
$ ./qnn-net-run --backend libQnnHtp.so --input_list target_raw_list.txt --retrieve_context Inception_v3_quantized.serialized.bin
--op_packages libQnnReluOpPackage_Cpu.so:ReluOpPackageInterfaceProvider:CPU,libQnnReluOpPackage_Htp.so:ReluOpPackageInterfaceProvider:HTP
In this case, two target variants of the op package are passed to qnn-net-run. The first, libQnnReluOpPackage_Cpu.so, is the ARM aarch64 build, while the second, libQnnReluOpPackage_Htp.so, is the Hexagon v68 build.
“:CPU” and “:HTP” are optional target parameters specifying the target platforms on which the backend must register the op packages. The “CPU” target indicates that the op package is compiled for the CPU (ARM aarch64), while the “HTP” target indicates that it is compiled for on-device HTP.
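The --op_packages value is a comma-separated list of entries, each of the form path:interfaceProvider[:target]. A small sketch of composing that argument from its parts (the file names match the pushes above; the variable names are just for illustration):

```shell
# Each entry: <library>:<interface provider>[:<target>]
CPU_PKG="libQnnReluOpPackage_Cpu.so:ReluOpPackageInterfaceProvider:CPU"
HTP_PKG="libQnnReluOpPackage_Htp.so:ReluOpPackageInterfaceProvider:HTP"

# Entries are joined with a comma to register both target variants in one run.
OP_PACKAGES="${CPU_PKG},${HTP_PKG}"
echo "$OP_PACKAGES"
# ./qnn-net-run --backend libQnnHtp.so ... --op_packages "$OP_PACKAGES"
```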
Outputs from the run will be located at the default ./output directory. Exit the device and view the results:
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
-o output_android/ \
-l data/imagenet_slim_labels.txt
Running HTP Backend on Android using on-device prepared graph¶
Running the HTP backend with ARM (CPU) prepare on an Android target is largely similar to running the HTP backend with an offline-prepared graph.
First, create a directory for the example on device:
# make inception_v3 directory if necessary
$ adb shell "mkdir /data/local/tmp/inception_v3"
Now push the necessary libraries to device:
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV68Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-android/* /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/aarch64-android/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Cpu.so
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon_v68/libQnnReluOpPackage.so /data/local/tmp/inception_v3/libQnnReluOpPackage_Htp.so
Now push the input data and input lists to device:
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool:
$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Now set up the environment on device:
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3"
Finally, use qnn-net-run with the following:
$ ./qnn-net-run --backend libQnnHtp.so --input_list target_raw_list.txt \
--model libInception_v3_quantized.so \
--op_packages libQnnReluOpPackage_Cpu.so:ReluOpPackageInterfaceProvider:CPU,libQnnReluOpPackage_Htp.so:ReluOpPackageInterfaceProvider:HTP
In this case, two target variants of the op package are passed to qnn-net-run. The first, libQnnReluOpPackage_Cpu.so, is the ARM aarch64 build, while the second, libQnnReluOpPackage_Htp.so, is the Hexagon v68 build.
“:CPU” and “:HTP” are optional target parameters specifying the target platforms on which the backend must register the op packages. The “CPU” target indicates that the op package is compiled for the CPU (ARM aarch64), while the “HTP” target indicates that it is compiled for on-device HTP.
Outputs from the run will be located at the default ./output directory. Exit the device and view the results:
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py -i data/cropped/raw_list.txt \
-o output_android/ \
-l data/imagenet_slim_labels.txt
Running HTP Backend on Windows device using offline prepared graph¶
A model can be run using the HTP backend on Windows by generating a serialized context. This serialized context can be initialized by HTP more efficiently than the original model library.
To generate the context, run qnn-context-binary-generator in WSL:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-context-binary-generator \
--backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
--model ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/x86_64-linux-clang/libInception_v3_quantized.so \
--op_packages ${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/x86_64-linux-clang/libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider \
--output_dir ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output \
--binary_file Inception_v3_quantized.serialized
This creates the context at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin
Copy the following files from the development host to the Windows device’s testing folder:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/InceptionV3OpPackage/HTP/ReluOpPackage/build/hexagon-v68/libQnnReluOpPackage.so
${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin
${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt
${QNN_SDK_ROOT}/bin/aarch64-windows-msvc/qnn-net-run.exe
${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnHtp.dll
${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnHtpV68Stub.dll
${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so
Finally, connect to the Windows device and run qnn-net-run in PowerShell with the following command:
$ .\qnn-net-run.exe --backend QnnHtp.dll `
--input_list target_raw_list.txt `
--retrieve_context Inception_v3_quantized.serialized.bin `
--op_packages libQnnReluOpPackage.so:ReluOpPackageInterfaceProvider
This will produce the results at:
.\output
To view the result:
First, create a directory ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package on the Windows host.
Copy the output folder from the Windows device back to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
Also copy the following files and directories from the SDK to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/imagenet_slim_labels.txt
${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py
On the Windows host, open "Developer PowerShell for VS 2022" to view the result:
$ cd ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package
$ py -3 .\show_inceptionv3_classifications.py -i .\cropped\raw_list.txt `
-o output `
-l .\imagenet_slim_labels.txt
Note
Running a graph prepared on-device with the HTP backend is not supported on Windows.