Tutorial: Utilizing Deep Learning Containers (DLCs) in Qualcomm® AI Engine Direct¶
This tutorial demonstrates how to use Deep Learning Containers (DLCs) in Qualcomm® AI Engine Direct (QNN). DLC is Qualcomm's serialized model format, introduced by the Qualcomm® Neural Processing SDK (SNPE) to represent and store deep learning models. Qualcomm® AI Engine Direct now supports DLCs, which simplifies cross-product workflows with the Qualcomm® Neural Processing SDK: developers can leverage the model preparation and analysis tools of the Qualcomm® Neural Processing SDK while using the same model artifact. Qualcomm® AI Engine Direct can also now prepare and execute large (> 2 GB) models.
Utilizing DLCs in QNN requires model conversion tools from Qualcomm® Neural Processing SDK, and execution tools and libraries from Qualcomm® AI Engine Direct.
This tutorial uses Inception V3 as the source framework model and the qnn-net-run executable as
the example application. This tutorial covers execution on CPU, GPU, and HTP backends on host (CPU/HTP)
and client devices.
This tutorial is divided into the sections that follow.
Note
DLC in QNN is only supported on Linux x86 and Aarch64 platforms.
Note
DLC in QNN should be used with matching SNPE and QNN releases, e.g. a DLC produced with SNPE 2.15 should be used with QNN 2.15.
Tutorial Setup¶
This tutorial assumes that the general setup instructions for both QNN and SNPE have been followed. In particular, converting models to DLC requires the PYTHONPATH and SNPE_ROOT environment variables to be set appropriately.
Additionally, this tutorial requires the acquisition of the Inception V3 Tensorflow model file and
sample images. This is handled by the provided setup script setup_inceptionv3.py. The script is located at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py
Usage is as follows:
usage: setup_inceptionv3.py [-h] -a ASSETS_DIR [-d] [-c] [-q]
Prepares the inception_v3 assets for tutorial examples.
required arguments:
-a ASSETS_DIR, --assets_dir ASSETS_DIR
directory containing the inception_v3 assets
optional arguments:
-d, --download Download inception_v3 assets to inception_v3 example
directory
-c, --convert_model Convert and compile model once acquired.
-q, --quantize_model Quantize the model during conversion. Only available
                     if the -c or --convert_model option is chosen
Before using the script, please set the environment variable TENSORFLOW_HOME to point to the
location where TensorFlow package is installed. The script uses TensorFlow utilities like
optimize_for_inference.py, which are present in the TensorFlow installation directory.
Find the location of the TensorFlow package:
$ python3 -m pip show tensorflow
Set the TENSORFLOW_HOME environment variable using the installation location of the TensorFlow package (the Location field from the output in step #1):

$ export TENSORFLOW_HOME=<tensorflow-location>/tensorflow_core
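As a sketch of that derivation, the Location field can be extracted from the pip show output with awk. The installation path below is a simulated, hypothetical example, not a real installation:

```shell
# Simulated `pip show tensorflow` output; the Location path here is hypothetical.
pip_show_output='Name: tensorflow
Version: 1.15.0
Location: /usr/lib/python3/site-packages'

# Extract the Location field and append the TensorFlow package directory.
TF_LOCATION=$(printf '%s\n' "$pip_show_output" | awk '/^Location:/ {print $2}')
export TENSORFLOW_HOME="${TF_LOCATION}/tensorflow_core"
echo "$TENSORFLOW_HOME"
```

In a real setup, replace the simulated text with the live output of `python3 -m pip show tensorflow`.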
Install the Inception V3 TensorFlow model and sample images using the setup_inceptionv3.py script:

$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d
The model file should now be present at the following location:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb
The raw images should now be present at the following location:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
Model Conversion¶
After acquiring the model assets, the model can be converted to a DLC using the Conversion Tools in the Qualcomm® Neural Processing SDK.
Note
A quantized model is needed for use on HTP and DSP backends. See Model Quantization to generate a quantized DLC.
Convert the Inception V3 model using the snpe-tensorflow-to-dlc tool.
$ ${SNPE_ROOT}/bin/x86_64-linux-clang/snpe-tensorflow-to-dlc \
--input_network ${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb \
--input_dim input 1,299,299,3 \
--out_node InceptionV3/Predictions/Reshape_1 \
--output_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc
This produces the ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc DLC file.
The DLC contains the serialized model: the network topology and the associated model data.
Model Quantization¶
The DLC can be quantized using the snpe-dlc-quantize tool. Example usage is below:
$ ${SNPE_ROOT}/bin/x86_64-linux-clang/snpe-dlc-quantize \
--input_dlc ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc \
--input_list ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped/raw_list.txt \
--output_dlc ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc
This will produce the following artifacts:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc
Note
When quantizing a model, the input list must contain absolute paths to the input data.
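Since relative entries would prevent the quantizer from locating the inputs, an input list with relative paths can be rewritten with absolute paths. The sketch below uses placeholder .raw file names, not the actual cropped sample images:

```shell
# Sketch: convert a relative-path input list into one with absolute paths.
# The .raw file names here are placeholders for the cropped sample images.
mkdir -p data/cropped
: > data/cropped/image_0.raw
: > data/cropped/image_1.raw
printf '%s\n' data/cropped/image_0.raw data/cropped/image_1.raw > raw_list.txt

# Prefix every entry with the current working directory.
while IFS= read -r rel; do
  printf '%s/%s\n' "$(pwd)" "$rel"
done < raw_list.txt > raw_list_abs.txt

cat raw_list_abs.txt
```

The rewritten raw_list_abs.txt can then be passed as the --input_list argument to snpe-dlc-quantize.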
Execution requires the produced DLC and the provided utility library libQnnModelDlc.so. This library
extends the QNN Model API to compose a QNN graph and return its handle from a provided DLC path.
ModelError_t QnnModel_composeGraphsFromDlc(Qnn_BackendHandle_t backendHandle,
QNN_INTERFACE_VER_TYPE interface,
Qnn_ContextHandle_t contextHandle,
const GraphConfigInfo_t **graphsConfigInfo,
const char *dlcPath,
const uint32_t numGraphsConfigInfo,
GraphInfoPtr_t **graphsInfo,
uint32_t *numGraphsInfo,
bool debug,
QnnLog_Callback_t logCallback,
QnnLog_Level_t maxLogLevel)
This is identical to the QnnModel_composeGraphs API with the addition of the dlcPath
input argument. The returned QNN graph handle can then be finalized and executed.
The following section demonstrates execution with a DLC.
CPU Backend Execution¶
Execute on a Linux host¶
Execute the model using qnn-net-run with the libQnnModelDlc.so utility library as the --model argument and Inception_v3.dlc as the --dlc_path argument.

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
    --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnCpu.so \
    --model ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnModelDlc.so \
    --dlc_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc \
    --input_list data/cropped/raw_list.txt
The results will be located at ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output. View the results:

$ python ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output/ \
    -l data/imagenet_slim_labels.txt
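The files under output/ are typically raw float32 tensors, one Result_<N> directory per input. As a quick sanity check they can be dumped with od. The sketch below fabricates a tiny stand-in tensor, since the real file names depend on the model's output node:

```shell
# Sketch: inspect a raw float32 output tensor with od.
# The directory and file names below are illustrative stand-ins.
mkdir -p output/Result_0
# Write a single float32 value (1.0, little-endian bytes 00 00 80 3f) as fake output data.
printf '\000\000\200\077' > output/Result_0/Reshape_1.raw
# Dump the file as single-precision floats.
od -A d -f output/Result_0/Reshape_1.raw
```

For the real results, the show_inceptionv3_classifications.py script above performs this decoding and maps the highest-probability index to a label.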
Execute on Android¶
Running the CPU backend on an Android target is similar to running on a Linux x86 target.
Create a directory for the example on the Android device.
$ adb shell "mkdir /data/local/tmp/inception_v3"
Push the necessary libraries and DLC to the device.
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnCpu.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnModelDlc.so /data/local/tmp/inception_v3

Push the input data and lists to the device.

$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool to the device.

$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Set up the device environment.
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
Run qnn-net-run with the following arguments.

$ ./qnn-net-run --backend libQnnCpu.so --model libQnnModelDlc.so --dlc_path Inception_v3.dlc --input_list target_raw_list.txt
Outputs from the run will be located at the default ./output directory.
Exit the device and view the results.
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output_android/ \
    -l data/imagenet_slim_labels.txt
GPU Backend Execution¶
Execute on Android¶
Running the GPU backend on an Android target is similar to running the CPU backend on an Android target.
Create a directory for the example on the Android device.
$ adb shell "mkdir /data/local/tmp/inception_v3"
Push the necessary libraries and DLC to the device.
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGpu.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnModelDlc.so /data/local/tmp/inception_v3

Push the input data and lists to the device.

$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool to the device.

$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Set up the device environment.
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
Run qnn-net-run with the following arguments.

$ ./qnn-net-run --backend libQnnGpu.so --model libQnnModelDlc.so --dlc_path Inception_v3.dlc --input_list target_raw_list.txt
Outputs from the run will be located at the default ./output directory.
Exit the device and view the results.
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output_android/ \
    -l data/imagenet_slim_labels.txt
Execute on Windows¶
Please ensure the Model Build on Windows Host and Compiling for CPU on Windows sections have been completed before proceeding.
First, create the following folder on the Windows device: C:\dlc_qnn_test.
Now, copy the following files from the development host to the testing folder C:\dlc_qnn_test on the Windows device:
- ${QNN_SDK_ROOT}/bin/aarch64-windows-msvc/qnn-net-run.exe
- ${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnGpu.dll
- ${QNN_SDK_ROOT}/lib/aarch64-windows-msvc/QnnModelDlc.dll
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt
Finally, connect to the Windows device and use qnn-net-run.exe with the following command in PowerShell:
$ .\qnn-net-run.exe --backend .\QnnGpu.dll `
--model .\QnnModelDlc.dll `
--dlc_path .\Inception_v3.dlc `
--input_list .\target_raw_list.txt
This will produce the results at .\output.
To view the result:
First, create a directory ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package on the development host.
Copy the output folder from the Windows device back to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
Also copy the following files and directories to ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package.
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/imagenet_slim_labels.txt
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py
On the development host, open Developer PowerShell for VS 2022 to view the result:
$ cd ${QNN_SDK_ROOT}\tmp\qnn_inception_v3_test_package
$ py -3 .\show_inceptionv3_classifications.py `
-i .\cropped\raw_list.txt `
-o .\output `
-l .\imagenet_slim_labels.txt
HTP Backend Execution¶
Execute on a Linux host¶
Note
The HTP backend can be exercised on a Linux host using the HTP emulation backend.
Execute the model using qnn-net-run with the libQnnModelDlc.so utility library as the --model argument and Inception_v3_quantized.dlc as the --dlc_path argument.

$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
    --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
    --model ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnModelDlc.so \
    --dlc_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc \
    --input_list data/cropped/raw_list.txt
Note
The HTP Emulation backend requires a quantized model. For more information on quantization see Model Quantization.
The results will be located at ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output. View the results:

$ python ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output/ \
    -l data/imagenet_slim_labels.txt
Execute on Android¶
Running the HTP backend on an Android target is similar to running the CPU and GPU backends on an Android target, except the HTP backend requires a quantized model and a user-generated serialized context. For more information on quantization see Model Quantization.
Generate a serialized context from a DLC by running qnn-context-binary-generator with libQnnModelDlc.so as the --model argument and the quantized DLC as the --dlc_path argument.

$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-context-binary-generator \
    --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
    --model ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnModelDlc.so \
    --dlc_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc \
    --binary_file Inception_v3_quantized.serialized
The context will be created at ./output/Inception_v3_quantized.serialized.bin.

Create a directory for the example on the Android device.
$ adb shell "mkdir /data/local/tmp/inception_v3"
Push the necessary libraries and DLC to the device.
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV68Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin /data/local/tmp/inception_v3
Note
This section demonstrates HTP execution on Android using an offline-prepared graph. To execute an on-device (online) prepared graph instead, push the on-device prepare library and the quantized DLC.
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc /data/local/tmp/inception_v3
Push the input data and lists to the device.
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3 $ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool to the device.

$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Set up the device environment.
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3"
Run qnn-net-run with the following arguments.

$ ./qnn-net-run --backend libQnnHtp.so --input_list target_raw_list.txt --retrieve_context Inception_v3_quantized.serialized.bin
Outputs from the run will be located at the default ./output directory.
Exit the device and view the results.
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output_android/ \
    -l data/imagenet_slim_labels.txt