Tutorial - Using TFLite Delegate With a C/C++ Application¶
This page outlines the setup required to use the Qualcomm® AI Engine Direct
Delegate. Currently, only C APIs are provided, and Aarch64-Android is supported.
For other platforms or architectures, please check the libraries under the lib directory.
Prerequisites¶
Qualcomm® AI Engine Direct libraries, which are in the lib directory of the Qualcomm® AI Engine Direct SDK. We use $QNN_SDK_ROOT for the root of the unzipped Qualcomm® AI Engine Direct SDK.
The TensorFlow Lite Android C API integrated into a C/C++ Android application. Check the release notes for the supported TensorFlow version.
A Qualcomm device with Android Debug Bridge (ADB) connected.
Qualcomm® AI Engine Direct Delegate artifacts.
C/C++ Application¶
There are currently two methods for integrating the Qualcomm® AI Engine Direct Delegate into a C/C++ application.
Using the Qualcomm® AI Engine Direct Delegate Interface.
Using the TFLite External Delegate Interface.
Qualcomm® AI Engine Direct Delegate Interface¶
The Qualcomm® AI Engine Direct Delegate Interface is the fastest way to integrate the delegate into an existing application that uses TFLite. To use the Qualcomm® AI Engine Direct Delegate interface, the C/C++ application must include the QnnTFLiteDelegate.h file and dynamically link/load the libQnnTFLiteDelegate.so library.
The following code snippet shows how an application can instantiate the delegate and perform inference:
#include "QNN/TFLiteDelegate/QnnTFLiteDelegate.h"
// Setup interpreter with .tflite model.
// Create QNN Delegate options structure.
TfLiteQnnDelegateOptions options = TfLiteQnnDelegateOptionsDefault();
// Set the mandatory backend_type option. All other options have default values.
options.backend_type = kHtpBackend;
// Instantiate delegate. Must not be freed until interpreter is freed.
TfLiteDelegate* delegate = TfLiteQnnDelegateCreate(&options);
// Register QNN Delegate with TfLite interpreter to automatically delegate nodes.
interpreter->ModifyGraphWithDelegate(delegate);
// Perform inference with interpreter as usual.
// Delete delegate after interpreter no longer needed.
TfLiteQnnDelegateDelete(delegate);
First, the TfLiteQnnDelegateOptionsDefault() function returns a structure
containing all of the delegate's options, populated with default values. The
TfLiteQnnDelegateOptions.backend_type option is mandatory and is set to the HTP backend in this example.
The options structure is used to instantiate the delegate by calling the TfLiteQnnDelegateCreate()
function. Next, the delegate object is passed into the interpreter to delegate
the model’s nodes. Finally, the delegate can be deleted using the
TfLiteQnnDelegateDelete() function once no longer required. See the
C Interface page for more information on all enums,
structures, and functions.
The Qualcomm® AI Engine Direct Delegate Interface is built with Android NDK r25c.
We recommend avoiding flags such as -fshort-enums so that the application conforms to the same application binary interface (ABI).
Note that this interface is preferred over the External Delegate Interface.
External Delegate Interface¶
Warning
Do not mix the Qualcomm® AI Engine Direct Delegate Interface and the External Delegate Interface in a single process; doing so can confuse Qualcomm® AI Engine Direct Delegate instances. This implies that the C APIs in QnnTFLiteDelegate.h cannot be used by an application that uses the External Delegate Interface.
The TFLite External Delegate is a wrapper that can dynamically load other delegates. With the External Delegate interface, there is no static dependency between the C/C++ application and the delegate being loaded, so a change to the delegate does not require recompiling the application. Since the Qualcomm® AI Engine Direct Delegate is dynamically loaded in this method, one drawback is that all delegate options must be passed as key-value strings. More information on TFLite’s External Delegate mechanism can be found on the TFLite External Delegate GitHub page.
In addition to the TensorFlow Lite Android C API, the External Delegate has its own header file and library that must be integrated into the application. Set up the TensorFlow source environment and all required tools and dependencies. Again, check the release notes for the supported TensorFlow version. To build the External Delegate, run the following command:
bazel build -c opt --config=android_arm64 tensorflow/lite/delegates/external:external_delegate
This will create the external_delegate.h header file as well as the libexternal_delegate.so shared library. These need to be integrated into the C/C++ Android application in the same way as the TFLite Android C API.
The same libQnnTFLiteDelegate.so library is also compatible with the External Delegate interface. The Tutorials section uses the External Delegate method in order to leverage precompiled TFLite applications.
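As an illustration of using a precompiled TFLite application, TFLite's benchmark_model tool accepts external delegate flags on its command line. The command below is a sketch: it assumes benchmark_model and a model have already been pushed to the device, and that the delegate library sits at the example path used in the On Device Environment Setup section.

```shell
# Hypothetical invocation: load the QNN Delegate into TFLite's prebuilt
# benchmark_model via the External Delegate mechanism. Options are passed
# as key:value strings, matching the keys used in the code below.
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --external_delegate_path=/data/local/tmp/qnn_delegate/libQnnTFLiteDelegate.so \
  --external_delegate_options="backend_type:htp"
```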
The code snippet below shows how to load the Qualcomm® AI Engine Direct Delegate through the External Delegate mechanism and run inference. Notice that the External Delegate header is included in place of the Qualcomm® AI Engine Direct Delegate header.
#include "tensorflow/lite/delegates/external/external_delegate.h"
// Setup interpreter with .tflite model.
// Create external delegate option and pass the QNN Delegate library
// location so it can be dynamically loaded. Here we assume the delegate is
// located at the path /data/local/tmp/qnn_delegate/.
TfLiteExternalDelegateOptions external_delegate_options =
TfLiteExternalDelegateOptionsDefault("/data/local/tmp/qnn_delegate/libQnnTFLiteDelegate.so");
// Add QNN Delegate specific options.
external_delegate_options.insert(&external_delegate_options, "backend_type", "htp");
// Create the External Delegate. This will load the QNN Delegate.
TfLiteDelegate* external_delegate = TfLiteExternalDelegateCreate(&external_delegate_options);
// Add External Delegate into TFLite Interpreter to automatically delegate nodes.
interpreter->ModifyGraphWithDelegate(external_delegate);
// Perform inference with Interpreter using TFLite as usual.
// Delete external delegate at end of app.
TfLiteExternalDelegateDelete(external_delegate);
Instead of directly instantiating a Qualcomm® AI Engine Direct Delegate object, the application creates an External Delegate. The External Delegate acts as a wrapper that dynamically loads the Qualcomm® AI Engine Direct Delegate library and all relevant symbols. The Qualcomm® AI Engine Direct Delegate-specific options are passed as key-value strings. See the External Delegate Options for a list of all available option strings.
On Device Environment Setup¶
Once an application has integrated the Qualcomm® AI Engine Direct Delegate, either using the
Qualcomm® AI Engine Direct Delegate interface or the External Delegate interface, the required libraries
must be pushed from the SDK to the device, using ADB. The path
/data/local/tmp/qnn_delegate will be used as an example here.
$ adb shell mkdir -p /data/local/tmp/qnn_delegate
$ # Push the QNN Delegate library
$ adb push $QNN_SDK_ROOT/lib/aarch64-android/libQnnTFLiteDelegate.so /data/local/tmp/qnn_delegate
Next, push the Qualcomm® AI Engine Direct backend libraries. This page describes the supported libraries;
you may push all or a subset of them to the device. Please see the qtld-net-run tutorial.
$ # Push QNN backend ARM libraries
$ adb push $QNN_SDK_ROOT/lib/aarch64-android/libQnn*.so /data/local/tmp/qnn_delegate/
$ # For Hexagon backends (e.g. QNN HTP), push the Skel library:
$ adb push $QNN_SDK_ROOT/lib/hexagon-v<version>/unsigned/libQnn*.so /data/local/tmp/qnn_delegate/
The following environment variables must also be set on the device each time the delegate is invoked. Note the quotes when setting ADSP_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=/data/local/tmp/qnn_delegate/:$LD_LIBRARY_PATH
$ export ADSP_LIBRARY_PATH="/data/local/tmp/qnn_delegate/"
Note that ADSP_LIBRARY_PATH needs to be set only if the Skel library is
required. For convenience, the delegate provides the
TfLiteQnnDelegateOptions.skel_library_dir option, which can set the
ADSP_LIBRARY_PATH variable for the user.
See the Frequently Asked Questions page for common issues.
Concurrent Setup¶
A delegate instance can only be used with a single interpreter. However, it is possible to have multiple interpreters, each with its own delegate instance, in a single process.
Warning
Due to hardware resource limitations, the number of concurrent HTP or DSP backend instances varies with the chip. If you encounter RPC-related errors, reduce the number of concurrent HTP or DSP delegate instances.
For example,
#include "QNN/TFLiteDelegate/QnnTFLiteDelegate.h"
// Setup interpreter_0 with model_0.tflite.
// ...
// Create QNN Delegate options structure.
TfLiteQnnDelegateOptions options_0 = TfLiteQnnDelegateOptionsDefault();
// Set the mandatory backend_type option. All other options have default values.
options_0.backend_type = kHtpBackend;
TfLiteDelegate* delegate_0 = TfLiteQnnDelegateCreate(&options_0);
// Register delegate_0 with interpreter_0.
// Note that after this line, delegate_0 cannot be applied for any other
// interpreter.
interpreter_0->ModifyGraphWithDelegate(delegate_0);
// Perform inference with interpreter_0 as usual.
// We can have another delegate_1 for another interpreter_1
// Setup interpreter_1 with model_1.tflite.
// ...
// Create another delegate_1.
TfLiteQnnDelegateOptions options_1 = TfLiteQnnDelegateOptionsDefault();
options_1.backend_type = kHtpBackend;
TfLiteDelegate* delegate_1 = TfLiteQnnDelegateCreate(&options_1);
interpreter_1->ModifyGraphWithDelegate(delegate_1);
// Perform inference with interpreter_0 and interpreter_1 as usual.