Sample App Tutorial

Introduction

This tutorial describes how to build a C++ application using QNN APIs that can execute models created with one of the QNN converters on a Linux host or an Android device, and it walks through the workings of qnn-sample-app.

Warning

The qnn-sample-app is subject to change without notice.

qnn-sample-app is an example C++ application available with the SDK at ${QNN_SDK_ROOT}/examples/QNN/SampleApp, where QNN_SDK_ROOT is the path to the extracted QNN SDK. This tutorial navigates through the source code of qnn-sample-app, showing how QNN APIs are used to execute a model.

For creating a C++ application based on QNN APIs, we recommend the following pattern:

  1. Loading pre-requisite shared libraries

  2. Usage of QNN APIs

  3. Building and running qnn-sample-app

Guide for navigating through qnn-sample-app source code

Things to note before we proceed to navigate through the source code:
  1. Any return statement returning a value of the form StatusCode::xxxx refers to an enum named StatusCode that contains the application's return codes.

  2. QNN_xxxx macros are used for logging various debug messages.
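
As an illustration, here is a minimal, self-contained sketch of that pattern; the enum members and the QNN_ERROR definition below are simplified stand-ins, not the actual definitions from the sample app sources:

```cpp
#include <cstdio>

// Simplified stand-in for the sample app's StatusCode enum; the real enum
// contains more members, one per failure mode.
enum class StatusCode {
  SUCCESS,
  FAILURE,
  FAIL_LOAD_BACKEND,
  FAIL_LOAD_MODEL,
  FAIL_SYM_FUNCTION
};

// Simplified stand-in for the QNN_xxxx logging macros, which wrap formatted
// logging at a given severity.
#define QNN_ERROR(fmt, ...) std::fprintf(stderr, "[ ERROR ] " fmt "\n", ##__VA_ARGS__)

// The pattern used throughout the tutorial: log on failure, then return an
// appropriate StatusCode value.
StatusCode loadBackend(bool loaded) {
  if (!loaded) {
    QNN_ERROR("Unable to load backend %s", "libQnnSampleBackend.so");
    return StatusCode::FAIL_LOAD_BACKEND;
  }
  return StatusCode::SUCCESS;
}
```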

Loading pre-requisite shared libraries

QNN SDK provides various shared libraries to access backends, and applications have to load them as needed to execute a network.

A network in QNN can be created in two ways:
  1. It can be directly built into the user application by using QNN APIs.

  2. QNN converters can be used to produce a shared library of a QNN network.

qnn-sample-app makes use of the second option above. This network can be produced using one of the QNN converters available in the SDK, and further be compiled into a shared library using qnn-model-lib-generator.

Note

For Windows users, please replace all ‘.so’ files with the analogous ‘.dll’ file in the following sections. Please refer to Platform Differences for more details.

Loading a backend

Shared libraries for various backends including CPU, GPU, HTP, and DSP are available in the QNN SDK. Every backend that implements QNN APIs exposes all necessary symbols that can be accessed using dynamic loading mechanism.

Let’s consider a sample backend shared library named libQnnSampleBackend.so, which can be dynamically loaded as shown below:

void* libBackendHandle = pal::dynamicloading::dlOpen(
    "libQnnSampleBackend.so", pal::dynamicloading::DL_NOW | pal::dynamicloading::DL_LOCAL);

if (nullptr == libBackendHandle) {
  QNN_ERROR("Unable to load backend. pal::dynamicloading::dlError(): %s",
            pal::dynamicloading::dlError());
  return StatusCode::FAIL_LOAD_BACKEND;
}

To load a model as a shared library, let’s consider a sample model shared library named libQnnSampleModel.so, which can be dynamically loaded as shown below:

void* libModelHandle = pal::dynamicloading::dlOpen(
    "libQnnSampleModel.so", pal::dynamicloading::DL_NOW | pal::dynamicloading::DL_LOCAL);

if (nullptr == libModelHandle) {
  QNN_ERROR("Unable to load model. pal::dynamicloading::dlError(): %s",
            pal::dynamicloading::dlError());
  return StatusCode::FAIL_LOAD_MODEL;
}

Optionally, to create a context from a cached binary and execute its graphs, applications can make use of the QnnSystem API to retrieve metadata associated with the context. The QnnSystem API can be accessed by loading the libQnnSystem.so shared library as shown below:

void* systemLibraryHandle = pal::dynamicloading::dlOpen(
    "libQnnSystem.so", pal::dynamicloading::DL_NOW | pal::dynamicloading::DL_LOCAL);

if (nullptr == systemLibraryHandle) {
  QNN_ERROR("Unable to load system library. pal::dynamicloading::dlError(): %s",
            pal::dynamicloading::dlError());
  return StatusCode::FAIL_LOAD_SYSTEM_LIB;
}

Resolving symbols in shared libraries

After the shared libraries are successfully loaded, we can proceed to resolve all necessary symbols to access QNN APIs.

The below code snippet shows a template to resolve a symbol in a shared library:

// A generic function to resolve symbols in a library
template <class T>
static inline T resolveSymbol(void* libHandle, const char* symName) {
  T ptr = (T)pal::dynamicloading::dlSym(libHandle, symName);
  if (ptr == nullptr) {
    QNN_ERROR("Unable to access symbol [%s]. pal::dynamicloading::dlError(): %s",
              symName, pal::dynamicloading::dlError());
  }
  return ptr;
}

// Template for resolving a function of type SampleFnHandleType_t
typedef ReturnType_t (*SampleFnHandleType_t)(FunctionParameterTypes_t ...);
SampleFnHandleType_t sampleFnHandle = nullptr;
sampleFnHandle = resolveSymbol<SampleFnHandleType_t>(libBackendHandle, "QnnSample_API");
if (nullptr == sampleFnHandle) {
  // Error code indicating failure in symbol resolution
  return StatusCode::FAIL_SYM_FUNCTION;
}
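
The pal::dynamicloading wrappers used above map onto the standard POSIX dlopen/dlsym/dlerror calls. As a self-contained illustration of the same resolution template, the sketch below resolves a symbol from the C math library instead of a backend; the soname libm.so.6 is an assumption that holds on typical glibc-based Linux systems:

```cpp
#include <cstdio>
#include <dlfcn.h>

// Same shape as the resolveSymbol template above, but calling POSIX dlsym
// directly instead of the pal::dynamicloading wrapper.
template <class T>
static inline T resolveSymbol(void* libHandle, const char* symName) {
  T ptr = reinterpret_cast<T>(dlsym(libHandle, symName));
  if (ptr == nullptr) {
    std::fprintf(stderr, "Unable to access symbol [%s]: %s\n", symName, dlerror());
  }
  return ptr;
}

// Load the math library, resolve cos() through the template, call it, and
// clean up. Returns 0.0 on any loading failure.
double callCos(double x) {
  void* handle = dlopen("libm.so.6", RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) {
    std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return 0.0;
  }
  typedef double (*CosFn_t)(double);
  CosFn_t cosFn = resolveSymbol<CosFn_t>(handle, "cos");
  double result = (cosFn != nullptr) ? cosFn(x) : 0.0;
  dlclose(handle);
  return result;
}
```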

The below code snippet shows an example of how to resolve an actual QNN API:

/* Resolve the symbol for the
   Qnn_ErrorHandle_t QnnInterface_getProviders(const QnnInterface_t*** providerList,
                                               uint32_t* numProviders)
   API */
typedef Qnn_ErrorHandle_t (*QnnInterfaceGetProvidersFn_t)(const QnnInterface_t*** providerList,
                                                          uint32_t* numProviders);

QnnInterfaceGetProvidersFn_t getInterfaceProviders{nullptr};

getInterfaceProviders =
    resolveSymbol<QnnInterfaceGetProvidersFn_t>(libBackendHandle, "QnnInterface_getProviders");
if (nullptr == getInterfaceProviders) {
  return StatusCode::FAIL_SYM_FUNCTION;
}

In qnn-sample-app source code, all necessary symbols are resolved and stored in a struct of type QnnFunctionPointers shown below:

typedef struct QnnFunctionPointers {
  // APIs from model output from converters
  // QnnModel_composeGraphs
  ComposeGraphsFnHandleType_t composeGraphsFnHandle;
  // QnnModel_freeGraphsInfo
  FreeGraphInfoFnHandleType_t freeGraphInfoFnHandle;
  // QNN Interface function table containing pointers to all necessary QNN APIs
  // in a backend
  QNN_INTERFACE_VER_TYPE qnnInterface;
  // QNN System Interface function table containing pointers to all QNN System APIs
  QNN_SYSTEM_INTERFACE_VER_TYPE qnnSystemInterface;
} QnnFunctionPointers;

The above structure can be found in ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/src/SampleApp.hpp. The rest of the tutorial will assume a variable named m_qnnFunctionPointers of type QnnFunctionPointers that contains valid function pointers.

Usage of QNN APIs

This section demonstrates the usage of QNN APIs in a client application.

Use QNN Interface to obtain function pointers

The QNN Interface mechanism sets up a table of function pointers to the QNN APIs in a backend, avoiding the need to manually resolve a symbol for each and every API. QNN Interface can be used as below:

QnnInterface_t** interfaceProviders{nullptr};
uint32_t numProviders{0};
// Query for all available interfaces
if (QNN_SUCCESS !=
    getInterfaceProviders((const QnnInterface_t***)&interfaceProviders, &numProviders)) {
  QNN_ERROR("Failed to get interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
// Check for validity of returned interfaces
if (nullptr == interfaceProviders) {
  QNN_ERROR("Failed to get interface providers: null interface providers received.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
if (0 == numProviders) {
  QNN_ERROR("Failed to get interface providers: 0 interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
bool foundValidInterface{false};
// Loop through all available interface providers and pick the one that suits
// the current API version
for (size_t pIdx = 0; pIdx < numProviders; pIdx++) {
  if (QNN_API_VERSION_MAJOR == interfaceProviders[pIdx]->apiVersion.coreApiVersion.major &&
      QNN_API_VERSION_MINOR <= interfaceProviders[pIdx]->apiVersion.coreApiVersion.minor) {
    foundValidInterface                = true;
    m_qnnFunctionPointers.qnnInterface = interfaceProviders[pIdx]->QNN_INTERFACE_VER_NAME;
    break;
  }
}
if (!foundValidInterface) {
  QNN_ERROR("Unable to find a valid interface.");
  libBackendHandle = nullptr;
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}

QNN System Interface can be used to resolve all symbols related to QNN System APIs as shown below:

typedef Qnn_ErrorHandle_t (*QnnSystemInterfaceGetProvidersFn_t)(
    const QnnSystemInterface_t*** providerList, uint32_t* numProviders);

QnnSystemInterfaceGetProvidersFn_t getSystemInterfaceProviders{nullptr};
getSystemInterfaceProviders = resolveSymbol<QnnSystemInterfaceGetProvidersFn_t>(
    systemLibraryHandle, "QnnSystemInterface_getProviders");
if (nullptr == getSystemInterfaceProviders) {
  return StatusCode::FAIL_SYM_FUNCTION;
}
QnnSystemInterface_t** systemInterfaceProviders{nullptr};
uint32_t numProviders{0};
if (QNN_SUCCESS != getSystemInterfaceProviders(
                     (const QnnSystemInterface_t***)&systemInterfaceProviders, &numProviders)) {
  QNN_ERROR("Failed to get system interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
if (nullptr == systemInterfaceProviders) {
  QNN_ERROR("Failed to get system interface providers: null interface providers received.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
if (0 == numProviders) {
  QNN_ERROR("Failed to get system interface providers: 0 interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
bool foundValidSystemInterface{false};
for (size_t pIdx = 0; pIdx < numProviders; pIdx++) {
  if (QNN_SYSTEM_API_VERSION_MAJOR == systemInterfaceProviders[pIdx]->systemApiVersion.major &&
      QNN_SYSTEM_API_VERSION_MINOR <= systemInterfaceProviders[pIdx]->systemApiVersion.minor) {
    foundValidSystemInterface = true;
    m_qnnFunctionPointers.qnnSystemInterface =
        systemInterfaceProviders[pIdx]->QNN_SYSTEM_INTERFACE_VER_NAME;
    break;
  }
}

Set up logging

Logging can be set up after a backend shared library has been dynamically loaded and before the backend is initialized.

To initialize logging, a callback of type QnnLog_Callback_t has to be defined. An example is defined below:

void logStdoutCallback(const char* fmt,
                       QnnLog_Level_t level,
                       uint64_t timestamp,
                       va_list argp) {
  const char* levelStr = "";
  switch (level) {
  case QNN_LOG_LEVEL_ERROR:
    levelStr = " ERROR ";
    break;
  case QNN_LOG_LEVEL_WARN:
    levelStr = "WARNING";
    break;
  case QNN_LOG_LEVEL_INFO:
    levelStr = "  INFO ";
    break;
  case QNN_LOG_LEVEL_DEBUG:
    levelStr = " DEBUG ";
    break;
  case QNN_LOG_LEVEL_VERBOSE:
    levelStr = "VERBOSE";
    break;
  case QNN_LOG_LEVEL_MAX:
    levelStr = "UNKNOWN";
    break;
  }
  // Convert the timestamp (assumed to be in nanoseconds) to milliseconds for display
  double ms = (double)timestamp / 1000000.0;
  fprintf(stdout, "%8.1fms [%-7s] ", ms, levelStr);
  vfprintf(stdout, fmt, argp);
  fprintf(stdout, "\n");
}

The above callback can be registered with the backend along with a maximum log level. Sample code to initialize with a max log level of QNN_LOG_LEVEL_INFO:

Qnn_LogHandle_t logHandle;
if (QNN_SUCCESS !=
      m_qnnFunctionPointers.qnnInterface.logCreate(logStdoutCallback, QNN_LOG_LEVEL_INFO, &logHandle)) {
  QNN_ERROR("Unable to initialize logging in the backend.");
  return StatusCode::FAILURE;
}

Initialize backend

Once logging has been successfully initialized, the backend can be initialized as shown below:

Qnn_BackendHandle_t backendHandle;
/* Set up any necessary backend configurations; leaving the pointer null
   yields an empty, null-terminated configuration list */
const QnnBackend_Config_t* backendConfigs = nullptr;
if (QNN_BACKEND_NO_ERROR != m_qnnFunctionPointers.qnnInterface.backendCreate(logHandle,
                                                                             &backendConfigs,
                                                                             &backendHandle)) {
  QNN_ERROR("Could not initialize backend");
  return StatusCode::FAILURE;
}

Initialize Profiling

If profiling is desired, after the backend is initialized, a profile handle can be set up. This profile handle can be used at a later point in any API that supports profiling.

A profile handle can be created in the backend with basic profiling level as shown below:

Qnn_ProfileHandle_t profileHandle;
if (QNN_PROFILE_NO_ERROR != m_qnnFunctionPointers.qnnInterface.profileCreate(
                                  backendHandle, QNN_PROFILE_LEVEL_BASIC, &profileHandle)) {
  QNN_WARN("Unable to create profile handle in the backend.");
  return StatusCode::FAILURE;
}

Create device

A device can be created as shown below:

Qnn_DeviceHandle_t deviceHandle{nullptr};
const QnnDevice_Config_t* devConfigArray[] = {&devConfig, nullptr};
Qnn_ErrorHandle_t ret =
    m_qnnFunctionPointers.qnnInterface.deviceCreate(logHandle, devConfigArray, &deviceHandle);
if (QNN_SUCCESS != ret) {
  QNN_ERROR("Failed to create device: %lu", static_cast<unsigned long>(ret));
  return StatusCode::FAILURE;
}

Set devConfig as defined in the QNN HTP Backend API documentation.

Register op packages

Op packages are a way to supply libraries containing ops to backends. They can be registered as shown below:

uint32_t opPackageCount = 0;  // number of op packages to register
std::vector<std::string> opPackagePath(opPackageCount);
std::vector<std::string> opPackageInterfaceProvider(opPackageCount);
/* Set up required op package paths and interface providers as necessary */
for (uint32_t idx = 0; idx < opPackageCount; idx++) {
  if (QNN_BACKEND_NO_ERROR !=
        m_qnnFunctionPointers.qnnInterface.backendRegisterOpPackage(
            backendHandle,
            opPackagePath[idx].c_str(),
            opPackageInterfaceProvider[idx].c_str())) {
    QNN_ERROR("Could not register Op Package: %s and interface provider: %s",
              opPackagePath[idx].c_str(),
              opPackageInterfaceProvider[idx].c_str());
    return StatusCode::FAILURE;
  }
}

Create context

A context can be created in a backend as shown below:

Qnn_ContextHandle_t context;
// Device handle obtained from deviceCreate (nullptr selects the default device)
Qnn_DeviceHandle_t deviceHandle{nullptr};
/* Set up any context configs that are necessary */
const QnnContext_Config_t* contextConfigs = nullptr;
if (QNN_CONTEXT_NO_ERROR !=
      m_qnnFunctionPointers.qnnInterface.contextCreate(backendHandle,
                                                       deviceHandle,
                                                       &contextConfigs,
                                                       &context)) {
  QNN_ERROR("Could not create context");
  return StatusCode::FAILURE;
}

Prepare graphs

qnn-sample-app relies on the output from one of the converters to create a QNN network in the backend. composeGraphsFnHandle is mapped to the QnnModel_composeGraphs API in the model shared library, which takes qnn_wrapper_api::GraphInfo_t*** as one of its parameters. composeGraphsFnHandle makes the calls to the backend necessary to create the network(s). It also writes into the graphsInfo structure all the information required to execute a graph, such as details about the graph's input and output tensors, as shown in the following code block:

/* Structure to retrieve information about graphs, like graph name and
   details about input and output tensors present in libQnnSampleModel.so */
qnn_wrapper_api::GraphInfo_t** graphsInfo = nullptr;
// No. of graphs present in libQnnSampleModel.so
uint32_t graphsCount = 0;
// true to enable intermediate outputs, false for network outputs only
bool debug = false;
if (qnn_wrapper_api::ModelError_t::MODEL_NO_ERROR !=
        m_qnnFunctionPointers.composeGraphsFnHandle(backendHandle,
                                                    m_qnnFunctionPointers.qnnInterface,
                                                    context,
                                                    &graphsInfo,
                                                    &graphsCount,
                                                    debug)) {
  QNN_ERROR("Failed in composeGraphs()");
  return StatusCode::FAILURE;
}

At this point, the context will contain all the graphs that were present in libQnnSampleModel.so.

Finalize Graphs

Graphs that were added in the previous step can be finalized as shown below:

// Information about graphs obtained in the previous step
qnn_wrapper_api::GraphInfo_t** graphsInfo;
// No. of graphs obtained in the previous step
uint32_t graphsCount;
/* A valid profile handle if profiling is desired,
   nullptr if profiling is not needed */
Qnn_ProfileHandle_t profileHandle;

for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
  if (QNN_GRAPH_NO_ERROR !=
      m_qnnFunctionPointers.qnnInterface.graphFinalize(
          (*graphsInfo)[graphIdx].graph, profileHandle, nullptr)) {
    return StatusCode::FAILURE;
  }
  /* Extract profiling information if desired and if a valid handle was supplied to the
     finalize graphs API */
}

Save context into a binary

After all the graphs in a context are finalized, the user application may choose to save the context into a binary for future use. A saved context can later be reloaded and its graphs executed without having to finalize them again, which saves considerable initialization time when executing a network.

The context can be saved as shown below:

// Get the expected size of the buffer from the backend in which the context can be saved
uint64_t requiredBufferSize{0};
if (QNN_CONTEXT_NO_ERROR !=
    m_qnnFunctionPointers.qnnInterface.contextGetBinarySize(context, &requiredBufferSize)) {
  QNN_ERROR("Could not get the required binary size.");
  return StatusCode::FAILURE;
}

// Allocate a buffer of the required size
uint8_t* saveBuffer = (uint8_t*)malloc(requiredBufferSize * sizeof(uint8_t));
if (nullptr == saveBuffer) {
  QNN_ERROR("Could not allocate buffer to save binary.");
  return StatusCode::FAILURE;
}

auto status = StatusCode::SUCCESS;
uint64_t writtenBufferSize{0};
// Pass the allocated buffer and obtain a copy of the context binary written into the buffer
if (QNN_CONTEXT_NO_ERROR !=
    m_qnnFunctionPointers.qnnInterface.contextGetBinary(context,
                                                        reinterpret_cast<void*>(saveBuffer),
                                                        requiredBufferSize,
                                                        &writtenBufferSize)) {
  QNN_ERROR("Could not get binary.");
  status = StatusCode::FAILURE;
}

// Check that the amount of data written by the backend does not exceed the supplied buffer
if (requiredBufferSize < writtenBufferSize) {
  QNN_ERROR(
      "Illegal written buffer size [%llu] bytes. Cannot exceed allocated memory of [%llu] bytes",
      (unsigned long long)writtenBufferSize,
      (unsigned long long)requiredBufferSize);
  status = StatusCode::FAILURE;
}

// Use the caching utility to save metadata along with the binary buffer from the backend
if (status == StatusCode::SUCCESS &&
    tools::datautil::StatusCode::SUCCESS != tools::datautil::writeBinaryToFile(outputPath,
                                                                               saveBinaryName + ".bin",
                                                                               (uint8_t*)saveBuffer,
                                                                               writtenBufferSize)) {
  QNN_ERROR("Could not serialize to file.");
  status = StatusCode::FAILURE;
}
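
The tools::datautil::writeBinaryToFile helper above ships with the sample app sources. Conceptually, it just streams the buffer to a file on disk; a simplified, self-contained sketch of such a helper (the signature here is illustrative, not the SDK's) might look like:

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Illustrative stand-in for a binary-serialization utility: write bufferSize
// bytes from buffer into outputDir/fileName. Returns true on success.
bool writeBinaryToFile(const std::string& outputDir,
                       const std::string& fileName,
                       const uint8_t* buffer,
                       uint64_t bufferSize) {
  if (buffer == nullptr) {
    return false;
  }
  std::ofstream os(outputDir + "/" + fileName, std::ofstream::binary);
  if (!os) {
    return false;
  }
  os.write(reinterpret_cast<const char*>(buffer), static_cast<std::streamsize>(bufferSize));
  return os.good();
}
```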

Load context from a cached binary

A context that was saved into a binary, as in the previous step, can be loaded as an alternative to creating a new context every time. The code snippet below demonstrates this step:

auto returnStatus = StatusCode::SUCCESS;
// bufferSize is assumed to hold the size in bytes of the cached binary file
// at cachedBinaryPath, determined beforehand
std::shared_ptr<uint8_t> buffer{nullptr};
uint32_t graphsCount{0};
buffer = std::shared_ptr<uint8_t>(new uint8_t[bufferSize], std::default_delete<uint8_t[]>());
if (!buffer) {
  QNN_ERROR("Failed to allocate memory.");
  return StatusCode::FAILURE;
}

if (tools::datautil::StatusCode::SUCCESS !=
    tools::datautil::readBinaryFromFile(
        cachedBinaryPath, reinterpret_cast<uint8_t*>(buffer.get()), bufferSize)) {
  QNN_ERROR("Failed to read binary file.");
  returnStatus = StatusCode::FAILURE;
}

/* Create a QnnSystemContext handle to access system context APIs. */
QnnSystemContext_Handle_t sysCtxHandle{nullptr};
if (QNN_SUCCESS != m_qnnFunctionPointers.qnnSystemInterface.systemContextCreate(&sysCtxHandle)) {
  QNN_ERROR("Could not create system handle.");
  returnStatus = StatusCode::FAILURE;
}

/* Retrieve metadata from the context binary through the QNN System Context API. */
const QnnSystemContext_BinaryInfo_t* binaryInfo{nullptr};
uint64_t binaryInfoSize{0};
if (StatusCode::SUCCESS == returnStatus &&
    QNN_SUCCESS != m_qnnFunctionPointers.qnnSystemInterface.systemContextGetBinaryInfo(
                     sysCtxHandle,
                     static_cast<void*>(buffer.get()),
                     bufferSize,
                     &binaryInfo,
                     &binaryInfoSize)) {
  QNN_ERROR("Failed to get context binary info");
  returnStatus = StatusCode::FAILURE;
}

qnn_wrapper_api::GraphInfo_t** graphsInfo = nullptr;
/* Make a copy of the metadata. */
if (StatusCode::SUCCESS == returnStatus &&
    !copyMetadataToGraphsInfo(binaryInfo, graphsInfo, graphsCount)) {
  QNN_ERROR("Failed to copy metadata.");
  returnStatus = StatusCode::FAILURE;
}

/* Release resources associated with the previously created QnnSystemContext handle. */
m_qnnFunctionPointers.qnnSystemInterface.systemContextFree(sysCtxHandle);
sysCtxHandle = nullptr;

/* buffer contains the binary data that was previously obtained from a backend. Pass this
   cached binary data to the backend to recreate the same context. */
if (StatusCode::SUCCESS == returnStatus &&
    QNN_SUCCESS !=
        m_qnnFunctionPointers.qnnInterface.contextCreateFromBinary(backendHandle,
                                                                   deviceHandle,
                                                                   (const QnnContext_Config_t**)&contextConfig,
                                                                   static_cast<void*>(buffer.get()),
                                                                   bufferSize,
                                                                   &context,
                                                                   profileHandle)) {
  QNN_ERROR("Could not create context from binary.");
  returnStatus = StatusCode::FAILURE;
}

// Optionally, extract profiling numbers if desired
if (ProfilingLevel::OFF != m_profilingLevel) {
  extractBackendProfilingInfo(profileHandle);
}

/* Obtain and save graph handles for each graph present in the context, based on the graph
   names saved in the metadata */
if (StatusCode::SUCCESS == returnStatus) {
  for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
    if (QNN_SUCCESS !=
        m_qnnFunctionPointers.qnnInterface.graphRetrieve(
            context, (*graphsInfo)[graphIdx].graphName, &((*graphsInfo)[graphIdx].graph))) {
      QNN_ERROR("Unable to retrieve graph handle for graph idx: %d", graphIdx);
      returnStatus = StatusCode::FAILURE;
    }
  }
}

Execute graphs

After a context has been created and its graphs added and finalized, or after a context has been retrieved from a binary, one or more graphs in the context can be executed.

Executing a graph involves:

  1. Setting up input and output tensors.

  2. Populating input data into input tensors.

  3. Calling the execute method in the backend.

  4. Obtaining outputs and saving them.

This is demonstrated using the code snippet below:

// Loop over all graphs in this context; each iteration executes one graph
for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
  QNN_DEBUG("Starting execution for graphIdx: %d", graphIdx);
  Qnn_Tensor_t* inputs  = nullptr;
  Qnn_Tensor_t* outputs = nullptr;
  // IOTensor utility is used to set up input and output tensor structures
  if (iotensor::StatusCode::SUCCESS !=
        ioTensor.setupInputAndOutputTensors(&inputs, &outputs, (*graphsInfo)[graphIdx])) {
    QNN_ERROR("Error in setting up input and output tensors for graphIdx: %d", graphIdx);
    returnStatus = StatusCode::FAILURE;
    break;
  }

  // Grab input raw file paths to read input data
  auto inputFileList = inputFileLists[graphIdx];
  auto graphInfo     = (*graphsInfo)[graphIdx];
  if (!inputFileList.empty()) {
    /* qnn-sample-app reads data based on the batch size until the whole buffer is filled.
       If there isn't sufficient data, it pads the rest with zeroes. */
    size_t totalCount = inputFileList[0].size();
    while (!inputFileList[0].empty()) {
      size_t startIdx = (totalCount - inputFileList[0].size());

      // IOTensor utility is used to populate input tensors with input data
      if (iotensor::StatusCode::SUCCESS !=
            ioTensor.populateInputTensors(
              graphIdx, inputFileList, inputs, graphInfo, inputDataType)) {
        returnStatus = StatusCode::FAILURE;
      }

      if (StatusCode::SUCCESS == returnStatus) {
        // Execute the graph in the backend with an optional profile handle
        QNN_DEBUG("Successfully populated input tensors for graphIdx: %d", graphIdx);
        Qnn_ErrorHandle_t executeStatus = QNN_GRAPH_NO_ERROR;
        executeStatus = m_qnnFunctionPointers.qnnInterface.graphExecute(graphInfo.graph,
                                                                        inputs,
                                                                        graphInfo.numInputTensors,
                                                                        outputs,
                                                                        graphInfo.numOutputTensors,
                                                                        profileHandle,
                                                                        nullptr);
        if (QNN_GRAPH_NO_ERROR != executeStatus) {
          returnStatus = StatusCode::FAILURE;
        }
        if (StatusCode::SUCCESS == returnStatus) {
          QNN_DEBUG("Successfully executed graphIdx: %d", graphIdx);
          // IOTensor utility is used to write output tensors to raw files
          if (iotensor::StatusCode::SUCCESS !=
                ioTensor.writeOutputTensors(graphIdx,
                                            startIdx,
                                            graphInfo.graphName,
                                            outputs,
                                            graphInfo.outputTensors,
                                            graphInfo.numOutputTensors,
                                            outputDataType,
                                            graphsCount,
                                            outputPath)) {
            returnStatus = StatusCode::FAILURE;
          }
        }
      }
      if (StatusCode::SUCCESS != returnStatus) {
        QNN_ERROR("Execution of graph: %d failed!", graphIdx);
        break;
      }
    }
  }

  // Clean up all the tensors after execution is completed
  ioTensor.tearDownInputAndOutputTensors(
      inputs, outputs, graphInfo.numInputTensors, graphInfo.numOutputTensors);
  inputs  = nullptr;
  outputs = nullptr;
  if (StatusCode::SUCCESS != returnStatus) {
    break;
  }
}

IOTensor is a utility provided with the source code at ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/src/Utils/IOTensor.cpp. It exposes a few methods, used in the previous code snippet, that help with the execution of a graph:

  1. setupInputAndOutputTensors to set up structures related to input and output tensors.

  2. populateInputTensors to copy input data into input tensor structures.

  3. writeOutputTensors to save output tensor data to raw files.

  4. tearDownInputAndOutputTensors to clean up resources associated with input and output tensors.

Free context

After all the execution is completed, the context can be freed as shown below:

if (QNN_CONTEXT_NO_ERROR !=
      m_qnnFunctionPointers.qnnInterface.contextFree(context, profileHandle)) {
  QNN_ERROR("Could not free context");
  return StatusCode::FAILURE;
}

Terminate backend

The backend can be terminated as shown below:

if (QNN_BACKEND_NO_ERROR != m_qnnFunctionPointers.qnnInterface.backendFree(backendHandle)) {
  QNN_ERROR("Could not free backend");
  return StatusCode::FAILURE;
}

Building and running qnn-sample-app

Setup

Linux

Building qnn-sample-app has two external dependencies:
  1. clang compiler

  2. ndk-build (for Android targets only)

If the clang compiler is not available in your system PATH, the script ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh provided with the SDK can be used to install and prepare your environment. Alternatively, you could install these dependencies and make them available in your PATH.

Command to automatically install required dependencies:

$ sudo bash ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh

For the second dependency, ndk-build needs to be set up as described in Compiler Toolchains. The setup can be verified with:

$ ${QNN_SDK_ROOT}/bin/envcheck -n

Note: qnn-sample-app has been verified to work with Android NDK version r25c.

GCC Toolchain

For building qnn-sample-app to run on devices with a Yocto-based OS, the GCC compiler is needed. To support Yocto Kirkstone based devices, the SDK libraries are compiled with GCC 11.2. The following section provides steps to acquire the toolchain, taking Yocto Kirkstone as an example.

If the required compiler is not available in your system PATH, please use the steps below to install the dependency and make it available in your PATH.

Please follow the Qualcomm build guide to generate the eSDK that contains the cross-compiler toolchain (qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh) required to build the sample application.

  1. Steps to build the eSDK are available at https://docs.qualcomm.com/bundle/resource/topics/80-70014-2/build_procedures.html

  2. After building the eSDK, qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh will be generated at <WORKSPACE DIR>/build-qcom-wayland/tmp-glibc/deploy/sdk.

  3. Extract the toolchain using ./qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh

Windows

This tutorial assumes that the general setup instructions at Setup have been followed. Please use "Developer PowerShell for VS 2022" in the following steps.

Hexagon

Building libQnnSampleApp.so has one external dependency:
  1. Hexagon SDK

HEXAGON_SDK_ROOT is the path of the Hexagon SDK installation. Refer to HTP and DSP for instructions on setting up the Hexagon SDK.

To setup environment:

$ source ${HEXAGON_SDK_ROOT}/setup_sdk_env.source

Build

Once the setup is complete, qnn-sample-app can be built as follows:

Linux

$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make all_x86 all_android

After executing make from above, you should be able to see two new folders in the same directory:

  1. bin: contains qnn-sample-app binaries for each platform within respective directories.

  2. obj: contains all the object files that were used for building and linking the executable.

To delete all the artifacts that were generated in the above step, run:

$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make clean

Linux (Yocto Based)

For devices running a Yocto-based Linux OS, the GCC compiler must be used to build the sample source code. To support Yocto Kirkstone based devices, the SDK libraries are compiled with GCC 11.2. Build the sample app as follows:

$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/
$ export QNN_AARCH64_LINUX_OE_GCC_112=/path/to/extracted/toolchain
$ make CXX="<installed_toolchain_path>/tmp/sysroots/x86_64/usr/bin/aarch64-qcom-linux/aarch64-qcom-linux-g++ --sysroot=<installed_toolchain_path>/tmp/sysroots/qcm6490" all_linux_oe_aarch64_gcc112

After executing make from above, you should be able to see two new folders in the same directory:

  1. bin: contains qnn-sample-app binaries for each platform within respective directories.

  2. obj: contains all the object files that were used for building and linking the executable.

To delete all the artifacts that were generated in the above step, run:

$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make clean

Windows

Warning

AsyncExecution and MultiCore features are not supported on the Windows platform. Running sample apps with these features might fail at runtime.

$ cd $QNN_SDK_ROOT/examples/QNN/SampleApp/SampleApp
$ mkdir build
$ cd build
$ cmake ../ -A [x64, ARM64]   # choose one architecture
$ cmake --build ./ --config Release

After executing commands from above, you should be able to see $QNN_SDK_ROOT/examples/QNN/SampleApp/SampleApp/build/src/Release/qnn-sample-app.exe

Hexagon

Assuming the desired Hexagon architecture version is v69, build QnnSampleApp for Hexagon using the command below:

$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make hexagon V=v69

After executing make from above, you should be able to see two new folders in the same directory:

  1. bin: contains the libQnnSampleAppv69.so shared library in the hexagon directory.

  2. obj: contains all the object files that were used for building and linking the shared library.

To delete the artifacts that were generated in the above step, run:

$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make clean_hexagon

Run

Linux

The qnn-sample-app executable generated in the build step can be used to execute a model using any QNN backend available for linux-x86_64 and aarch64-android. It is very similar to executing qnn-net-run, except when retrieving a context from a cached binary. To retrieve a cached context, qnn-sample-app additionally needs the QNN System library (libQnnSystem.so) to extract metadata; it can be provided through the --system_library option. libQnnSystem.so can be found in the SDK for a particular target under the lib/<target> folder. Refer to further documentation on qnn-net-run here.
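For the cached-context path specifically, an invocation might look like the sketch below. The context binary and input list names are placeholders, the target triple is assumed to be x86_64-linux-clang, and the command is assembled into a variable and printed so the flags can be checked before running it against a real SDK installation.

```shell
# Placeholder paths -- adjust QNN_SDK_ROOT and the file names for your setup.
QNN_SDK_ROOT=${QNN_SDK_ROOT:-/path/to/qnn-sdk}
TARGET=x86_64-linux-clang

CMD="${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/${TARGET}/qnn-sample-app \
  --backend ${QNN_SDK_ROOT}/lib/${TARGET}/libQnnCpu.so \
  --retrieve_context qnnmodel.serialized.bin \
  --system_library ${QNN_SDK_ROOT}/lib/${TARGET}/libQnnSystem.so \
  --input_list input_list.txt"

# Print the assembled command so it can be reviewed before executing it.
echo "${CMD}"
```

Note that --retrieve_context replaces the --model option used in the shared-library flow, and --system_library is only required on this path.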

For example, let’s consider execution of the shallow model on the CPU backend on a Linux host from Tutorial 1. Replacing qnn-net-run with qnn-sample-app should produce the same results:

$ cd ${QNN_SDK_ROOT}/examples/QNN/converter/models # access input data
$ ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/x86_64-linux-clang/qnn-sample-app \
              --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnCpu.so \
              --model ${QNN_SDK_ROOT}/examples/QNN/example_libs/x86_64-linux-clang/libqnn_model_float.so \
              --input_list ${QNN_SDK_ROOT}/examples/QNN/converter/models/input_list_float.txt \
              --op_packages ${QNN_SDK_ROOT}/examples/QNN/OpPackage/CPU/libs/x86_64-linux-clang/libQnnCpuOpPackageExample.so:QnnOpPackage_interfaceProvider

For more tool help, run:

$ qnn-sample-app --help

Linux (Yocto Based)

The qnn-sample-app executable generated in the build step can be used to execute a model using any QNN backend. To support Yocto Kirkstone based devices, backends are available for aarch64-oe-linux-gcc11.2. To run the executable, follow the same steps as the Linux section above.
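As a sketch, a device-side invocation might look like the following. The DEVICE_DIR location and file names are placeholders for wherever the aarch64-oe-linux-gcc11.2 binaries and model artifacts were pushed, and the command is assembled into a variable and printed so it can be reviewed before running it on the target:

```shell
# Placeholder device-side directory -- adjust to where the artifacts were pushed.
DEVICE_DIR=${DEVICE_DIR:-/data/local/tmp/qnn}

CMD="${DEVICE_DIR}/qnn-sample-app \
  --backend ${DEVICE_DIR}/libQnnCpu.so \
  --model ${DEVICE_DIR}/libqnn_model_float.so \
  --input_list ${DEVICE_DIR}/input_list_float.txt"

# Run the printed command in a shell on the device.
echo "${CMD}"
```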

For more tool help, run:

$ qnn-sample-app --help

Windows

Warning

AsyncExecution and MultiCore features are not supported on the Windows platform. Running sample apps with these features might fail at runtime.

The qnn-sample-app.exe executable generated in the build step can be used to execute a model using any QNN backend available for the windows-x86_64 and aarch64-windows platforms. It is very similar to executing qnn-net-run. Refer to the general QNN documentation available here to see how to run qnn-net-run. Simply replacing qnn-net-run with qnn-sample-app.exe in the tutorials should help.

For example, let’s consider execution of the Inception_v3 model on the CPU backend on a Windows host from Converting and executing a CNN model with QNN. Replacing qnn-net-run with qnn-sample-app.exe should produce the same results:

$ & "<QNN_SDK_ROOT>/bin/envsetup.ps1"
$ cd $QNN_SDK_ROOT/examples/QNN/converter/models
$ qnn-sample-app.exe `
              --backend QnnCpu.dll `
              --model Inception_v3.dll `
              --input_list $QNN_SDK_ROOT/examples/QNN/converter/models/input_list_float.txt

For more tool help, run:

$ qnn-sample-app.exe --help

Hexagon

The libQnnSampleAppv69.so shared library generated in the build step can be used to execute a model using a QNN backend available for the same Hexagon architecture. It is executed using run_main_on_hexagon on the device.

DEVICE_PATH refers to the path on the device to which the required files are pushed.

Push the required files to the device (the commands below are for Android devices only):

$ adb push ${HEXAGON_SDK_ROOT}/libs/run_main_on_hexagon/ship/android_aarch64/run_main_on_hexagon /vendor/bin/run_main_on_hexagon
$ adb push ${HEXAGON_SDK_ROOT}/libs/run_main_on_hexagon/ship/hexagon_toolv87_v69/librun_main_on_hexagon_skel.so /vendor/lib/rfsa/adsp/
$ adb push ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/hexagon/libQnnSampleAppv69.so ${DEVICE_PATH}
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v69/unsigned/libQnnHtpV69.so ${DEVICE_PATH}
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v69/unsigned/libQnnSystem.so ${DEVICE_PATH}
$ adb push qnnmodel.serialized.bin ${DEVICE_PATH}
$ adb push input_list.txt ${DEVICE_PATH}

Run the command below to execute QnnSampleApp on the device:

Note

run_main_on_hexagon requires specifying the DSP domain on which to offload the program. In the Hexagon QnnSampleApp case, the cDSP domain is used, which is expressed by the numeric domain id 3.

$ cd /vendor/bin
$ ./run_main_on_hexagon 3 ${DEVICE_PATH}/libQnnSampleAppv69.so \
               --backend ${DEVICE_PATH}/libQnnHtpV69.so \
               --system_library ${DEVICE_PATH}/libQnnSystem.so \
               --retrieve_context ${DEVICE_PATH}/qnnmodel.serialized.bin \
               --input_list ${DEVICE_PATH}/input_list.txt