Function QnnGraph_executeAsync

Function Documentation

Qnn_ErrorHandle_t QnnGraph_executeAsync(Qnn_GraphHandle_t graphHandle, const Qnn_Tensor_t *inputs, uint32_t numInputs, Qnn_Tensor_t *outputs, uint32_t numOutputs, Qnn_ProfileHandle_t profileHandle, Qnn_SignalHandle_t signalHandle, Qnn_NotifyFn_t notifyFn, void *notifyParam)

Asynchronously execute a finalized graph. Graphs are enqueued for execution in FIFO order; there is no guarantee that graphs will finish execution in the same order they were enqueued. If the execution queue is full, this function will block until space is available.

Parameters
  • graphHandle[in] Handle of finalized graph to execute.

  • inputs[in] Array of input tensors with which to populate graph inputs.

  • numInputs[in] Number of input tensors.

  • outputs[out] Array of tensors which the graph will populate with output values.

  • numOutputs[in] Number of output tensors.

  • profileHandle[in] The profile handle on which metrics are populated and can be queried. Use a NULL handle to disable profile collection. A reused handle is reset and repopulated with values from the enqueued execute call. Managing and reusing profile handles across asynchronous calls is the client's responsibility; behavior is undefined if the same profile handle is used by two enqueued execute instances at the same time. This handle must be NULL when a continuous profile handle has been configured via the QNN_GRAPH_CONFIG_OPTION_PROFILE_HANDLE option.

  • signalHandle[in] Signal object which may be used to control the execution of this call. NULL indicates execution should proceed as normal. All pending executions in the queue are affected by signal control; an instance that is already executing when signal control is issued may not be affected. The signal object, if not NULL, is considered in-use for the duration of the call. For timeout signals, the timeout duration applies from the QnnGraph_executeAsync call until the callback is invoked. The same Qnn_GraphHandle_t can be used for multiple calls to QnnGraph_executeAsync; however, a different Qnn_SignalHandle_t must be supplied for each.

  • notifyFn[in] Pointer to the notification function, called when execution finishes. NULL indicates no notification is requested. notifyFn will be called in the context of a backend-owned thread, with priority equal to or lower than that of the client's calling thread. Note that a failed call to QnnGraph_executeAsync does not invoke the notification function.

  • notifyParam[in] Client-supplied data object which will be passed back via notifyFn and can be used to identify the asynchronous execution instance. Can be NULL.
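As a sketch of the calling pattern described by the parameters above, the following self-contained C fragment stubs out the SDK: the typedefs, the stub error value, and fakeExecuteAsync are illustrative stand-ins, not the real QNN definitions, and the callback shape shown (client cookie plus a status) is an assumption rather than the SDK's exact Qnn_NotifyFn_t signature. The point is the notifyParam cookie, which travels through the call and comes back in the callback to identify which enqueued execution finished.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in types; the real definitions come from the QNN SDK headers. */
typedef uint64_t Qnn_ErrorHandle_t;
#define QNN_SUCCESS_STUB 0u

typedef struct { uint32_t id; void *clientBuf; } Qnn_Tensor_t;

/* Assumed callback shape: the client cookie plus a completion status. */
typedef void (*Qnn_NotifyFn_t)(void *notifyParam, Qnn_ErrorHandle_t status);

/* Client context used to identify the asynchronous execution instance. */
typedef struct {
    int frameIndex;            /* which frame this enqueue corresponds to */
    int done;                  /* set by the callback; real code needs synchronization */
    Qnn_ErrorHandle_t status;  /* completion status reported by the callback */
} ExecContext;

static void onExecDone(void *notifyParam, Qnn_ErrorHandle_t status) {
    ExecContext *ctx = (ExecContext *)notifyParam;
    ctx->status = status;
    ctx->done = 1;
}

/* Fake enqueue: a real backend returns once the work is queued and invokes
 * notifyFn later from a backend-owned thread; here it fires synchronously. */
static Qnn_ErrorHandle_t fakeExecuteAsync(const Qnn_Tensor_t *inputs, uint32_t numInputs,
                                          Qnn_Tensor_t *outputs, uint32_t numOutputs,
                                          Qnn_NotifyFn_t notifyFn, void *notifyParam) {
    (void)inputs; (void)numInputs; (void)outputs; (void)numOutputs;
    if (notifyFn != NULL)
        notifyFn(notifyParam, QNN_SUCCESS_STUB);
    return QNN_SUCCESS_STUB;
}
```

A client would supply a distinct ExecContext per enqueue, check the return code (a failed call never invokes notifyFn), and only read the output tensors after the callback has signaled completion.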

Returns

Error code:

  • QNN_SUCCESS: the graph was successfully executed

  • QNN_GRAPH_ERROR_INVALID_HANDLE: graph is not a valid handle

  • QNN_GRAPH_ERROR_GRAPH_NOT_FINALIZED: graph was not finalized

  • QNN_GRAPH_ERROR_SUBGRAPH: cannot execute a subgraph

  • QNN_GRAPH_ERROR_INVALID_ARGUMENT:

    • inputs or outputs is NULL or ill-formed OR

    • inputs is NOT NULL and numInputs is 0 OR

    • outputs is NOT NULL and numOutputs is 0 OR

    • profile handle is invalid OR

    • continuous graph profiling is enabled and the per-API handle is not NULL.

  • QNN_GRAPH_ERROR_INVALID_TENSOR: one or more tensors in inputs or outputs is invalid or not recognized by graph

  • QNN_GRAPH_ERROR_UNSUPPORTED_FEATURE: asynchronous graph execution is not supported on this backend or some API feature is not supported yet, e.g. signal, profile, or batch multiplier

  • QNN_GRAPH_ERROR_SIGNAL_IN_USE: the supplied control signal is already in-use by another call.

  • QNN_GRAPH_ERROR_ABORTED: the call is aborted before completion due to user cancellation

  • QNN_GRAPH_ERROR_TIMED_OUT: the call is aborted before completion due to a timeout

  • QNN_GRAPH_ERROR_DISABLED: the graph was not enabled when the context was deserialized

  • QNN_GRAPH_ERROR_DYNAMIC_TENSOR_SHAPE: An error occurred that is related to dynamic tensor shape. For example, a tensor maximum dimension was exceeded.

  • QNN_GRAPH_ERROR_TENSOR_SPARSITY: An error occurred that is related to tensor sparsity. For example, the maximum number of specified elements was exceeded.

  • QNN_GRAPH_ERROR_EARLY_TERMINATION: Graph execution terminated early due to defined op behavior.

  • QNN_GRAPH_ERROR_INVALID_CONTEXT: Graph execution failed due to context already being freed.

  • QNN_COMMON_ERROR_SYSTEM_COMMUNICATION: SSR occurrence (successful recovery)

  • QNN_COMMON_ERROR_SYSTEM_COMMUNICATION_FATAL: SSR occurrence (unsuccessful recovery)
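The error codes above fall into a few client-side buckets: transient conditions that may be worth retrying (timed out, signal in use), client programming errors (invalid handle, graph not finalized), and fatal conditions such as an unrecoverable subsystem restart. A minimal sketch of such a dispatch, using illustrative stub values in place of the real enum values from the QNN headers:

```c
#include <stdint.h>

typedef uint64_t Qnn_ErrorHandle_t;

/* Illustrative stub values; the real codes are defined in the QNN SDK headers. */
enum {
    STUB_QNN_SUCCESS                          = 0,
    STUB_QNN_GRAPH_ERROR_TIMED_OUT            = 1,
    STUB_QNN_GRAPH_ERROR_SIGNAL_IN_USE        = 2,
    STUB_QNN_GRAPH_ERROR_INVALID_HANDLE       = 3,
    STUB_QNN_GRAPH_ERROR_GRAPH_NOT_FINALIZED  = 4,
    STUB_QNN_COMMON_ERROR_SYSTEM_COMMUNICATION_FATAL = 5
};

typedef enum {
    ACTION_PROCEED,   /* enqueued; wait for the notification */
    ACTION_RETRY,     /* transient: retry, e.g. with a longer timeout or a fresh signal */
    ACTION_FIX_CODE,  /* programming error in the client */
    ACTION_FATAL      /* tear down; e.g. unrecoverable subsystem restart */
} ClientAction;

/* Map a QnnGraph_executeAsync return code onto a client-side action. */
static ClientAction classifyExecuteError(Qnn_ErrorHandle_t err) {
    switch (err) {
    case STUB_QNN_SUCCESS:                         return ACTION_PROCEED;
    case STUB_QNN_GRAPH_ERROR_TIMED_OUT:
    case STUB_QNN_GRAPH_ERROR_SIGNAL_IN_USE:       return ACTION_RETRY;
    case STUB_QNN_GRAPH_ERROR_INVALID_HANDLE:
    case STUB_QNN_GRAPH_ERROR_GRAPH_NOT_FINALIZED: return ACTION_FIX_CODE;
    default:                                       return ACTION_FATAL;
    }
}
```

Which codes are genuinely retryable is a policy choice per application; the grouping here is only one plausible reading of the list above.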

Note

Tensors in inputs and outputs must carry the same ID that was assigned when they were created. Values for all other attributes in Qnn_Tensor_t are assumed from the point at which they were registered with a backend at the time of tensor creation, with the following exceptions:

  • Tensor data provided by client in structs such as clientBuf can be changed between invocations to execute().

  • Batch multiplier: An inputs or outputs tensor dimensions field, if non-null, should match the values provided at tensor creation, with one exception: the batch dimension, as determined by the op definition, can be an integer multiple of the respective dimension provided at tensor creation. All inputs and outputs tensors must have the same batch multiplier.

  • Dynamic output dimensions: An outputs tensor Qnn_TensorV1_t dimensions field, if non-null, can vary after graph execution. As determined by the op definition, non-batch dimensions may be less than the respective dimension at tensor creation.

  • Dynamic dimensions: If an inputs tensor was created with a non-null Qnn_TensorV2_t isDynamicDimensions field, the corresponding dynamic dimensions must be provided by the caller. If an outputs tensor was created with a non-null Qnn_TensorV2_t isDynamicDimensions field, the dimensions must be non-null and the output dimensions will be written by the backend. In a scenario where maximum dimensions will be exceeded, the backend will generate an error code indicating loss of data and will fill the tensor with as much data as possible.

  • Other fields, such as dataType, may also be permitted to change between invocations of QnnGraph_execute()/QnnGraph_executeAsync() for certain ops that perform data type conversions.

  • Some backends may be able to execute a graph with no inputs provided the graph has no application-writable tensors.

  • QnnGraph_executeAsync() can only accept tensors of type QNN_TENSOR_TYPE_APP_READ, QNN_TENSOR_TYPE_APP_WRITE, QNN_TENSOR_TYPE_APP_READWRITE, QNN_TENSOR_TYPE_OPTIONAL_APP_READ, QNN_TENSOR_TYPE_OPTIONAL_APP_WRITE, and QNN_TENSOR_TYPE_OPTIONAL_APP_READWRITE. Tensors provided with a different type will result in QnnGraph_executeAsync() failure.

  • Clients may exclude tensors of type QNN_TENSOR_TYPE_OPTIONAL_APP_READ, QNN_TENSOR_TYPE_OPTIONAL_APP_WRITE, and QNN_TENSOR_TYPE_OPTIONAL_APP_READWRITE from the inputs and outputs arguments. If a QNN_TENSOR_TYPE_OPTIONAL_APP_WRITE tensor is excluded from the inputs argument, the value of that tensor is dictated by backend-defined behavior for that model. QNN_TENSOR_TYPE_OPTIONAL_APP_READ tensors may be excluded from the outputs argument; in this case the backend will not populate the tensor on the QnnGraph_executeAsync() call, and the data of these tensors is null. This is an optional feature. Backends advertise support for this feature with QNN_PROPERTY_TENSOR_SUPPORT_OPTIONAL_APP_WRITE, QNN_PROPERTY_TENSOR_SUPPORT_OPTIONAL_APP_READ, and QNN_PROPERTY_TENSOR_SUPPORT_OPTIONAL_APP_READWRITE.

  • Mixing different tensor versions in the same graph (e.g. Qnn_TensorV1_t and Qnn_TensorV2_t) may result in performance degradation.
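The batch-multiplier rule from the notes above can be expressed as a small check: runtime dimensions must equal the creation-time dimensions everywhere except the batch axis, where a positive integer multiple is allowed. This helper is not part of the QNN API; it is a sketch of the rule, and which axis is the batch axis is determined by the op definition.

```c
#include <stdint.h>

/* Returns 1 if `runtime` dims are a valid batch-multiplied variant of the
 * `created` dims, 0 otherwise. Assumes created[batchAxis] is nonzero. */
static int dimsCompatible(const uint32_t *created, const uint32_t *runtime,
                          uint32_t rank, uint32_t batchAxis) {
    /* Batch dim must be a positive integer multiple of the created batch dim. */
    if (runtime[batchAxis] == 0 || runtime[batchAxis] % created[batchAxis] != 0)
        return 0;
    /* Every non-batch dim must match the value provided at tensor creation. */
    for (uint32_t i = 0; i < rank; ++i)
        if (i != batchAxis && runtime[i] != created[i])
            return 0;
    return 1;
}
```

For a tensor created with dims {1, 224, 224, 3} and batch axis 0, runtime dims {4, 224, 224, 3} pass (batch multiplier 4), while {4, 112, 224, 3} fail because a non-batch dimension changed. Per the note above, all inputs and outputs tensors must then use that same multiplier.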

Note

If there are simultaneous calls to QnnGraph_execute() and QnnGraph_executeAsync(), the priority for enqueuing and executing is equal. Both functions add to the same queue; the only difference in behavior is whether the function returns when the execution is enqueued or when the execution finishes.

Note

Use the corresponding API through QnnInterface_t.