External Delegate Options

Since the External Delegate Interface dynamically loads the Qualcomm® AI Engine Direct Delegate, it has no static knowledge of the Qualcomm® AI Engine Direct Delegate options. Instead, the External Delegate passes key-value pairs of strings down to the Qualcomm® AI Engine Direct Delegate, which parses them as options. As a result, an application using the External Delegate Interface needs prior knowledge of the accepted option key and value strings. The following options are available in the Qualcomm® AI Engine Direct Delegate.
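Because every option travels as a plain string pair, the option set can be assembled programmatically. The sketch below builds the semicolon-separated `key:value` string that the TensorFlow Lite benchmark_model tool accepts via its `--external_delegate_options` flag (that flag and its separator convention are assumptions about the benchmark tool, not part of the delegate itself):

```python
def format_external_delegate_options(options):
    """Serialize option key-value pairs into the semicolon-separated
    "key:value" form used by benchmark_model's
    --external_delegate_options flag (assumed convention)."""
    # Keys and values are all plain strings; the delegate parses them.
    return ";".join(f"{key}:{value}" for key, value in options.items())

# Pick the HTP backend with Burst performance and error-level logging.
flag_value = format_external_delegate_options({
    "backend_type": "htp",
    "htp_performance_mode": "2",
    "log_level": "1",
})
print(flag_value)  # backend_type:htp;htp_performance_mode:2;log_level:1
```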

Each option below is listed with its accepted values, its default value, whether it is mandatory, and a description.

backend_type
  Values: gpu, htp, dsp
  Default: N/A
  Mandatory: Yes
  The backend Qualcomm® AI Engine Direct library to open and execute the graph with. See Acceleration Support for the supported backends.

gpu_precision
  Values: 0, 1, 2, 3
  Default: 2 (Float16, for best performance)
  Mandatory: No
  Precision setting for the GPU backend; controls the precision of graph tensors that are neither input nor output tensors.
  • 0 = Obey the precisions specified in the TFLite graph
  • 1 = Float32
  • 2 = Float16
  • 3 = Hybrid

gpu_performance_mode
  Values: 0, 1, 2, 3
  Default: 0 = Default
  Mandatory: No
  Flag to select the performance mode of the GPU backend.
  • 0 = Default
  • 1 = High
  • 2 = Normal
  • 3 = Low

dsp_performance_mode
  Values: 0, 1, 2, 3, 4, 5, 6, 7, 8
  Default: 0 = Default
  Mandatory: No
  When a performance mode is set, the delegate votes for the given performance level during inference and returns to a relaxed vote after the inference has completed.
  • 0 = Default
  • 1 = Sustained High Performance
  • 2 = Burst
  • 3 = High Performance
  • 4 = Power Saver
  • 5 = Low Power Saver
  • 6 = High Power Saver
  • 7 = Low Balance
  • 8 = Balance

dsp_pd_session
  Values: unsigned, signed, adaptive
  Default: unsigned
  Mandatory: No
  Selects the PD session of the DSP backend: unsigned for an unsigned PD, signed for a signed PD. If you are unfamiliar with PDs, adaptive is a good option that automatically chooses between the unsigned and signed PD.

dsp_perf_ctrl_strategy
  Values: 0, 1
  Default: 0 = Manual
  Mandatory: No
  Flag to select the DSP performance control strategy. Manual votes for performance when the backend is initialized and releases the vote when the backend is destroyed.
  • 0 = Manual
  • 1 = Auto

dsp_encoding_mode
  Values: 0, 1
  Default: 0 = Static
  Mandatory: No
  Flag to select the encoding mode; supported only by the DSP backend. Dynamic encoding is more precise but sacrifices some performance.
  • 0 = Static
  • 1 = Dynamic

htp_performance_mode
  Values: 0, 1, 2, 3, 4, 5, 6, 7, 8
  Default: 0 = Default
  Mandatory: No
  When a performance mode is set, the delegate votes for the given performance level during inference and returns to a relaxed vote after the inference has completed.
  • 0 = Default
  • 1 = Sustained High Performance
  • 2 = Burst
  • 3 = High Performance
  • 4 = Power Saver
  • 5 = Low Power Saver
  • 6 = High Power Saver
  • 7 = Low Balance
  • 8 = Balance

htp_pd_session
  Values: unsigned, signed
  Default: unsigned
  Mandatory: No
  Selects the PD session of the HTP backend: unsigned for an unsigned PD, signed for a signed PD.

htp_optimization_strategy
  Values: 0, 1, 2
  Default: 0 = Optimize for Inference
  Mandatory: No
  Flag to select the optimization strategy used by the HTP backend. The default strategy optimizes the graph for inference.
  • 0 = Optimize for Inference
  • 1 = Optimize for Prepare
  • 2 = Optimize for InferenceO3

htp_perf_ctrl_strategy
  Values: 0, 1
  Default: 0 = Manual
  Mandatory: No
  Flag to select the HTP performance control strategy. Manual votes for performance when the backend is initialized and releases the vote when the backend is destroyed.
  • 0 = Manual
  • 1 = Auto

htp_use_conv_hmx
  Values: 0, 1
  Default: 1 = Enable HMX for short-depth conv2d
  Mandatory: No
  Flag to enable HMX for short-depth conv2d. See the C interface or qnn_delegate.h for details.
  • 0 = Do not enable HMX for short-depth conv2d
  • 1 = Enable HMX for short-depth conv2d

htp_use_fold_relu
  Values: 0, 1
  Default: 0 = Do not fuse ReLU into conv2d
  Mandatory: No
  Flag to enable fusing ReLU into conv2d. See the C interface or qnn_delegate.h for details.
  • 0 = Do not fuse ReLU into conv2d
  • 1 = Fuse ReLU into conv2d

htp_device_id
  Values: 0 or a positive integer
  Default: 0
  Mandatory: No
  If the SoC has more than one HTP device, selects the device with the given device ID.

library_path
  Values: <path to backend library file>
  Default: the default library associated with the chosen Qualcomm® AI Engine Direct backend
  Mandatory: No
  Optional parameter to override the Qualcomm® AI Engine Direct backend library.

log_level
  Values: 0, 1, 2, 3, 4, 5
  Default: 0 = Off
  Mandatory: No
  Logging level of the delegate and the backend, between 0 and 5; higher is more verbose.
  • 0 = Off
  • 1 = Error
  • 2 = Warn
  • 3 = Info
  • 4 = Debug
  • 5 = Verbose

cache_dir
  Values: <full path to a directory>
  Default: N/A
  Mandatory: No
  Specifies the directory for a compiled model and signals the intent to either:
  • save the model if the file doesn’t exist, or
  • restore the model from the file.
  Model-cache-specific option. Only used when model_token is also set; otherwise ignored.

model_token
  Values: <model token name>
  Default: N/A
  Mandatory: No
  The unique null-terminated token string that acts as a ‘namespace’ for all serialization entries. Should be unique to a particular model (graph and constants). For an example of how to generate this from a TFLite model, see StrFingerprint() in lite/delegates/serialization.h. Model-cache-specific option. Only used when cache_dir is also set; otherwise ignored.

profiling
  Values: 0, 1, 2
  Default: 0 = Off
  Mandatory: No
  Profiling level of the delegate execution, applied to the TFLite/QNN profiler object. The QNN profiler is used unless the TensorFlow Lite benchmark_model tool is used. Note that profiling behavior is subject to change in the future.
  • 0 = Profiling Off
  • 1 = Basic Profiling
  • 2 = Per Op Profiling

skel_library_dir
  Values: <path to directory>
  Default: N/A
  Mandatory: No
  Optional parameter giving the directory of the Qualcomm® AI Engine Direct Skel library. Only useful for backends that have a Skel library.
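Because cache_dir and model_token only take effect when both are set, a small guard can surface that misconfiguration early. A minimal sketch (the helper name and its defaults are illustrative, not part of the delegate API):

```python
def qnn_htp_options(performance_mode="0", log_level="0",
                    cache_dir=None, model_token=None):
    """Assemble an option dict for the HTP backend.

    Per the option descriptions above, cache_dir and model_token are
    ignored unless both are set, so passing exactly one of them is
    rejected here rather than silently dropped by the delegate.
    """
    if (cache_dir is None) != (model_token is None):
        raise ValueError("cache_dir and model_token must be set together")
    options = {
        "backend_type": "htp",
        "htp_performance_mode": performance_mode,  # e.g. "2" = Burst
        "log_level": log_level,
    }
    if cache_dir is not None:
        options["cache_dir"] = cache_dir
        options["model_token"] = model_token
    return options
```

All values stay strings, matching how the External Delegate Interface hands them down to the delegate.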