Java Interface

Classes

class QnnDelegate : public Delegate, AutoCloseable
long getNativeHandle()

Get handle to the delegate.

void close()

override method

Calling close() frees the TFLite resources held in the C runtime.
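Because the class implements AutoCloseable, a try-with-resources statement is one way to make sure close() runs even when inference code throws. The sketch below shows only that pattern. StandInDelegate is a hypothetical placeholder so the snippet compiles without the SDK; it is not the real QnnDelegate and its handle value is made up.

```java
// Hypothetical stand-in for QnnDelegate so the sketch compiles without the SDK.
final class StandInDelegate implements AutoCloseable {
    private boolean closed = false;

    long getNativeHandle() {
        if (closed) {
            throw new IllegalStateException("delegate already closed");
        }
        return 1L; // placeholder handle value, not a real native pointer
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true; // the real delegate would free its native resources here
    }
}

final class CloseSketch {
    // Uses the delegate inside try-with-resources and reports whether it was closed.
    static boolean closesAfterUse() {
        StandInDelegate delegate = new StandInDelegate();
        try (StandInDelegate d = delegate) {
            d.getNativeHandle(); // use the delegate while it is open
        } // close() runs here, even if the body throws
        return delegate.isClosed();
    }
}
```

With the real class, the same shape would wrap the code that builds the interpreter and runs inference.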

public static int[] getVersion()

Get QNN Delegate API version.

void performanceVote()

Manually vote for performance. See HtpPerfCtrlStrategy option.

void performanceRelease()

Manually release the performance vote. See HtpPerfCtrlStrategy option.

boolean isAvailable()

Check if the QnnDelegate is available on the platform.

To check backend capability only, use checkCapability().

static boolean checkCapability(Capability cap)

Check if the input capability is supported by the platform.

Note that it can be called before a QnnDelegate object is constructed.

enum class Capability

Lists the supported runtime types of the current platform.

enumerator HTP_RUNTIME_QUANTIZED = 0

HTP hardware accelerator with quantized parameters.

enumerator HTP_RUNTIME_FP16 = 1

HTP hardware accelerator with both quantized and half-precision floating-point parameters.

enumerator GPU_RUNTIME = 2

GPU hardware accelerator.

enumerator DSP_RUNTIME = 3

DSP hardware accelerator.

enumerator INVALID = 99

Invalid option.
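One possible use of the capability check is to choose a backend before constructing the delegate. The sketch below is self-contained and makes several assumptions: the two enums are local mirrors of Capability and Options.BackendType, the Predicate parameter stands in for a call to checkCapability(), and the preference order (HTP, then GPU, then DSP) is an illustrative choice rather than a documented recommendation.

```java
import java.util.function.Predicate;

final class BackendPicker {
    // Local mirrors of the documented enums, used only so the sketch compiles alone.
    enum Capability { HTP_RUNTIME_QUANTIZED, HTP_RUNTIME_FP16, GPU_RUNTIME, DSP_RUNTIME, INVALID }

    enum BackendType { UNDEFINED_BACKEND, GPU_BACKEND, HTP_BACKEND, DSP_BACKEND }

    // 'supported' stands in for QnnDelegate.checkCapability(cap).
    // The HTP > GPU > DSP preference order is an assumption of this sketch.
    static BackendType pick(Predicate<Capability> supported) {
        if (supported.test(Capability.HTP_RUNTIME_FP16)
                || supported.test(Capability.HTP_RUNTIME_QUANTIZED)) {
            return BackendType.HTP_BACKEND;
        }
        if (supported.test(Capability.GPU_RUNTIME)) {
            return BackendType.GPU_BACKEND;
        }
        if (supported.test(Capability.DSP_RUNTIME)) {
            return BackendType.DSP_BACKEND;
        }
        return BackendType.UNDEFINED_BACKEND; // nothing usable was reported
    }
}
```

A caller would treat UNDEFINED_BACKEND as "do not create the delegate" and fall back to the default CPU path.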

QnnDelegate(Options options)

QnnDelegate constructor.

Creates a QnnDelegate object from the given options. Throws an exception if QnnDelegate is not available.

Throws

UnsupportedOperationException

class Options

final static class

Options()
enum class BackendType

The QNN backend used to delegate the model’s nodes. Each backend has its own set of supported ops and tensor types.

enumerator UNDEFINED_BACKEND = 0
enumerator GPU_BACKEND = 1

Backend for Adreno™ GPU hardware accelerator.

enumerator HTP_BACKEND = 2

Backend for Hexagon HTP hardware accelerator.

enumerator DSP_BACKEND = 3

Backend for Hexagon DSP hardware accelerator.

enum class LogLevel

Logging level of the delegate and QNN backend.

enumerator LOG_OFF = 0

Disable delegate and QNN backend logging messages.

enumerator LOG_LEVEL_ERROR = 1
enumerator LOG_LEVEL_WARN = 2
enumerator LOG_LEVEL_INFO = 3
enumerator LOG_LEVEL_VERBOSE = 4
enumerator LOG_LEVEL_DEBUG = 5
enum class ProfilingOptions

Options to profile the QNN Delegate execution.

enumerator PROFILING_OFF = 0
enumerator BASIC_PROFILING = 1
enumerator DETAILED_PROFILING = 2
enumerator LINTING_PROFILING = 3
enum class GpuPrecision

Defines the optimization levels of the graph tensors that are neither input nor output tensors. This enum controls the trade-off between performance and accuracy.

enumerator GPU_PRECISION_USER_PROVIDED = 0
enumerator GPU_PRECISION_FP32 = 1
enumerator GPU_PRECISION_FP16 = 2
enumerator GPU_PRECISION_HYBRID = 3
enum class GpuPerformanceMode

Defines performance modes available for GPU backend.

enumerator GPU_PERFORMANCE_DEFAULT = 0
enumerator GPU_PERFORMANCE_HIGH = 1
enumerator GPU_PERFORMANCE_NORMAL = 2
enumerator GPU_PERFORMANCE_LOW = 3
enum class HtpPerformanceMode

Defines performance modes available for HTP backend.

enumerator PERFORMANCE_DEFAULT = 0
enumerator PERFORMANCE_SUSTAINED_HIGH_PERFORMANCE = 1
enumerator PERFORMANCE_BURST = 2
enumerator PERFORMANCE_HIGH_PERFORMANCE = 3
enumerator PERFORMANCE_POWER_SAVER = 4
enumerator PERFORMANCE_LOW_POWER_SAVER = 5
enumerator PERFORMANCE_HIGH_POWER_SAVER = 6
enumerator PERFORMANCE_LOW_BALANCE = 7
enumerator PERFORMANCE_BALANCE = 8
enum class HtpPrecision

Defines the precision modes supported by the HTP backend. HTP_PRECISION_FP16 supports quantized networks, and also floating-point networks on certain SoCs. Default: HTP_PRECISION_FP16

enumerator HTP_PRECISION_QUANTIZED = 0
enumerator HTP_PRECISION_FP16 = 1
enum class HtpUseConvHmx

Using short conv HMX might give better performance. However, convolutions with short depth and/or asymmetric weights could produce inaccurate results. Default: HTP_CONV_HMX_ON (1).

enumerator HTP_CONV_HMX_OFF = 0
enumerator HTP_CONV_HMX_ON = 1
enum class HtpUseFoldRelu

Using fold Relu might give better performance. However, this optimization is correct only when the quantization ranges of the convolutions are equal to, or a subset of, those of the Relu operations.

enumerator HTP_FOLD_RELU_OFF = 0
enumerator HTP_FOLD_RELU_ON = 1
enum class HtpOptimizationStrategy

Defines the optimization strategy used by the HTP backend.

enumerator HTP_OPTIMIZE_FOR_INFERENCE = 0
enumerator HTP_OPTIMIZE_FOR_PREPARE = 1
enumerator HTP_OPTIMIZE_FOR_INFERENCE_O3 = 2
enum class DspPerformanceMode

Defines performance modes available for DSP backend.

enumerator PERFORMANCE_DEFAULT = 0
enumerator PERFORMANCE_SUSTAINED_HIGH_PERFORMANCE = 1
enumerator PERFORMANCE_BURST = 2
enumerator PERFORMANCE_HIGH_PERFORMANCE = 3
enumerator PERFORMANCE_POWER_SAVER = 4
enumerator PERFORMANCE_LOW_POWER_SAVER = 5
enumerator PERFORMANCE_HIGH_POWER_SAVER = 6
enumerator PERFORMANCE_LOW_BALANCE = 7
enumerator PERFORMANCE_BALANCE = 8
enum class HtpPerfCtrlStrategy

Defines HTP performance control strategy.

MANUAL: The performance mode is voted for when the backend is initialized and released when the backend is destroyed.

Users can also control the vote/release of the performance mode with the instance methods performanceVote() and performanceRelease().

Note that this is the default strategy.

For example, users can vote before running inferences and release after all inferences are done.

QnnDelegate qnnDelegate = new QnnDelegate(htpOptions);
// Initialize the interpreter and other resources...
qnnDelegate.performanceVote();
try {
    // Invoke inferences...
} finally {
    // Release the vote even if an inference throws.
    qnnDelegate.performanceRelease();
}

AUTO: QNN Delegate votes before an inference and releases after an idle interval, i.e. when no more inferences arrive.

enumerator HTP_PERF_CTRL_MANUAL = 0
enumerator HTP_PERF_CTRL_AUTO = 1
enum class DspPerfCtrlStrategy

Defines DSP performance control strategy. Similar to HTP cases.

enumerator DSP_PERF_CTRL_MANUAL = 0
enumerator DSP_PERF_CTRL_AUTO = 1
enum class HtpPdSession

Defines PD sessions available for HTP backend.

enumerator HTP_PD_SESSION_UNSIGNED = 0
enumerator HTP_PD_SESSION_SIGNED = 1
enum class DspPdSession

Defines PD sessions available for DSP backend.

enumerator DSP_PD_SESSION_UNSIGNED = 0
enumerator DSP_PD_SESSION_SIGNED = 1
enumerator DSP_PD_SESSION_ADAPTIVE = 2
class OpPackageInfo

static final

Class containing the information needed to register and use an op package with QNN.

OpPackageInfo(String name, String path, String provider, String target, HashMap<String, String> opsMap)

OpPackageInfo constructor.

Parameters
  • String name – The name of the op package

  • String path – The path on disk to the op package library

  • String provider – The name of a function in the op package library which satisfies the QnnOpPackage_InterfaceProvider_t interface

  • String target – The target which this op package library was compiled for

  • HashMap<String, String> opsMap – Map of TFLite custom operator name to op type defined within an op package

int getBackendType()

Getter function to retrieve the integer that represents the BackendType.
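Since this getter (like the other enum getters below) returns a plain integer, it can help to translate the value back to a name when logging. The sketch below hard-codes the BackendType values listed in this reference. The helper class and method names are hypothetical and not part of the delegate API.

```java
final class BackendTypeNames {
    // Maps the integer returned by getBackendType() to the enumerator name.
    // The values follow the BackendType listing in this reference.
    static String nameOf(int backendType) {
        switch (backendType) {
            case 0: return "UNDEFINED_BACKEND";
            case 1: return "GPU_BACKEND";
            case 2: return "HTP_BACKEND";
            case 3: return "DSP_BACKEND";
            default: return "UNKNOWN(" + backendType + ")"; // value not listed here
        }
    }
}
```

For example, a log line could print BackendTypeNames.nameOf(options.getBackendType()) instead of the raw number.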

String getLibraryPath()

Getter function to retrieve library path.

String getSkelLibraryDir()

Getter function to retrieve the Skel library directory.

int getGpuPrecision()

Getter function to retrieve the integer that represents the GpuPrecision.

int getGpuPerformanceMode()

Getter function to retrieve the integer that represents the GpuPerformanceMode.

int getHtpPerformanceMode()

Getter function to retrieve the integer that represents the HtpPerformanceMode.

int getHtpPrecision()

Getter function to retrieve the integer that represents the HtpPrecision.

int getHtpUseConvHmx()

Getter function to retrieve the integer that represents the HtpUseConvHmx.

int getHtpUseFoldRelu()

Getter function to retrieve the integer that represents the HtpUseFoldRelu.

int getHtpPdSession()

Getter function to retrieve the integer that represents the HtpPdSession.

int getHtpOptimizationStrategy()

Getter function to retrieve the integer that represents the HtpOptimizationStrategy.

int getHtpPerfCtrlStrategy()

Getter function to retrieve the integer that represents the HtpPerfCtrlStrategy.

int getDspPerformanceMode()

Getter function to retrieve the integer that represents the DspPerformanceMode.

int getDspPdSession()

Getter function to retrieve the integer that represents the DspPdSession.

int getDspPerfCtrlStrategy()

Getter function to retrieve the integer that represents the DspPerfCtrlStrategy.

int getLogLevel()

Getter function to retrieve the integer that represents the LogLevel.

int getProfiling()

Getter function to retrieve the integer that represents the ProfilingOptions.

String getCacheDir()

Getter function to retrieve the cache directory.

String getModelToken()

Getter function to retrieve the model token.

int getOpPackageInfosLength()

Getter function to retrieve the number of OpPackageInfo.

public OpPackageInfo[] getOpPackageInfos()

Getter function to retrieve the OpPackageInfo array.

public int[] getSkipNodeIds()

Getter function to retrieve the array of node ids to skip.

public int[] getSkipOps()

Getter function to retrieve the array of op ids to skip.

void setBackendType(BackendType input)

The QNN backend library to open and execute the graph with. This argument is required; an error is raised if UNDEFINED_BACKEND is supplied.

void setLibraryPath(String input)

Optional parameter to override the QNN backend library.

void setSkelLibraryDir(String input)

Optional parameter specifying the directory of the QNN Skel library. Only useful for backends which have a Skel library.

void setGpuOptions(GpuPrecision input)

Optional backend-specific options for the GPU backend. Only used when GPU_BACKEND is selected; otherwise ignored.

Warning

setGpuOptions() is deprecated and will be removed in a future release. Use the individual setters below for each option instead.

void setGpuPerformanceMode(GpuPerformanceMode mode)

Set GPU Performance mode.

void setGpuPrecision(GpuPrecision precision)

Set GPU Precision.

void setHtpOptions(HtpPerformanceMode mode, HtpPrecision precision, HtpPdSession pdSession, HtpOptimizationStrategy optimizationStrategy)

Optional backend-specific options for the HTP backend. Only used when HTP_BACKEND is selected; otherwise ignored.

Warning

setHtpOptions() is deprecated and will be removed in a future release. Use the individual setters below for each option instead.

void setHtpPerformanceMode(HtpPerformanceMode mode)

Set HTP Performance mode.

boolean enableHtpQuickResponse()

This option improves the HTP response time when the performance mode is burst. However, it consumes more power than the usual burst mode. HTP quick response mode is enabled if at least one delegate is in burst mode and enableHtpQuickResponse() is called. The return value indicates whether HTP quick response was enabled successfully.

void setHtpPrecision(HtpPrecision precision)

Set HTP Precision.

void setHtpPdSession(HtpPdSession pdSession)

Set HTP PdSession.

void setHtpOptimizationStrategy(HtpOptimizationStrategy optimizationStrategy)

Set HTP Optimization Strategy.

void setHtpPerfCtrlStrategy(HtpPerfCtrlStrategy perfCtrl)

Set HTP Performance control strategy.

void setDspOptions(DspPerformanceMode mode, DspPdSession pdSession)

Optional backend-specific options for the DSP backend. Only used when DSP_BACKEND is selected; otherwise ignored.

Warning

setDspOptions() is deprecated and will be removed in a future release. Use the individual setters below for each option instead.

void setDspPerformanceMode(DspPerformanceMode mode)

Set DSP Performance mode.

void setDspPdSession(DspPdSession pdSession)

Set DSP PdSession.

void setDspPerfCtrlStrategy(DspPerfCtrlStrategy perfCtrl)

Set DSP Performance control strategy.

void setLogLevel(LogLevel input)

Logging level of the delegate and the backend. Default is off.

void setProfiling(ProfilingOptions input)

Option to enable profiling with the delegate. Default is off.

public void setOpPackageOptions(OpPackageInfo[] info)

Optional structure to specify custom op packages for the backend.

void setCacheDir(String input)

Specifies the directory of a compiled model. Signals intent to either:

  • Save the model if the file doesn’t exist, or

  • Restore model from the file.

Model-cache-specific option. Only used when setModelToken() is also called; otherwise ignored.

void setModelToken(String input)

The unique token string that acts as a ‘namespace’ for all serialization entries. It should be unique to a particular model (graph & constants).

Model-cache-specific option. Only used when setCacheDir() is also called; otherwise ignored.
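The token has to change whenever the graph or constants change. One way to get that property, offered here as an assumption rather than something this reference prescribes, is to hash the bytes of the .tflite file. The sketch below uses only the JDK; ModelTokens and tokenFor are hypothetical names.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ModelTokens {
    // Returns the lowercase hex SHA-256 digest of the model bytes.
    // Any change to the graph or its constants yields a different token.
    static String tokenFor(byte[] modelBytes) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(modelBytes);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

The result would then be passed to setModelToken() together with a writable directory passed to setCacheDir(), since each option is ignored without the other.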

public void setSkipNodeIds(int[] ids)

Set nodes not to be delegated by their node ids. Node ids can be obtained from the nodes’ location information in .tflite files.

public void setSkipOps(int[] ops)

Set ops not to be delegated by their built-in id(s). Please refer to tensorflow/lite/builtin_ops.h for the op built-in ids. Note that ALL ops of a specified type are skipped. For example, if you skip SquaredDifference in your model, none of the SquaredDifference ops in the model will be delegated.
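A minimal sketch of preparing the array for this setter follows. The constant 99 is the value of kTfLiteBuiltinSquaredDifference in tensorflow/lite/builtin_ops.h as far as the editor is aware; treat it as an assumption and confirm it against the header that matches your TFLite version. SkipOps and skipList are hypothetical helper names.

```java
import java.util.stream.IntStream;

final class SkipOps {
    // Assumed value of kTfLiteBuiltinSquaredDifference in builtin_ops.h;
    // verify against the header shipped with your TFLite version.
    static final int BUILTIN_SQUARED_DIFFERENCE = 99;

    // Builds the int[] for setSkipOps(), dropping duplicate ids while keeping order.
    static int[] skipList(int... opIds) {
        return IntStream.of(opIds).distinct().toArray();
    }
}
```

The array returned by SkipOps.skipList(SkipOps.BUILTIN_SQUARED_DIFFERENCE) would be passed to setSkipOps().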

public byte[] getProfilingResult()

Get the profiling result after a profiling level has been set. Note that ownership of the returned byte array is transferred to the caller. If this function is called consecutively, an empty result is returned from the second call onward.
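Because only the first call returns data, code that needs the result in more than one place may want to fetch it once and keep its own copy. The sketch below shows that idea with a small memoizing wrapper. ProfilingCache is a hypothetical helper, and the Supplier stands in for a method reference to getProfilingResult() so that the snippet compiles without the SDK.

```java
import java.util.function.Supplier;

final class ProfilingCache {
    private final Supplier<byte[]> source; // stand-in for getProfilingResult()
    private byte[] cached;                 // null until the first fetch

    ProfilingCache(Supplier<byte[]> source) {
        this.source = source;
    }

    // Fetches from the source on the first call only, then serves the saved copy.
    synchronized byte[] get() {
        if (cached == null) {
            byte[] result = source.get();
            cached = (result == null) ? new byte[0] : result;
        }
        return cached.clone(); // defensive copy so callers cannot alter the cache
    }
}
```

The wrapper should be created after the profiled inferences have run, since the first fetch is the one that gets kept.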