Java Interface¶
Classes¶
-
class QnnDelegate : public Delegate, AutoCloseable¶
-
long getNativeHandle()¶
Get handle to the delegate.
-
void close()¶
Overrides AutoCloseable.close().
Calling close() frees the delegate's C-runtime (TFLite) resources.
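Because the class is AutoCloseable, a try-with-resources block is one way to make sure close() always runs. The sketch below uses a hypothetical StubDelegate stand-in (not part of the QNN API) so that it compiles without the delegate library; only the close-on-exit pattern is the point.

```java
// Hypothetical stand-in for QnnDelegate: only the AutoCloseable behaviour is modelled.
class StubDelegate implements AutoCloseable {
    private boolean closed = false;

    long getNativeHandle() {
        if (closed) {
            throw new IllegalStateException("delegate already closed");
        }
        return 1L; // placeholder value, not a real native handle
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true; // the real close() would free native resources here
    }
}

class CloseDemo {
    // Uses the delegate inside try-with-resources and returns it for inspection.
    static StubDelegate useAndClose() {
        StubDelegate delegate = new StubDelegate();
        try (StubDelegate d = delegate) {
            d.getNativeHandle(); // work with the delegate here
        } // close() runs at this point, even if the body throws
        return delegate;
    }

    public static void main(String[] args) {
        System.out.println("closed after block: " + useAndClose().isClosed());
    }
}
```

With the real class, the same shape would presumably apply with `new QnnDelegate(options)` in the resource position.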
- public static int[] getVersion()
Get QNN Delegate API version.
-
void performanceVote()¶
Manually vote for performance. See the HtpPerfCtrlStrategy option.
-
void performanceRelease()¶
Manually release the performance vote. See the HtpPerfCtrlStrategy option.
-
boolean isAvailable()¶
Check if the QnnDelegate is available on the platform.
To check backend capability only, see checkCapability().
-
static boolean checkCapability(Capability cap)¶
Check if the input capability is supported by the platform.
Note that it can be called before a QnnDelegate object is constructed.
-
enum class Capability¶
Lists the supported runtime types of the current platform.
-
enumerator HTP_RUNTIME_QUANTIZED = 0¶
HTP hardware accelerator with quantized parameters.
-
enumerator HTP_RUNTIME_FP16 = 1¶
HTP hardware accelerator with both quantized and half-precision floating-point parameters.
-
enumerator GPU_RUNTIME = 2¶
GPU hardware accelerator.
-
enumerator DSP_RUNTIME = 3¶
DSP hardware accelerator.
-
enumerator INVALID = 99¶
Invalid option.
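Since checkCapability() is static, it can be used to pick a runtime before any delegate exists. The following is a minimal sketch of a preference-ordered probe. The local Capability enum and the firstSupported helper are assumptions made so the example compiles on its own; the local enum does not reproduce the numeric values listed above, and in an application the predicate would presumably be a call to QnnDelegate.checkCapability.

```java
import java.util.function.Predicate;

// Local copy of the documented Capability names so the sketch compiles alone.
enum Capability { HTP_RUNTIME_QUANTIZED, HTP_RUNTIME_FP16, GPU_RUNTIME, DSP_RUNTIME, INVALID }

class CapabilityProbe {
    // Returns the first capability, in preference order, that the check accepts.
    // Falls back to INVALID when none is accepted.
    static Capability firstSupported(Predicate<Capability> check, Capability... preferred) {
        for (Capability cap : preferred) {
            if (check.test(cap)) {
                return cap;
            }
        }
        return Capability.INVALID;
    }

    public static void main(String[] args) {
        // Hypothetical device that only reports GPU support.
        Predicate<Capability> fakeCheck = cap -> cap == Capability.GPU_RUNTIME;
        Capability chosen = firstSupported(fakeCheck,
                Capability.HTP_RUNTIME_FP16,
                Capability.HTP_RUNTIME_QUANTIZED,
                Capability.GPU_RUNTIME);
        System.out.println("chosen runtime: " + chosen);
    }
}
```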
-
QnnDelegate(Options options)¶
QnnDelegate constructor.
Creates a QnnDelegate object from the given options.
- Throws
UnsupportedOperationException – if QnnDelegate is not available.
-
class Options¶
final static class
-
Options()¶
-
enum class BackendType¶
The QNN backend used to delegate the model’s nodes. Each backend has its own set of supported ops and tensor types.
-
enumerator UNDEFINED_BACKEND = 0¶
-
enumerator GPU_BACKEND = 1¶
Backend for Adreno™ GPU hardware accelerator.
-
enumerator HTP_BACKEND = 2¶
Backend for Hexagon HTP hardware accelerator.
-
enumerator DSP_BACKEND = 3¶
Backend for Hexagon DSP hardware accelerator.
-
enum class LogLevel¶
Logging level of the delegate and QNN backend.
-
enumerator LOG_OFF = 0¶
Disable delegate and QNN backend logging messages.
-
enumerator LOG_LEVEL_ERROR = 1¶
-
enumerator LOG_LEVEL_WARN = 2¶
-
enumerator LOG_LEVEL_INFO = 3¶
-
enumerator LOG_LEVEL_VERBOSE = 4¶
-
enumerator LOG_LEVEL_DEBUG = 5¶
-
enum class ProfilingOptions¶
Options to profile the QNN Delegate execution.
-
enumerator PROFILING_OFF = 0¶
-
enumerator BASIC_PROFILING = 1¶
-
enumerator DETAILED_PROFILING = 2¶
-
enumerator LINTING_PROFILING = 3¶
-
enum class GpuPrecision¶
Defines the optimization levels of the graph tensors that are neither input nor output tensors. This enum controls the trade-off between performance and accuracy.
-
enumerator GPU_PRECISION_USER_PROVIDED = 0¶
-
enumerator GPU_PRECISION_FP32 = 1¶
-
enumerator GPU_PRECISION_FP16 = 2¶
-
enumerator GPU_PRECISION_HYBRID = 3¶
-
enum class GpuPerformanceMode¶
Defines performance modes available for the GPU backend.
-
enumerator GPU_PERFORMANCE_DEFAULT = 0¶
-
enumerator GPU_PERFORMANCE_HIGH = 1¶
-
enumerator GPU_PERFORMANCE_NORMAL = 2¶
-
enumerator GPU_PERFORMANCE_LOW = 3¶
-
enum class HtpPerformanceMode¶
Defines performance modes available for HTP backend.
-
enumerator PERFORMANCE_DEFAULT = 0¶
-
enumerator PERFORMANCE_SUSTAINED_HIGH_PERFORMANCE = 1¶
-
enumerator PERFORMANCE_BURST = 2¶
-
enumerator PERFORMANCE_HIGH_PERFORMANCE = 3¶
-
enumerator PERFORMANCE_POWER_SAVER = 4¶
-
enumerator PERFORMANCE_LOW_POWER_SAVER = 5¶
-
enumerator PERFORMANCE_HIGH_POWER_SAVER = 6¶
-
enumerator PERFORMANCE_LOW_BALANCE = 7¶
-
enumerator PERFORMANCE_BALANCE = 8¶
-
enum class HtpPrecision¶
Defines the precision modes supported by the HTP backend. HTP_PRECISION_FP16 supports quantized networks, and also floating-point networks on certain SoCs. Default: HTP_PRECISION_FP16
-
enumerator HTP_PRECISION_QUANTIZED = 0¶
-
enumerator HTP_PRECISION_FP16 = 1¶
-
enum class HtpUseConvHmx¶
Using short conv HMX may improve performance. However, convolutions with short depth and/or non-symmetric weights could produce inaccurate results. Default: HTP_CONV_HMX_ON = 1
-
enumerator HTP_CONV_HMX_OFF = 0¶
-
enumerator HTP_CONV_HMX_ON = 1¶
-
enum class HtpUseFoldRelu¶
Folding Relu may improve performance. However, this optimization is correct only when the quantization ranges of the convolutions are equal to, or a subset of, those of the Relu operations.
-
enumerator HTP_FOLD_RELU_OFF = 0¶
-
enumerator HTP_FOLD_RELU_ON = 1¶
-
enum class HtpOptimizationStrategy¶
Defines the optimization strategy used by the HTP backend.
-
enumerator HTP_OPTIMIZE_FOR_INFERENCE = 0¶
-
enumerator HTP_OPTIMIZE_FOR_PREPARE = 1¶
-
enumerator HTP_OPTIMIZE_FOR_INFERENCE_O3 = 2¶
-
enum class DspPerformanceMode¶
Defines performance modes available for DSP backend.
-
enumerator PERFORMANCE_DEFAULT = 0¶
-
enumerator PERFORMANCE_SUSTAINED_HIGH_PERFORMANCE = 1¶
-
enumerator PERFORMANCE_BURST = 2¶
-
enumerator PERFORMANCE_HIGH_PERFORMANCE = 3¶
-
enumerator PERFORMANCE_POWER_SAVER = 4¶
-
enumerator PERFORMANCE_LOW_POWER_SAVER = 5¶
-
enumerator PERFORMANCE_HIGH_POWER_SAVER = 6¶
-
enumerator PERFORMANCE_LOW_BALANCE = 7¶
-
enumerator PERFORMANCE_BALANCE = 8¶
-
enum class HtpPerfCtrlStrategy¶
Defines HTP performance control strategy.
MANUAL: The performance mode is voted when the backend is initialized and released when the backend is destroyed.
Users can control the vote/release of the performance mode through the instance methods performanceVote() and performanceRelease().
Note that this is the default strategy.
For example, users can vote before running inferences and release after all inferences are done:
    QnnDelegate qnnDelegate = new QnnDelegate(htpOptions);
    // initialize other stuff...
    qnnDelegate.performanceVote();
    // invoke inferences...
    qnnDelegate.performanceRelease();
AUTO: QNN Delegate votes before an inference and releases after an idle interval, i.e. when no more inferences are running.
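With the manual strategy, a try/finally block is one way to keep votes and releases balanced when an inference throws. The sketch below is an illustration only: PerfVoter, CountingVoter and runWithVote are hypothetical names introduced here, with PerfVoter standing in for the two QnnDelegate methods so that the example compiles without the delegate library.

```java
// Hypothetical stand-in for the two QnnDelegate methods used in this sketch.
interface PerfVoter {
    void performanceVote();
    void performanceRelease();
}

// Test double that counts outstanding votes instead of touching hardware.
class CountingVoter implements PerfVoter {
    int activeVotes = 0;

    @Override
    public void performanceVote() {
        activeVotes++;
    }

    @Override
    public void performanceRelease() {
        activeVotes--;
    }
}

class PerfVoteDemo {
    // Votes, runs the batch of inferences, and always releases afterwards.
    static void runWithVote(PerfVoter delegate, Runnable inferences) {
        delegate.performanceVote();
        try {
            inferences.run();
        } finally {
            delegate.performanceRelease(); // runs even if an inference throws
        }
    }

    public static void main(String[] args) {
        CountingVoter voter = new CountingVoter();
        runWithVote(voter, () -> System.out.println("votes during batch: " + voter.activeVotes));
        System.out.println("votes after batch: " + voter.activeVotes);
    }
}
```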
-
enumerator HTP_PERF_CTRL_MANUAL = 0¶
-
enumerator HTP_PERF_CTRL_AUTO = 1¶
-
enum class DspPerfCtrlStrategy¶
Defines DSP performance control strategy. Analogous to HtpPerfCtrlStrategy.
-
enumerator DSP_PERF_CTRL_MANUAL = 0¶
-
enumerator DSP_PERF_CTRL_AUTO = 1¶
-
enum class HtpPdSession¶
Defines PD sessions available for the HTP backend.
-
enumerator HTP_PD_SESSION_UNSIGNED = 0¶
-
enumerator HTP_PD_SESSION_SIGNED = 1¶
-
enum class DspPdSession¶
Defines PD sessions available for the DSP backend.
-
enumerator DSP_PD_SESSION_UNSIGNED = 0¶
-
enumerator DSP_PD_SESSION_SIGNED = 1¶
-
enumerator DSP_PD_SESSION_ADAPTIVE = 2¶
-
class OpPackageInfo¶
static final
Class containing the information needed to register and use an op package with QNN.
-
OpPackageInfo(String name, String path, String provider, String target, HashMap<String, String> opsMap)¶
OpPackageInfo constructor.
- Parameters
String name – The name of the op package
String path – The path on disk to the op package library
String provider – The name of a function in the op package library which satisfies the QnnOpPackage_InterfaceProvider_t interface
String target – The target which this op package library was compiled for
HashMap<String, String> opsMap – Map of TFLite custom operator name to op type defined within an op package
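The parameters above can be assembled as in the sketch below. OpPackageInfoSketch is a local stand-in with the same constructor shape, defined only so the example compiles without the delegate library. Every string value (package name, library path, provider function, target, op names) is a hypothetical placeholder, not a value taken from the QNN SDK.

```java
import java.util.HashMap;

// Local stand-in with the same constructor shape as the documented OpPackageInfo.
class OpPackageInfoSketch {
    final String name;
    final String path;
    final String provider;
    final String target;
    final HashMap<String, String> opsMap;

    OpPackageInfoSketch(String name, String path, String provider, String target,
                        HashMap<String, String> opsMap) {
        this.name = name;
        this.path = path;
        this.provider = provider;
        this.target = target;
        this.opsMap = opsMap;
    }
}

class OpPackageDemo {
    static OpPackageInfoSketch build() {
        // Key: TFLite custom operator name. Value: op type defined in the op package.
        HashMap<String, String> opsMap = new HashMap<>();
        opsMap.put("MyCustomOp", "ExampleOpType");

        return new OpPackageInfoSketch(
                "example.op.package",                 // placeholder package name
                "/path/to/libExampleOpPackage.so",    // placeholder library path
                "ExampleInterfaceProvider",           // placeholder provider function name
                "placeholder-target",                 // placeholder target string
                opsMap);
    }

    public static void main(String[] args) {
        OpPackageInfoSketch info = build();
        System.out.println(info.name + " maps " + info.opsMap);
    }
}
```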
-
int getBackendType()¶
Getter function to retrieve the integer that represents the BackendType.
-
String getLibraryPath()¶
Getter function to retrieve library path.
-
String getSkelLibraryDir()¶
Getter function to retrieve the Skel library directory.
-
int getGpuPrecision()¶
Getter function to retrieve the integer that represents the GpuPrecision.
-
int getGpuPerformanceMode()¶
Getter function to retrieve the integer that represents the GpuPerformanceMode.
-
int getHtpPerformanceMode()¶
Getter function to retrieve the integer that represents the HtpPerformanceMode.
-
int getHtpPrecision()¶
Getter function to retrieve the integer that represents the HtpPrecision.
-
int getHtpUseConvHmx()¶
Getter function to retrieve the integer that represents the HtpUseConvHmx.
-
int getHtpUseFoldRelu()¶
Getter function to retrieve the integer that represents the HtpUseFoldRelu.
-
int getHtpPdSession()¶
Getter function to retrieve the integer that represents the HtpPdSession.
-
int getHtpOptimizationStrategy()¶
Getter function to retrieve the integer that represents the HtpOptimizationStrategy.
-
int getHtpPerfCtrlStrategy()¶
Getter function to retrieve the integer that represents the HtpPerfCtrlStrategy.
-
int getDspPerformanceMode()¶
Getter function to retrieve the integer that represents the DspPerformanceMode.
-
int getDspPdSession()¶
Getter function to retrieve the integer that represents the DspPdSession.
-
int getDspPerfCtrlStrategy()¶
Getter function to retrieve the integer that represents the DspPerfCtrlStrategy.
-
int getLogLevel()¶
Getter function to retrieve the integer that represents the LogLevel.
-
int getProfiling()¶
Getter function to retrieve the integer that represents the ProfilingOptions.
-
String getCacheDir()¶
Getter function to retrieve the cache directory.
-
String getModelToken()¶
Getter function to retrieve the model token.
-
int getOpPackageInfosLength()¶
Getter function to retrieve the number of OpPackageInfo.
- public OpPackageInfo[] getOpPackageInfos()
Getter function to retrieve the OpPackageInfo array.
- public int[] getSkipNodeIds()
Getter function to retrieve the array of node ids to skip.
- public int[] getSkipOps()
Getter function to retrieve the array of op ids to skip.
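Several of the getters above return the integer behind an enum rather than the enum constant itself. A minimal sketch of mapping such an integer back to a named constant follows. HtpPerformanceModeSketch is a local copy of the HtpPerformanceMode values listed earlier, and fromValue is a hypothetical helper; neither is part of the documented API.

```java
// Local copy of the documented HtpPerformanceMode values, for illustration only.
enum HtpPerformanceModeSketch {
    PERFORMANCE_DEFAULT(0),
    PERFORMANCE_SUSTAINED_HIGH_PERFORMANCE(1),
    PERFORMANCE_BURST(2),
    PERFORMANCE_HIGH_PERFORMANCE(3),
    PERFORMANCE_POWER_SAVER(4),
    PERFORMANCE_LOW_POWER_SAVER(5),
    PERFORMANCE_HIGH_POWER_SAVER(6),
    PERFORMANCE_LOW_BALANCE(7),
    PERFORMANCE_BALANCE(8);

    final int value;

    HtpPerformanceModeSketch(int value) {
        this.value = value;
    }

    // Hypothetical helper: finds the constant whose integer value matches.
    static HtpPerformanceModeSketch fromValue(int value) {
        for (HtpPerformanceModeSketch mode : values()) {
            if (mode.value == value) {
                return mode;
            }
        }
        throw new IllegalArgumentException("unknown performance mode value: " + value);
    }
}

class EnumValueDemo {
    public static void main(String[] args) {
        int stored = 2; // stands in for the int a getter such as getHtpPerformanceMode() returns
        System.out.println(HtpPerformanceModeSketch.fromValue(stored));
    }
}
```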
-
void setBackendType(BackendType input)¶
The QNN backend library to open and execute the graph with. This is a required argument; an error is raised if UNDEFINED_BACKEND is supplied.
-
void setLibraryPath(String input)¶
Optional parameter to override the QNN backend library.
-
void setSkelLibraryDir(String input)¶
Optional parameter specifying the directory of the QNN Skel library. Only useful for backends which have a Skel library.
-
void setGpuOptions(GpuPrecision input)¶
Optional backend-specific options for the GPU backend. Only used when GPU_BACKEND is selected; otherwise ignored.
Warning
setGpuOptions() is deprecated and will be removed in a future release. Use the individual setters below for each option instead.
-
void setGpuPerformanceMode(GpuPerformanceMode mode)¶
Set GPU Performance mode.
-
void setGpuPrecision(GpuPrecision precision)¶
Set GPU Precision.
-
void setHtpOptions(HtpPerformanceMode mode, HtpPrecision precision, HtpPdSession pdSession, HtpOptimizationStrategy optimizationStrategy)¶
Optional backend-specific options for the HTP backend. Only used when HTP_BACKEND is selected; otherwise ignored.
Warning
setHtpOptions() is deprecated and will be removed in a future release. Use the individual setters below for each option instead.
-
void setHtpPerformanceMode(HtpPerformanceMode mode)¶
Set HTP Performance mode.
-
boolean enableHtpQuickResponse()¶
This option improves HTP response time when the performance mode is burst (PERFORMANCE_BURST). However, it consumes more power than the usual burst mode. HTP quick response mode is enabled if at least one delegate is in burst mode and enableHtpQuickResponse() has been called. The return value indicates whether HTP quick response was enabled successfully.
-
void setHtpPrecision(HtpPrecision precision)¶
Set HTP Precision.
-
void setHtpPdSession(HtpPdSession pdSession)¶
Set HTP PdSession.
-
void setHtpOptimizationStrategy(HtpOptimizationStrategy optimizationStrategy)¶
Set HTP Optimization Strategy.
-
void setHtpPerfCtrlStrategy(HtpPerfCtrlStrategy perfCtrl)¶
Set HTP Performance control strategy.
-
void setDspOptions(DspPerformanceMode mode, DspPdSession pdSession)¶
Optional backend-specific options for the DSP backend. Only used when DSP_BACKEND is selected; otherwise ignored.
Warning
setDspOptions() is deprecated and will be removed in a future release. Use the individual setters below for each option instead.
-
void setDspPerformanceMode(DspPerformanceMode mode)¶
Set DSP Performance mode.
-
void setDspPdSession(DspPdSession pdSession)¶
Set DSP PdSession.
-
void setDspPerfCtrlStrategy(DspPerfCtrlStrategy perfCtrl)¶
Set DSP Performance control strategy.
-
void setProfiling(ProfilingOptions input)¶
Option to enable profiling with the delegate. Default is off.
- public void setOpPackageOptions(OpPackageInfo[] info)
Optional structure to specify custom op packages for the backend.
-
void setCacheDir(String input)¶
Specifies the directory of a compiled model. Signals intent to either:
Save the model if the file doesn’t exist, or
Restore the model from the file.
This is a model-cache-specific option. It is only used when setModelToken is also set; otherwise it is ignored.
-
void setModelToken(String input)¶
The unique nul-terminated token string that acts as a ‘namespace’ for all serialization entries. Should be unique to a particular model (graph & constants).
This is a model-cache-specific option. It is only used when setCacheDir is also set; otherwise it is ignored.
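The token is expected to be unique to a particular model. One possible way to get such a token, shown here purely as an assumption and not as something the delegate requires, is to hash the model file's bytes and use the hex digest. ModelTokenSketch and tokenFor are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ModelTokenSketch {
    // Returns the lowercase hex SHA-256 digest of the given bytes.
    static String tokenFor(byte[] modelBytes) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] hash = sha256.digest(modelBytes);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a standard algorithm name; this branch is not expected to run.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bytes; an application would read the .tflite file instead.
        byte[] fakeModel = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println("model token: " + tokenFor(fakeModel));
    }
}
```

The resulting string could then be passed to setModelToken together with a directory passed to setCacheDir, so that a changed model yields a different token.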
- public void setSkipNodeIds(int[] ids)
Set nodes not to be delegated by their node ids. Node ids can be obtained from the nodes’ location information in .tflite files.
- public void setSkipOps(int[] ops)
Set ops not to be delegated by their built-in id(s). Please refer to tensorflow/lite/builtin_ops.h for the op built-in ids. Note that ALL ops of each specified type are skipped. For example, if you skip SquaredDifference in your model, none of the SquaredDifference ops in the model will be delegated.
- public byte[] getProfilingResult()
Get the profiling result after a profiling level has been set. Note that ownership of the returned byte array is transferred to the user. If this function is called consecutively, the second and later calls return an empty result.
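Since consecutive calls return an empty result, it makes sense to call the method once and keep the bytes. The sketch below writes a non-empty result to a file. ProfilingDumpSketch and saveOnce are hypothetical names, and the Supplier stands in for a call to getProfilingResult() so the example runs without the delegate library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

class ProfilingDumpSketch {
    // Fetches the profiling bytes once and writes them to `out` if non-empty.
    // Returns true when a file was written, false when the result was empty.
    static boolean saveOnce(Supplier<byte[]> profilingSource, Path out) throws IOException {
        byte[] result = profilingSource.get(); // single call; keep the returned array
        if (result == null || result.length == 0) {
            return false; // nothing to save, e.g. the result was already consumed
        }
        Files.write(out, result);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("qnn_profile", ".bin");
        byte[] fakeResult = {1, 2, 3}; // placeholder bytes, not a real profiling log
        System.out.println("saved: " + saveOnce(() -> fakeResult, out));
    }
}
```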