Hexagon NPU Runtime Driver (Windows Only)
The Hexagon NPU Runtime Driver (HNRD) is available for Windows-based Snapdragon® X Series platforms. HNRD is designed to be forward and backward compatible with Qualcomm AI Stack SDKs. With HNRD installed on the system, applications can unbundle the SNPE HTP platform-dependent libraries, which makes them portable across older and newer Windows platforms. HNRD is packaged and distributed independently of the SNPE SDK. It currently ships with the device BSP, which OEMs use to pre-install HNRD on their devices.
Switching Between Traditional and HNRD Paths
- Pre-driver: Traditional path only; applications build with the SNPE SDK and bundle the SNPE HTP platform-dependent libraries.
- Post-driver: Traditional and HNRD paths are both available; the application can choose which path to use.
  - Traditional path: applications build with the SNPE SDK and bundle the SNPE HTP platform-dependent libraries with the application (default; same as pre-driver).
  - HNRD path: applications build with the SNPE SDK, but use the platform-dependent libraries installed on the system.
Note: There is no difference in the build step between the traditional and HNRD paths. In addition, bundling and the choice between the traditional and HNRD paths apply to both QNN and SNPE.
In other words, if an application bundles the SNPE HTP platform-dependent libraries (i.e., SnpeHtpV73Stub.dll and SnpeHtpV73Skel.so), it uses the traditional path by default. If the platform-dependent libraries are not bundled, it falls back to the HNRD path.
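The selection rule above can be sketched as a small check that an application might run at startup to predict which path it will end up on. This is only an illustration: `HtpPath`, `fileExists`, and `expectedHtpPath` are hypothetical names, not part of the SNPE API, and the sketch assumes the bundled libraries sit directly in the directory passed in and that checking for the stub DLL alone is sufficient.

```cpp
#include <fstream>
#include <string>

// Hypothetical names for illustration; not part of the SNPE API.
enum class HtpPath { Traditional, Hnrd };

// Returns true if a file can be opened for reading at the given path.
inline bool fileExists(const std::string& path) {
  std::ifstream file(path.c_str(), std::ios::binary);
  return file.good();
}

// Predicts the path an application is expected to take, based only on
// whether the HTP stub library named in the text is bundled in appDir.
// Assumption: the bundled libraries live directly in appDir.
inline HtpPath expectedHtpPath(const std::string& appDir) {
  const std::string stub = appDir + "/SnpeHtpV73Stub.dll";
  return fileExists(stub) ? HtpPath::Traditional : HtpPath::Hnrd;
}
```

A check like this only predicts the outcome; the runtime logs remain the authoritative indication of which path was actually applied.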
The following sample logs illustrate the HNRD path being applied:
0.0ms [WARNING] QnnDsp <W> Traditional path not available. Switching to user driver path
0.0ms [WARNING] QnnDsp <W> HTP user driver is loaded. Switched to user driver path
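If an application captures these log lines, it could confirm that the HNRD path was applied by scanning for the second message. The following is a minimal sketch under that assumption; `usedHnrdPath` is a hypothetical helper, how the log lines are collected is left to the application, and the exact log wording may differ between releases.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns true if any captured log line reports that
// the runtime has switched to the user driver (HNRD) path.
// Assumption: the message text matches the sample logs shown above.
inline bool usedHnrdPath(const std::vector<std::string>& logLines) {
  static const std::string marker = "Switched to user driver path";
  for (const std::string& line : logLines) {
    if (line.find(marker) != std::string::npos) {
      return true;
    }
  }
  return false;
}
```

The sketch matches only the "Switched" message, since the "Switching" message by itself indicates an attempt rather than a completed switch.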
Compatibility Support
The minimum supported QNN and SNPE version is 2.22.2. Applications can be built with older or newer versions of the SNPE SDK and will still work on the device. Depending on the HNRD version installed on the system, newer features may not be supported.
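As an illustration of the minimum-version rule, the sketch below compares a "major.minor.patch" string against 2.22.2. The helper names `parseVersion` and `meetsHnrdMinimum` are hypothetical, and how an application obtains its SDK version string is not covered here.

```cpp
#include <array>
#include <sstream>
#include <string>

// Parses a "major.minor.patch" prefix; any trailing text is ignored.
// Returns false if the string does not start with three dot-separated integers.
inline bool parseVersion(const std::string& text, std::array<int, 3>& out) {
  std::istringstream in(text);
  char dot1 = 0;
  char dot2 = 0;
  if (!(in >> out[0] >> dot1 >> out[1] >> dot2 >> out[2])) {
    return false;
  }
  return dot1 == '.' && dot2 == '.';
}

// Hypothetical check: true if sdkVersion is at least 2.22.2, the minimum
// QNN and SNPE version stated above.
inline bool meetsHnrdMinimum(const std::string& sdkVersion) {
  const std::array<int, 3> minimum = {{2, 22, 2}};
  std::array<int, 3> version = {{0, 0, 0}};
  if (!parseVersion(sdkVersion, version)) {
    return false;
  }
  return version >= minimum;  // std::array compares lexicographically
}
```

Comparing the components numerically avoids the pitfall of plain string comparison, where "2.9.0" would sort after "2.22.2".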
Note
The traditional and HNRD paths can co-exist on the platform, and each application independently selects which path to use.
Model Cache Management
Using Init Caching with HNRD requires special attention to cache management. On the HNRD path, online prepare and cache loading are performed by HNRD because they are platform-dependent. A cache generated by one version of HNRD might not run on an older version of HNRD, or might not use all of the software and hardware capabilities of a newer version. Because the HNRD installed on a device can be upgraded (or downgraded) at any time, the compatibility of saved caches with the installed HNRD must be checked every time before loading.
Snpe_SNPEBuilder_Build() and SNPEBuilder::build() check compatibility automatically before loading caches. Snpe_SNPEBuilder_SetCacheCompatibilityMode() and SNPEBuilder::setCacheCompatibilityMode() control whether sub-optimal caches fail the compatibility check. Snpe_SNPEBuilder_ValidateCache() and SNPEBuilder::validateCache() perform a check similar to Snpe_SNPEBuilder_Build() and SNPEBuilder::build(), except that they do not create SNPE instances.
If a cache fails the compatibility check, performing online prepare to create a new valid cache can solve the problem. To reduce the latency impact of online prepare, continue executing with the original sub-optimal cache while online prepare runs in a background thread, then switch to the new cache once online prepare completes.
Note that online prepare may not succeed. This can happen when a model converted by a newer SDK version uses features that are not supported by the HNRD. In such a case, upgrade the HNRD. The following example shows how to handle cache management.
// Requires <chrono>, <future>, <stdexcept>, and the SNPE C API headers.
Snpe_SNPE_Handle_t snpeHandle = NULL;  // SNPE used to do inference
std::future<Snpe_SNPE_Handle_t>
    futureSnpeHandle;  // SNPE being created in background

Snpe_DlContainer_Handle_t containerHandle =
    Snpe_DlContainer_Open(dlcPath.string().c_str());
Snpe_RuntimeList_Handle_t runtimeListHandle = Snpe_RuntimeList_Create();
Snpe_SNPEBuilder_Handle_t snpeBuilderHandle = Snpe_SNPEBuilder_Create(containerHandle);

// Add DSP runtime.
Snpe_RuntimeList_Add(runtimeListHandle, SNPE_RUNTIME_DSP);
Snpe_SNPEBuilder_SetRuntimeProcessorOrder(snpeBuilderHandle, runtimeListHandle);

// Set compatibility mode to strict to check sub-optimality.
Snpe_SNPEBuilder_SetCacheCompatibilityMode(snpeBuilderHandle,
                                           SNPE_CACHE_COMPATIBILITY_STRICT);

// Create SNPE.
snpeHandle = Snpe_SNPEBuilder_Build(snpeBuilderHandle);

if (snpeHandle == NULL) {
  Snpe_ErrorCode_t error = Snpe_ErrorCode_getLastErrorCode();
  if (error == SNPE_ERRORCODE_DLCACHING_SUBOPTIMAL_CACHE) {
    // The cache is valid but sub-optimal.
    // Continue execution with this cache.
    // Set compatibility mode to permissive to bypass sub-optimality check.
    Snpe_SNPEBuilder_SetCacheCompatibilityMode(snpeBuilderHandle,
                                               SNPE_CACHE_COMPATIBILITY_PERMISSIVE);
    snpeHandle = Snpe_SNPEBuilder_Build(snpeBuilderHandle);

    // Create another SNPE with better performance in background thread.
    // Set compatibility mode to generate new cache.
    Snpe_SNPEBuilder_SetCacheCompatibilityMode(
        snpeBuilderHandle, SNPE_CACHE_COMPATIBILITY_ALWAYS_GENERATE_NEW_CACHE);
    // Enable init cache mode to save the cache later.
    Snpe_SNPEBuilder_SetInitCacheMode(snpeBuilderHandle, 1);
    futureSnpeHandle =
        std::async(std::launch::async, Snpe_SNPEBuilder_Build, snpeBuilderHandle);
  } else if (error == SNPE_ERRORCODE_QNN_CONTEXT_ERROR_CREATE_FROM_BINARY ||
             error == SNPE_ERRORCODE_QNN_COMMON_ERROR_NOT_SUPPORTED) {
    // The cache cannot run.
    // Force generate a new cache.
    Snpe_SNPEBuilder_SetCacheCompatibilityMode(
        snpeBuilderHandle, SNPE_CACHE_COMPATIBILITY_ALWAYS_GENERATE_NEW_CACHE);
    snpeHandle = Snpe_SNPEBuilder_Build(snpeBuilderHandle);

    if (snpeHandle == NULL) {
      // If it still fails, one possible reason could be HNRD is too old for the
      // graph. In such case, prompt the users to upgrade HNRD.
      cleanUp(containerHandle, runtimeListHandle, snpeBuilderHandle);
      message(ERROR, "The HNRD is too old. Please install latest HNRD.");
      // snpeHandle is NULL, so do not continue to inference.
      // Assumes a void enclosing function.
      return;
    }
  } else {
    cleanUp(containerHandle, runtimeListHandle, snpeBuilderHandle);
    throw std::runtime_error("Skip handling of other errors.");
  }
}

// Do inference.
while (waitInputData()) {
  // Switch to new SNPE if prepare is done.
  if (futureSnpeHandle.valid() &&
      futureSnpeHandle.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
    Snpe_SNPE_Handle_t newSnpeHandle = futureSnpeHandle.get();
    if (newSnpeHandle != NULL) {
      // Clean up the original SNPE.
      Snpe_SNPE_Delete(snpeHandle);
      // Switch to the new SNPE.
      snpeHandle = newSnpeHandle;
      // Save the new cache to DLC.
      Snpe_DlContainer_Save(containerHandle, dlcPath.string().c_str());
    }
    // Otherwise online prepare failed; keep using the original SNPE.
  }

  doInference(snpeHandle);
}

// Clean up.
Snpe_SNPE_Delete(snpeHandle);
cleanUp(containerHandle, runtimeListHandle, snpeBuilderHandle);