Init Caching

Qualcomm® Neural Processing SDK can use the DLC model file to store structures that are normally built in memory during model initialization. When init caching is enabled, these structures are collected into a set of init caches during network initialization, and the caches are then added to the DLC container. If the user saves that DLC container, subsequent network initializations load the init caches from the DLC instead of rebuilding them, which reduces initialization time.

Starting with a freshly generated DLC, the network needs to be loaded with init caching enabled. Once the SNPE object has been successfully created, the DLC file has to be saved in order for the init caches to be persisted to storage:

#include "SNPE/SNPE.hpp"
#include "SNPE/SNPEBuilder.hpp"
#include "DlContainer/IDlContainer.hpp"

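// Open the DLC; on the first run it does not contain any init caches yet.
auto container = zdl::DlContainer::IDlContainer::open("model.dlc");

// Enable init caching before the network is built.
zdl::SNPE::SNPEBuilder builder(container.get());
builder.setInitCacheMode(true);

// Building initializes the network and generates the init caches.
auto snpe = builder.build();

// Write the DLC, now including the init caches, back to storage.
container->save("model.dlc");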

The modified DLC containing the init caches can later be re-opened using the exact same code sequence. Initialization will be significantly faster because the caches are loaded instead of all initialization logic being re-executed. Init caching is currently supported only for the DSP and AIP runtimes.
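
One way to confirm the speed-up on a given device is to time the initialization sequence on the first run and again on a later run. The helper below is a minimal sketch that uses only the C++ standard library; the name elapsedMs is hypothetical and is not part of the SDK.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not an SDK function): runs `fn` exactly once and
// returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double elapsedMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping the open, build, and save sequence in a lambda and passing it to this helper gives one measurement per process start. The first run includes cache generation, so only later runs against the saved DLC are expected to show the shorter time.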

Qualcomm® Neural Processing SDK performs various checks to ensure the cache can still be used safely, such as comparing the library version and certain other options that were passed to the builder. If the cache is considered stale, it is regenerated automatically. It is therefore a good idea to invoke the “save” method on the container every time init caching is used. If the cache has not been modified, calling the “save” method does nothing.
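
The behavior described above can be pictured with a small toy model. This is only a sketch under stated assumptions: ToyContainer, CacheKey, and their fields are hypothetical names invented for illustration, the actual checks performed by the SDK are internal and not reproduced here, and the version strings are placeholders.

```cpp
#include <string>

// Hypothetical stand-in for the properties a cache is checked against.
// None of these names come from the SDK.
struct CacheKey {
    std::string libraryVersion;  // version of the library that built the cache
    std::string runtime;         // runtime option passed to the builder

    bool operator==(const CacheKey& other) const {
        return libraryVersion == other.libraryVersion &&
               runtime == other.runtime;
    }
};

// Toy model of a container that holds at most one init cache.
class ToyContainer {
public:
    // Returns true if an existing cache was reused, false if it was missing
    // or stale and therefore had to be (re)generated.
    bool initialize(const CacheKey& current) {
        if (hasCache_ && cacheKey_ == current) {
            return true;   // cache still matches: load it, container unchanged
        }
        cacheKey_ = current;  // regenerate the cache for the current setup
        hasCache_ = true;
        modified_ = true;     // container now differs from what is on storage
        return false;
    }

    // Returns true if data was written, false if the call was a no-op
    // because the cache has not been modified since the last save.
    bool save() {
        if (!modified_) {
            return false;
        }
        modified_ = false;
        return true;
    }

private:
    CacheKey cacheKey_;
    bool hasCache_ = false;
    bool modified_ = false;
};
```

In this model, calling save after every initialization costs nothing when the cache was reused, and it persists the regenerated cache when, for example, the library version has changed. That is the reason for the recommendation to always call it.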