C Tutorial - Build the Sample
Prerequisites
The Qualcomm® Neural Processing SDK has been set up by following the Qualcomm® Neural Processing SDK Setup chapter.
The Tutorials Setup has been completed.
Introduction
This tutorial demonstrates how to build a C sample application that executes neural network models on a PC or target device. Note that this sample code performs no error checking; users are strongly encouraged to check for errors when using the Qualcomm® Neural Processing SDK APIs.
Most applications follow the same general pattern when using a neural network. The snippet below gives an overview of how to use the Qualcomm® Neural Processing SDK C APIs. For complete working code, refer to the collection of sample apps located at $SNPE_ROOT/examples/SNPE/NativeCpp/SampleCode_CAPI/
bool useUserSuppliedBuffers = true;
int inputListNum = 1;
//Create runtime
Snpe_Runtime_t runtime = checkRuntime();
//Add runtime to runtime list
Snpe_RuntimeList_Handle_t runtimeListHandle = Snpe_RuntimeList_Create();
Snpe_RuntimeList_Add(runtimeListHandle, runtime);
//Load dlc file
Snpe_DlContainer_Handle_t containerHandle = loadContainerFromFile(dlc);
//Generate snpe handle from builder options
Snpe_SNPE_Handle_t snpeHandle = setBuilderOptions(containerHandle, runtimeListHandle, useUserSuppliedBuffers);
Snpe_TensorMap_Handle_t inputTensorMapHandle = loadInputTensor(snpeHandle, fileLine); // ITensor
loadInputUserBuffer(applicationInputBuffers, snpeHandle, fileLine); // User Buffer
executeNetwork(snpeHandle , inputTensorMapHandle, OutputDir, inputListNum); // ITensor
executeNetwork(snpeHandle, inputMapHandle, outputMapHandle, applicationOutputBuffers, OutputDir, inputListNum); // User Buffer
The sections below describe how to implement each step described above.
Get Available Runtime
The code excerpt below illustrates how to check if a specific runtime is available using the native APIs (the GPU runtime is used as an example).
Snpe_Runtime_t checkRuntime()
{
Snpe_DlVersion_Handle_t versionHandle = Snpe_Util_GetLibraryVersion();
Snpe_Runtime_t Runtime;
std::cout << "Qualcomm® Neural Processing SDK Version: " << Snpe_DlVersion_ToString(versionHandle) << std::endl; //Print version number
Snpe_DlVersion_Delete(versionHandle);
if (Snpe_Util_IsRuntimeAvailable(SNPE_RUNTIME_GPU)) {
Runtime = SNPE_RUNTIME_GPU;
} else {
Runtime = SNPE_RUNTIME_CPU;
}
return Runtime;
}
Load Network
The code excerpt below illustrates how to load a network from the Qualcomm® Neural Processing SDK container file (DLC).
Snpe_DlContainer_Handle_t loadContainerFromFile(std::string containerPath)
{
Snpe_DlContainer_Handle_t containerHandle = Snpe_DlContainer_Open(containerPath.c_str());
return containerHandle;
}
Set Network Builder Options
The following code demonstrates how to instantiate an SNPE builder object and use it to build the SNPE handle that will execute the network with the given parameters.
Snpe_SNPE_Handle_t setBuilderOptions(Snpe_DlContainer_Handle_t containerHandle,
Snpe_RuntimeList_Handle_t runtimeListHandle,
bool useUserSuppliedBuffers)
{
Snpe_SNPE_Handle_t snpeHandle;
Snpe_SNPEBuilder_Handle_t snpeBuilderHandle = Snpe_SNPEBuilder_Create(containerHandle);
Snpe_SNPEBuilder_SetRuntimeProcessorOrder(snpeBuilderHandle, runtimeListHandle);
Snpe_SNPEBuilder_SetUseUserSuppliedBuffers(snpeBuilderHandle, useUserSuppliedBuffers);
snpeHandle = Snpe_SNPEBuilder_Build(snpeBuilderHandle);
return snpeHandle;
}
Load Network Inputs
Network inputs and outputs can be either user-backed buffers or ITensors (built-in Qualcomm® Neural Processing SDK buffers), but not both. The advantage of using user-backed buffers is that it eliminates an extra copy from user buffers to create ITensors. Both methods of loading network inputs are shown below.
Using User Buffers
Qualcomm® Neural Processing SDK can create its network inputs and outputs from user-backed buffers. Note that Qualcomm® Neural Processing SDK expects the values of the buffers to be present and valid during the duration of its execution.
Here is a function for creating a Qualcomm® Neural Processing SDK UserBuffer from a user-backed buffer and storing it in a UserBufferMap. These maps are a convenient collection of all input or output user buffers that can be passed to Qualcomm® Neural Processing SDK to execute the network.
Disclaimer: The strides of the buffer should already be known by the user and should not be calculated as shown below. The calculation shown is solely used for executing the example code.
void createUserBuffer(Snpe_UserBufferMap_Handle_t userBufferMapHandle,
std::unordered_map<std::string, std::vector<uint8_t>>& applicationBuffers,
std::vector<Snpe_IUserBuffer_Handle_t>& snpeUserBackedBuffersHandle,
Snpe_SNPE_Handle_t snpeHandle,
const char * name)
{
// get attributes of buffer by name
Snpe_IBufferAttributes_Handle_t bufferAttributesOptHandle = Snpe_SNPE_GetInputOutputBufferAttributes(snpeHandle, name);
if (bufferAttributesOptHandle == nullptr) throw std::runtime_error(std::string("Error obtaining attributes for input tensor ") + name);
// calculate the size of buffer required by the input tensor
Snpe_TensorShape_Handle_t bufferShapeHandle = Snpe_IBufferAttributes_GetDims(bufferAttributesOptHandle);
// Calculate the stride based on buffer strides, assuming tightly packed.
// Note: Strides = Number of bytes to advance to the next element in each dimension.
// For example, if a float tensor of dimension 2x4x3 is tightly packed in a buffer of 96 bytes, then the strides would be (48,12,4)
// Note: Buffer stride is usually known and does not need to be calculated.
std::vector<size_t> strides(Snpe_TensorShape_Rank(bufferShapeHandle));
strides[strides.size() - 1] = sizeof(float);
size_t stride = strides[strides.size() - 1];
for (size_t i = Snpe_TensorShape_Rank(bufferShapeHandle) - 1; i > 0; i--)
{
stride *= Snpe_TensorShape_At(bufferShapeHandle, i);
strides[i-1] = stride;
}
Snpe_TensorShape_Handle_t stridesHandle = Snpe_TensorShape_CreateDimsSize(strides.data(), Snpe_TensorShape_Rank(bufferShapeHandle));
size_t bufferElementSize = Snpe_IBufferAttributes_GetElementSize(bufferAttributesOptHandle);
size_t bufSize = calcSizeFromDims(Snpe_TensorShape_GetDimensions(bufferShapeHandle), Snpe_TensorShape_Rank(bufferShapeHandle), bufferElementSize);
// set the buffer encoding type
Snpe_UserBufferEncoding_Handle_t userBufferEncodingFloatHandle = Snpe_UserBufferEncodingFloat_Create();
// create user-backed storage to load input data onto it
applicationBuffers.emplace(name, std::vector<uint8_t>(bufSize));
// create Qualcomm® Neural Processing SDK user buffer from the user-backed buffer
snpeUserBackedBuffersHandle.push_back(Snpe_Util_CreateUserBuffer(applicationBuffers.at(name).data(),
bufSize,
stridesHandle,
userBufferEncodingFloatHandle));
// add the user-backed buffer to the inputMap, which is later on fed to the network for execution
Snpe_UserBufferMap_Add(userBufferMapHandle, name, snpeUserBackedBuffersHandle.back());
}
The following function then shows how to load input data from file(s) to user buffers. Note that the input values are simply loaded onto user-backed buffers, on top of which Qualcomm® Neural Processing SDK can create Qualcomm® Neural Processing SDK UserBuffers, as shown above.
void loadInputUserBuffer(std::unordered_map<std::string, std::vector<uint8_t>>& applicationBuffers,
Snpe_SNPE_Handle_t snpeHandle,
const std::string& fileLine)
{
// get input tensor names of the network that need to be populated
Snpe_StringList_Handle_t inputNamesHandle = Snpe_SNPE_GetInputTensorNames(snpeHandle);
if (inputNamesHandle == nullptr) throw std::runtime_error("Error obtaining input tensor names");
assert(Snpe_StringList_Size(inputNamesHandle) > 0);
// treat each line as a space-separated list of input files
std::vector<std::string> filePaths;
split(filePaths, fileLine, ' ');
if (Snpe_StringList_Size(inputNamesHandle)) std::cout << "Processing DNN Input: " << std::endl;
for (size_t i = 0; i < Snpe_StringList_Size(inputNamesHandle); i++) {
const char* name = Snpe_StringList_At(inputNamesHandle, i);
std::string filePath(filePaths[i]);
// print out which file is being processed
std::cout << "\t" << i + 1 << ") " << filePath << std::endl;
// load file content onto application storage buffer,
// on top of which, Qualcomm® Neural Processing SDK has created a user buffer
loadByteDataFile(filePath, applicationBuffers.at(name));
}
}
Using ITensors
The following function shows how to load input data from a file into an ITensor and store it in a TensorMap, which is later fed to the network for execution.
Snpe_TensorMap_Handle_t loadInputTensor (Snpe_SNPE_Handle_t snpeHandle, std::string& fileLine)
{
Snpe_ITensor_Handle_t input;
Snpe_StringList_Handle_t strListHandle = Snpe_SNPE_GetInputTensorNames(snpeHandle);
if (strListHandle == nullptr) throw std::runtime_error("Error obtaining Input tensor names");
// Make sure the network requires only a single input
assert (Snpe_StringList_Size(strListHandle) == 1);
// If the network has a single input, each line represents the input file to be loaded for that input
std::string filePath(fileLine);
std::cout << "Processing DNN Input: " << filePath << "\n";
std::vector<float> inputVec = loadFloatDataFile(filePath);
/* Create an input tensor that is correctly sized to hold the input of the network. Dimensions that have no fixed size will be represented with a value of 0. */
auto inputDimsHandle = Snpe_SNPE_GetInputDimensions(snpeHandle, Snpe_StringList_At(strListHandle, 0));
/* Calculate the total number of elements that can be stored in the tensor so that we can check that the input contains the expected number of elements.
With the input dimensions computed create a tensor to convey the input into the network. */
input = Snpe_Util_CreateITensor(inputDimsHandle);
/* Copy the loaded input file contents into the network's input tensor. SNPE's ITensor supports C++ STL functions like std::copy() */
std::copy(inputVec.begin(), inputVec.end(), (float*)Snpe_ITensor_GetData(input));
Snpe_TensorMap_Handle_t inputTensorMapHandle = Snpe_TensorMap_Create();
Snpe_TensorMap_Add(inputTensorMapHandle, Snpe_StringList_At(strListHandle, 0), input);
return inputTensorMapHandle;
}
Execute the Network & Process Output
The following snippets of code use the native API to execute the network (in UserBuffer or ITensor mode) and show how to iterate through the newly populated output tensor.
Using User Buffers
void executeNetwork(Snpe_SNPE_Handle_t snpeHandle,
Snpe_UserBufferMap_Handle_t inputMapHandle,
Snpe_UserBufferMap_Handle_t outputMapHandle,
std::unordered_map<std::string,std::vector<uint8_t>>& applicationOutputBuffers,
const std::string& outputDir,
int num)
{
// Execute the network and store the outputs in user buffers specified in outputMap
Snpe_SNPE_ExecuteUserBuffers(snpeHandle, inputMapHandle, outputMapHandle);
// Get all output buffer names from the network
Snpe_StringList_Handle_t outputBufferNamesHandle = Snpe_UserBufferMap_GetUserBufferNames(outputMapHandle);
// Iterate through output buffers and print each output to a raw file
std::for_each(Snpe_StringList_Begin(outputBufferNamesHandle), Snpe_StringList_End(outputBufferNamesHandle), [&](const char* name)
{
std::ostringstream path;
path << outputDir << "/Result_" << num << "/" << name << ".raw";
SaveUserBuffer(path.str(), applicationOutputBuffers.at(name));
});
}
// The following is a partial snippet of the function
void SaveUserBuffer(const std::string& path, const std::vector<uint8_t>& buffer) {
...
std::ofstream os(path, std::ofstream::binary);
if (!os)
{
std::cerr << "Failed to open output file for writing: " << path << "\n";
std::exit(EXIT_FAILURE);
}
if (!os.write((char*)(buffer.data()), buffer.size()))
{
std::cerr << "Failed to write data to: " << path << "\n";
std::exit(EXIT_FAILURE);
}
}
Using ITensors
void executeNetwork(Snpe_SNPE_Handle_t snpeHandle,
Snpe_TensorMap_Handle_t inputTensorMapHandle,
std::string OutputDir,
int num)
{
// Execute the network and store the outputs that were specified when creating the network in a TensorMap
Snpe_TensorMap_Handle_t outputTensorMapHandle = Snpe_TensorMap_Create();
Snpe_SNPE_ExecuteITensors(snpeHandle, inputTensorMapHandle, outputTensorMapHandle);
Snpe_StringList_Handle_t tensorNamesHandle = Snpe_TensorMap_GetTensorNames(outputTensorMapHandle);
// Iterate through the output Tensor map, and print each output layer name
std::for_each( Snpe_StringList_Begin(tensorNamesHandle), Snpe_StringList_End(tensorNamesHandle), [&](const char* name)
{
std::ostringstream path;
path << OutputDir << "/"
<< "Result_" << num << "/"
<< name << ".raw";
auto tensorHandle = Snpe_TensorMap_GetTensor_Ref(outputTensorMapHandle, name);
SaveITensor(path.str(), tensorHandle);
});
// Clean up created handles
Snpe_TensorMap_Delete(outputTensorMapHandle);
Snpe_StringList_Delete(tensorNamesHandle);
}
// The following is a partial snippet of the function
void SaveITensor(const std::string& path, Snpe_ITensor_Handle_t tensorHandle)
{
...
std::ofstream os(path, std::ofstream::binary);
if (!os)
{
std::cerr << "Failed to open output file for writing: " << path << "\n";
std::exit(EXIT_FAILURE);
}
auto begin = static_cast<float*>(Snpe_ITensor_GetData(tensorHandle));
auto size = Snpe_ITensor_GetSize(tensorHandle);
for ( auto it = begin; it != begin + size; ++it )
{
float f = *it;
if (!os.write(reinterpret_cast<char*>(&f), sizeof(float)))
{
std::cerr << "Failed to write data to: " << path << "\n";
std::exit(EXIT_FAILURE);
}
}
}