Defining a UDO

A User-Defined Operation (UDO) allows users to integrate their custom operations with Qualcomm® Neural Processing SDK to enable execution on any supported hardware accelerator. The UDO mechanism accepts a specification of a custom operation (defined below) and uses that information to process a model containing the custom operation. This section explains how to specify a UDO. See Overview of UDO for more details about UDO, and Preparing a model with UDO for details on how to convert a model that contains a UDO into a Qualcomm® Neural Processing SDK DLC for supported frameworks.

The UDO Configuration Specification

As described in the Overview of UDO section, a user can express the attributes of their custom operation with a configuration specification file. This UDO configuration (henceforth the UDO config) describes the operation using either the Extensible Markup Language (XML) (henceforth the XML OpDef Config) or the JavaScript Object Notation (JSON) syntax and formatting (henceforth the JSON UDO Config). The configuration file syntax defines a fixed set of fields that describe the UDO, and these fields are ultimately parsed into the information that constitutes a UDO. The information provided should be generic and independent of any particular model, meaning model-specific parameters or names need not be part of the configuration. The information is used to identify the op within a framework model and is then serialized into the DLC model. This implies that any change to the config requires re-generating the DLC model to ensure the correct information is serialized. The following sections describe the configuration file specification.

The XML OpDef Config Description

The XML OpDef Config describes the operations the package contains, as well as package information such as the package name, version, and domain. Package information and operations are described with respect to a predefined XML schema (described below) that requires information about operation inputs, outputs, and parameters.

The XML OpDef Schema Breakdown

This section provides an overview of the schema used to define Operation Definitions (Op Defs) in the XML OpDef Config. Op Defs specify the inputs, outputs, parameters, and descriptive metadata that constitute an operation. The schema is formalized using Extensible Markup Language (XML) and XML Schema Definition (XSD).

Operation Definition Schema

The following diagram describes the relationship between these Op Def entities.

[Figure: OpDef schema diagram (../images/OpDefUml.png)]

In the above OpDef schema diagram, members prefixed by @ are XML attributes and those with no prefix are XML elements.

OpDef

An OpDef is an XML element that describes an operation at the highest level. It contains the following elements. Elements with content are required and elements that are empty are optional.

<OpDef>
     <Name>OpName</Name>

     <Description>
         <Content></Content>
         <Code></Code>
     </Description>

     <Reference Source="" Url=""></Reference>

     <!--Requires at least one input-->
     <Input>
         <Name>in[0]</Name>
         <Mandatory>true</Mandatory>
         <Constraint id="" Type=""></Constraint>
         <Datatype>FLOAT_32</Datatype>
         <Shape>
             <Rank>1D</Rank>
             <Layout></Layout>
             <Text></Text>
         </Shape>
         <Default></Default>
         <Repeated></Repeated>
         <IsStaticTensor></IsStaticTensor>
     </Input>

     <!--Requires at least one output-->
     <Output>
         <Name>out[0]</Name>
         <Mandatory>true</Mandatory>
         <Constraint id="" Type=""></Constraint>
         <Datatype>FLOAT_32</Datatype>
         <Shape>
             <Rank>1D</Rank>
             <Layout></Layout>
             <Text></Text>
         </Shape>
         <Repeated></Repeated>
     </Output>

     <!--Parameters are optional-->
     <Parameter>
         <Name>param</Name>
         <Mandatory>false</Mandatory>
         <Constraint id="" Type=""></Constraint>
         <Datatype>INT_32</Datatype>
         <Shape>
             <Rank>1D</Rank>
             <Layout></Layout>
             <Text></Text>
         </Shape>
         <Default></Default>
         <Enumeration>
             <Enum></Enum>
         </Enumeration>
     </Parameter>

     <UseDefaultTranslation></UseDefaultTranslation>
     <SupportedBackend>DSP_V68</SupportedBackend>
 </OpDef>
  • Name: The name of the operation.

  • Description: Optional; describes the operation through sequences of content and code.

    • Content: String describing the operation.

    • Code: String that represents code describing the operation, e.g., output_height = input_height - crop_top - crop_bottom

  • Reference: Optional; defines one or more references for the operation.

    • Source: Attribute for the source of the operation, e.g., TensorFlow, ONNX, etc.

    • Url: Attribute for the URL of the source.

  • Input: Defines one or more inputs to the operation. Inputs are extensions of Tensors that have the following additional field(s):

    • IsStaticTensor: Optional boolean flag that if set to True, indicates that an input tensor is a parameter which contains or references static data. If unset, the tensor is treated as a dynamic input.

    • Repeated: Optional boolean that specifies whether this input is repeated. Used for operations which have variadic inputs, such as Concat (see Note 7).

  • Output: Defines one or more outputs to the operation. Outputs are extensions of Tensors that have the following additional field(s):

    • Repeated: Optional boolean that specifies whether this output is repeated. Used for operations which have variadic outputs, such as MultiClassNms (see Note 7).

  • Parameter: Optional; defines one or more parameters for the operation. Parameters are extensions of Tensors that additionally define:

    • Enumeration: Optional field for enumerated params. Enumerations are composed of subfields called Enum whose content gives the name of the enum representing a given value. Values are assigned in the order in which the Enum is specified.

  • SupportedBackend: Field(s) that specify one or more backends on which this operation is supported. Used when backends share a common definition of an operation. If fields vary across backends for the same operation, use a SupplementalOpDef and mark the field with BACKEND_SPECIFIC.

  • UseDefaultTranslation: Boolean field that if set to true, indicates that a custom operation overrides a QNN native operation. The custom operation type must match the type of the QNN native operation for accurate conversion. When set to false, a custom operation is converted as a generic user defined operation. In the false mode, the custom op type must match the source framework type.
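As a rough illustration of the rules above, the following Python sketch parses a small OpDef with the standard-library xml.etree.ElementTree module, checks the elements the text marks as required (a Name, at least one Input, at least one Output), and numbers Enum entries in the order they appear. The op name, tensor and enum names, the helper summarize_op_def, and the assumption that enum values start at 0 are all hypothetical; the SDK tools perform their own validation against the XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical OpDef used only for illustration.
OP_DEF_XML = """
<OpDef>
    <Name>MyCrop</Name>
    <Input>
        <Name>in[0]</Name>
        <Mandatory>true</Mandatory>
        <Datatype>FLOAT_32</Datatype>
        <Shape><Rank>4D</Rank><Layout>NHWC</Layout></Shape>
    </Input>
    <Output>
        <Name>out[0]</Name>
        <Mandatory>true</Mandatory>
        <Datatype>FLOAT_32</Datatype>
        <Shape><Rank>4D</Rank></Shape>
    </Output>
    <Parameter>
        <Name>mode</Name>
        <Mandatory>false</Mandatory>
        <Datatype>UINT_32</Datatype>
        <Shape><Rank>SCALAR</Rank></Shape>
        <Enumeration>
            <Enum>CENTER</Enum>
            <Enum>TOP_LEFT</Enum>
        </Enumeration>
    </Parameter>
    <SupportedBackend>DSP_V68</SupportedBackend>
</OpDef>
"""


def summarize_op_def(xml_text):
    """Check the required OpDef elements and return a small summary dict."""
    op = ET.fromstring(xml_text)
    name = op.findtext("Name")
    if not name:
        raise ValueError("OpDef requires a Name")
    inputs = [t.findtext("Name") for t in op.findall("Input")]
    outputs = [t.findtext("Name") for t in op.findall("Output")]
    if not inputs or not outputs:
        raise ValueError("OpDef requires at least one Input and one Output")
    enums = {}
    for param in op.findall("Parameter"):
        names = [e.text for e in param.findall("Enumeration/Enum")]
        if names:
            # Assumption: values follow declaration order, starting at 0.
            enums[param.findtext("Name")] = {n: i for i, n in enumerate(names)}
    backends = [b.text for b in op.findall("SupportedBackend")]
    return {"name": name, "inputs": inputs, "outputs": outputs,
            "enums": enums, "backends": backends}


summary = summarize_op_def(OP_DEF_XML)
print(summary)
```

A check like this can catch a missing Input or Output before the config is handed to the SDK tools, but it is not a substitute for schema validation.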

The key components of an OpDef are its Inputs, Outputs, and Parameters, all of which are extensions of Tensors. A Tensor element is defined as follows:

  • Name: The name of the tensor.

  • Description: Optional; describes the tensor through sequences of content and code:

    • Content: String describing the tensor.

    • Code: String that represents code describing the tensor, e.g., output_height = input_height - crop_top - crop_bottom

  • Constraint: Optional; defines one or more constraints on the given tensor. The constraint is given as a string in the body of the element (see Note 8).

    • id: Specifies the ID of the constraint. Used to override constraints for supplemental operation definitions.

    • Type: The type of the constraint. Valid types are:

      • Number: a constraint characterized by cardinality, e.g., Number of inputs >= 1

      • Shape: a constraint characterized by a restriction on dimension of the tensor, e.g., Rank >= 1

      • Value: a constraint characterized by the value of the tensor, e.g., Tensor is only positive.

      • Datatype: a constraint characterized by a restriction on datatype. Typically used for tensors which can be of several datatypes but must be of the same datatype as another tensor.

      • Description: a constraint which does not conform to any other category of constraint.

  • Mandatory: Boolean; indicates if the tensor must be provided/defined (see Note 9).

  • Datatype: Defines the allowable datatypes for the tensor. Must be one of:

    • FLOAT_16

    • FLOAT_32

    • FIXED_4

    • FIXED_8

    • FIXED_16

    • UINT_8

    • UINT_16

    • UINT_32

    • STRING

    • BACKEND_SPECIFIC used to indicate that the datatype is dependent on the Backend. Must be used in conjunction with a SupplementalOpDef to specify a concrete datatype.

  • Shape: Specifies the shape of the tensor.

    • Rank: The rank of the tensor as an enumeration with the following values:

      • SCALAR: Scalar

      • 1D: Vector

      • 2D: Matrix

      • 3D: 3D Tensor or Image

      • 4D: 4D Tensor or Batched Image

      • ND: Generic N-D Tensor, where N >= 0

    • Layout: Optional; specifies the layout of the tensor. Must be one of:

      • NHWC

      • NCHW

      • UNDEFINED used to identify that the layout is neither NCHW nor NHWC

      • BACKEND_SPECIFIC used to indicate that the layout is dependent on the Backend. Must be used in conjunction with a SupplementalOpDef to specify a concrete layout.

    • Text: Optional; string description for the shape of the tensor.

  • Default: Optional; string representing the default value for the tensor. Can be one of:

    • Tensor: use braces or brackets to create list e.g. [[1, 2], [3, 4]]

    • Scalar: provide scalar value e.g. 1, 1.1, -1

    • Boolean: provide either 0 (false) or 1 (true)

    • String: any other string. If the text cannot be resolved into one of the above categories, it is stored as a string.
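The categories listed for Default can be illustrated with a short Python sketch that sorts a default string into tensor, scalar, or string. This is not the SDK's parser: the function name classify_default and the use of ast.literal_eval are assumptions made for the example, and because a Boolean is written as 0 or 1 it cannot be told apart from a scalar without also knowing the tensor's datatype, so the sketch reports it as a scalar.

```python
import ast


def classify_default(text):
    """Return a (category, value) pair for a Default string (illustrative only)."""
    # The text allows braces or brackets for lists, so normalize to brackets.
    normalized = text.strip().replace("{", "[").replace("}", "]")
    try:
        value = ast.literal_eval(normalized)
    except (ValueError, SyntaxError, TypeError):
        # Anything that is not a literal falls back to a plain string.
        return ("string", text)
    if isinstance(value, list):
        return ("tensor", value)
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # Covers scalars and the 0/1 form of booleans.
        return ("scalar", value)
    return ("string", text)


for sample in ["[[1, 2], [3, 4]]", "{{1, 2}, {3, 4}}", "1.1", "-1", "same"]:
    print(sample, "->", classify_default(sample))
```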

SupplementalOpDef

SupplementalOpDefs are XML elements which define content that varies across backends. SupplementalOpDefs extend the content defined in OpDefs, but limit the fields that can be overridden. The SupplementalOpDef is structured as follows. Elements with content are required, and elements that are empty are optional.

<SupplementalOpDef>

      <Name>OpName</Name>

     <!--Only supplemented Inputs are required-->
      <Input>
          <Name>in[0]</Name>
          <Constraint id="" Type=""></Constraint>
          <Datatype></Datatype>
          <Shape>
              <Layout></Layout>
              <Text></Text>
          </Shape>
          <OnlyDefaultSupported></OnlyDefaultSupported>
      </Input>

       <!--Only supplemented Outputs are required-->
      <Output>
          <Name>out[0]</Name>
          <Constraint id="" Type=""></Constraint>
          <Datatype></Datatype>
          <Shape>
              <Layout></Layout>
              <Text></Text>
          </Shape>
          <OnlyDefaultSupported></OnlyDefaultSupported>
      </Output>

       <!--Only supplemented Params are required-->
      <Parameter>
          <Name>param</Name>
          <Constraint id="" Type=""></Constraint>
          <Datatype></Datatype>
          <Shape>
              <Layout></Layout>
              <Text></Text>
          </Shape>
          <OnlyDefaultSupported></OnlyDefaultSupported>
      </Parameter>

</SupplementalOpDef>
  • Name: The name of the operation.

  • Input: Optional; extends one or more inputs to the operation. Supplemental inputs are Supplemental Tensors.

  • Output: Optional; extends one or more outputs to the operation. Supplemental outputs are Supplemental Tensors.

  • Parameter: Optional; extends one or more parameters for the operation. Supplemental parameters are Supplemental Tensors.

Inputs, outputs, and parameters are all Supplemental Tensors, which can only specify certain fields. All fields, except name, in a supplemental tensor are optional.

  • Name: The name of the tensor. Must correspond to the name of the tensor in the original OpDef that is being extended.

  • Constraint: Optional; defines one or more constraints on the given tensor. The constraint is given as a string in the body of the element (see Note 8).

    • id: Specifies the ID of the constraint. Used to override constraints for supplemental operation definitions.

    • Type: The type of the constraint. Valid types are:

      • Number: a constraint characterized by cardinality. e.g. Number of inputs >= 1

      • Shape: a constraint characterized by a restriction on dimension of the tensor. e.g. Rank >= 1

      • Value: a constraint characterized by the value of the tensor. e.g. Tensor is only positive.

      • Datatype: a constraint characterized by a restriction on datatype. Typically used for tensors which can be of several datatypes but must be of the same datatype as another tensor.

      • Description: a constraint which does not conform to any other category of constraint.

  • Datatype: Defines the allowable datatypes for the tensor. Must be one of:

    • FLOAT_16

    • FLOAT_32

    • FIXED_4

    • FIXED_8

    • FIXED_16

    • UINT_8

    • UINT_16

    • UINT_32

    • STRING

  • Shape: Specifies the shape of the tensor.

    • Layout: Optional; specifies the layout of the tensor. Must be one of:

      • NHWC

      • NCHW

      • UNDEFINED used to identify that the layout is neither NCHW nor NHWC

      • BACKEND_SPECIFIC used to indicate that the layout is dependent on the Backend. Must be used in conjunction with a SupplementalOpDef to specify a concrete layout.

    • Text: Optional; string description for the shape of the tensor.

  • OnlyDefaultSupported: Optional; boolean that indicates if the backend only supports the default value defined in the corresponding OpDef for this tensor.
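To make the relationship between an OpDef and its SupplementalOpDef more concrete, the sketch below resolves BACKEND_SPECIFIC datatypes by matching tensors by name, as the Name field above requires. The op and tensor names and the helper resolve_datatypes are hypothetical, and the sketch only handles the Datatype field; how the SDK merges the other supplemental fields is not covered here.

```python
import xml.etree.ElementTree as ET

# Hypothetical OpDef whose input datatype is left to the backend.
OP_DEF_XML = """
<OpDef>
    <Name>MyOp</Name>
    <Input>
        <Name>in[0]</Name>
        <Mandatory>true</Mandatory>
        <Datatype>BACKEND_SPECIFIC</Datatype>
        <Shape><Rank>4D</Rank><Layout>NHWC</Layout></Shape>
    </Input>
    <Output>
        <Name>out[0]</Name>
        <Mandatory>true</Mandatory>
        <Datatype>FLOAT_32</Datatype>
        <Shape><Rank>4D</Rank></Shape>
    </Output>
</OpDef>
"""

# Hypothetical supplement for one backend: only the supplemented input appears.
SUPPLEMENTAL_XML = """
<SupplementalOpDef>
    <Name>MyOp</Name>
    <Input>
        <Name>in[0]</Name>
        <Datatype>UINT_8</Datatype>
    </Input>
</SupplementalOpDef>
"""

KINDS = ("Input", "Output", "Parameter")


def resolve_datatypes(op_def_xml, supplemental_xml):
    """Map each tensor name to its concrete datatypes for one backend."""
    op = ET.fromstring(op_def_xml)
    sup = ET.fromstring(supplemental_xml)
    if op.findtext("Name") != sup.findtext("Name"):
        raise ValueError("SupplementalOpDef must name the same operation")
    overrides = {}
    for kind in KINDS:
        for tensor in sup.findall(kind):
            types = [d.text for d in tensor.findall("Datatype") if d.text]
            if types:
                overrides[(kind, tensor.findtext("Name"))] = types
    resolved = {}
    for kind in KINDS:
        for tensor in op.findall(kind):
            key = (kind, tensor.findtext("Name"))
            types = [d.text for d in tensor.findall("Datatype")]
            if "BACKEND_SPECIFIC" in types:
                if key not in overrides:
                    raise ValueError(f"no concrete datatype for {key[1]}")
                types = overrides[key]
            resolved[key[1]] = types
    return resolved


print(resolve_datatypes(OP_DEF_XML, SUPPLEMENTAL_XML))
```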

OpDefList

An OpDefList is an XML element composed of a sequence of OpDef elements. OpDefLists are backend agnostic and only serve as a wrapper around multiple OpDefs.

<OpDefList>

    <!--One or more OpDef-->
    <OpDef>
      <!--OpDef defined above -->
    </OpDef>

</OpDefList>

SupplementalOpDefList

A SupplementalOpDefList is an XML element composed of a sequence of SupplementalOpDef elements. In addition, SupplementalOpDefLists contain the following fields.

<SupplementalOpDefList Backend="HTP">

   <SupportedOps>
      <OpName></OpName>
   </SupportedOps>

   <SupplementalOpDef>
      <!--SupplementalOpDef defined above-->
   </SupplementalOpDef>

</SupplementalOpDefList>
  • Backend: Specifies which backend the SupplementalOpDefs are supplementing.

  • SupportedOps: Sequence of OpName elements. Each OpName corresponds to an operation defined in the corresponding OpDefList and indicates the backend supports the operation. This information may be redundant with the SupportedBackend field of the OpDef element.
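Since backend support can be declared in two places, a small sketch may help show how the two sources combine. The Python below collects the backends for each operation from the SupportedBackend elements of an OpDefList and from the SupportedOps of one or more SupplementalOpDefLists. The op names and the helper backends_per_op are hypothetical, and the OpDefs are abbreviated (no inputs or outputs), so they would not validate against the real schema.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical OpDefList: inputs and outputs are omitted for brevity.
OP_DEF_LIST_XML = """
<OpDefList>
    <OpDef>
        <Name>OpA</Name>
        <SupportedBackend>DSP_V68</SupportedBackend>
    </OpDef>
    <OpDef>
        <Name>OpB</Name>
    </OpDef>
</OpDefList>
"""

SUPPLEMENTAL_LIST_XML = """
<SupplementalOpDefList Backend="HTP">
    <SupportedOps>
        <OpName>OpA</OpName>
        <OpName>OpB</OpName>
    </SupportedOps>
</SupplementalOpDefList>
"""


def backends_per_op(op_def_list_xml, supplemental_list_xmls):
    """Return {op name: sorted list of backends that declare support}."""
    ops = {}
    for op in ET.fromstring(op_def_list_xml).findall("OpDef"):
        ops[op.findtext("Name")] = {b.text for b in op.findall("SupportedBackend")}
    for xml_text in supplemental_list_xmls:
        sup_list = ET.fromstring(xml_text)
        backend = sup_list.get("Backend")
        for op_name in sup_list.findall("SupportedOps/OpName"):
            if op_name.text not in ops:
                raise ValueError(f"{op_name.text} is not defined in the OpDefList")
            ops[op_name.text].add(backend)
    return {name: sorted(backends) for name, backends in ops.items()}


print(backends_per_op(OP_DEF_LIST_XML, [SUPPLEMENTAL_LIST_XML]))
```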

OpDefCollection

The OpDefCollection is the root XML element of the configuration file meant to be used with the snpe-udo-package-generator. It contains all the information needed to specify all of a user's packages. The OpDefCollection contains the following:

<OpDefCollection
     xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
     xs:noNamespaceSchemaLocation="OpDef.xsd"
     PackageName="ExamplePackage"
     Domain="example"
     Version="1.0"
>

   <!--One OpDefList-->
   <OpDefList>
      <!--OpDefList defined above-->
   </OpDefList>

   <!--SupplementalOpDefLists are not required-->
   <SupplementalOpDefList Backend="HTP">
      <!--SupplementalOpDefList defined above-->
   </SupplementalOpDefList>

</OpDefCollection>
  • PackageName: Specifies the name for the user’s OpPackage. Because packages are per backend, the actual package name will be the value specified here appended with <Backend> e.g. MyPackageNameHtp.

  • Domain: Specifies the domain of the package.

  • Version: Specifies the version of the package.

  • OpDefList: One OpDefList specifying all operations for the package(s).

  • SupplementalOpDefList: Optional; specifies one or more SupplementalOpDefList to specify per-backend information.

One OpDefCollection element can be used to produce multiple per-backend packages.
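The following Python sketch shows one way to sanity-check the top-level structure and list the per-backend package names it implies. Several points are assumptions made for the example: that PackageName, Domain, and Version are all required, that backends are taken only from the SupplementalOpDefList elements, and that the backend suffix is capitalized as in the single documented example (MyPackageNameHtp). The helper describe_collection is hypothetical and the OpDef is abbreviated.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical collection; the schema attributes are omitted.
COLLECTION_XML = """
<OpDefCollection PackageName="ExamplePackage" Domain="example" Version="1.0">
    <OpDefList>
        <OpDef><Name>OpA</Name></OpDef>
    </OpDefList>
    <SupplementalOpDefList Backend="HTP"/>
</OpDefCollection>
"""


def describe_collection(xml_text):
    """Check the root element and derive per-backend package names."""
    root = ET.fromstring(xml_text)
    if root.tag != "OpDefCollection":
        raise ValueError("root element must be OpDefCollection")
    missing = [a for a in ("PackageName", "Domain", "Version") if root.get(a) is None]
    if missing:
        raise ValueError(f"missing attributes: {missing}")
    if len(root.findall("OpDefList")) != 1:
        raise ValueError("expected exactly one OpDefList")
    backends = [s.get("Backend") for s in root.findall("SupplementalOpDefList")]
    # Assumption: suffix casing follows the documented example, "HTP" -> "Htp".
    packages = [root.get("PackageName") + b.capitalize() for b in backends]
    return {"version": root.get("Version"), "packages": packages}


print(describe_collection(COLLECTION_XML))
```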

The JSON UDO Config Description

The structure of the JSON UDO config file is shown below.

{
    "UdoPackage_0":
    {
        "Operators": [
            {
                "type": "",
                "inputs":[
                    {"name":"", "per_core_data_types":{"CPU":"FLOAT_32", "GPU":"FLOAT_32", "DSP":"UINT_8"},
                    "static": true, "tensor_layout": "NHWC"},
                    {"name":"", "data_type": "FLOAT_32",
                    "static": true, "tensor_layout": "NHWC"}
                ],
                "outputs":[
                    {"name":"", "per_core_data_types":{"CPU":"FLOAT_32", "GPU":"FLOAT_32", "DSP":"UINT_8"}},
                    {"name":"", "data_type": "FLOAT_32"}
                ],
                "scalar_params": [
                    {"name":"scalar_param_1", "data_type": "INT_32"}
                ],
                "tensor_params": [
                    {"name":"tensor_param_1", "data_type": "FLOAT_32", "tensor_layout": "NHWC"}
                ],
                "core_types": ["CPU", "GPU", "DSP"],
                "dsp_arch_types": ["v66", "v68", "v69", "v73"]
            }
        ],
        "UDO_PACKAGE_NAME": "MyCustomUdoPackage"
    }
}

The above is a generic configuration file intended to illustrate the fields that the user can fill in. Required fields are shown with a specific value, while optional fields are denoted with empty strings. Note that an optional field only implies that there is either a default value if it is not provided, or that an empty string will be used. The full details of each available field are described hierarchically below:

  • UdoPackage: Every UDO package can be described as “UdoPackage_i”, where i indicates the order in which the packages will be generated. The user is also free to use empty strings, but the dictionary structure is necessary (see Note 1).

  • Operators: This is a child node of a particular UdoPackage that lists the operators it contains (see Note 5).

    • type: defines the type of the operation.

    • inputs: a list of input tensors to the operation. Each input is a dictionary object (see Note 2).

      • name: An optional field that describes the name of the input tensor. Since the name of the input tensor is variable, the user is not required to provide this.

      • per_core_data_types: A dictionary object specifying the data-type of this input tensor on each core. Alternatively, if the same data-type applies across all specified cores, the user can specify the option “data_type” followed by the data-type. The supported data-types are:

        • FLOAT_16

        • FLOAT_32

        • FIXED_4

        • FIXED_8

        • FIXED_16

        • UINT_8

        • UINT_16

        • UINT_32

        • STRING

      • static: A boolean field that is required if the input data is static, i.e., the data is provided in the model. This field must be set if the input tensor contains data; otherwise the input is treated as dynamic and the data is not serialized.

      • tensor_layout: A string field that describes the canonical dimension format of the input tensor (see Note 4). The supported values are:

        • NCHW

        • NHWC

    • outputs: A list of output tensors to the operation (see Note 2).

    • scalar_params: A list of scalar-valued attributes (see Note 3).

      • name: A required field that describes the name of the scalar parameter.

      • data_type: A required field that describes the data-type supported by this scalar parameter.

    • tensor_params: A list of tensor-valued attributes (see Notes 2 and 3).

    • core_types: The intended IP cores for this particular operation. The supported core_types:

      • CPU

      • GPU

      • DSP

    • dsp_arch_types: The intended DSP architecture types for DSP core type. The supported dsp_arch_types:

      • v65

      • v66

      • v68

      • v69

      • v73

  • UDO_PACKAGE_NAME: The name of the UDO Package, which can be any valid string (see Note 1).
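As an informal aid, the field descriptions above can be turned into a small pre-flight check. The Python sketch below loads a JSON UDO config with the standard json module and reports fields that fall outside the documented values. The operator type, parameter name, and the helper check_config are hypothetical, and the allowed sets are copied from the lists in this section, which may be incomplete (the sample config itself uses INT_32); the framework converters remain the authority on what is accepted.

```python
import json

# Hypothetical config following the structure described above.
CONFIG_TEXT = """
{
    "UdoPackage_0": {
        "Operators": [
            {
                "type": "MyCustomOp",
                "inputs": [
                    {"name": "",
                     "per_core_data_types": {"CPU": "FLOAT_32", "DSP": "UINT_8"},
                     "tensor_layout": "NHWC"}
                ],
                "outputs": [{"name": "", "data_type": "FLOAT_32"}],
                "scalar_params": [{"name": "alpha", "data_type": "FLOAT_32"}],
                "core_types": ["CPU", "DSP"],
                "dsp_arch_types": ["v68"]
            }
        ],
        "UDO_PACKAGE_NAME": "MyCustomUdoPackage"
    }
}
"""

# Copied from the lists in this section; adjust if the SDK accepts more values.
DATA_TYPES = {"FLOAT_16", "FLOAT_32", "FIXED_4", "FIXED_8", "FIXED_16",
              "UINT_8", "UINT_16", "UINT_32", "STRING"}
CORE_TYPES = {"CPU", "GPU", "DSP"}
DSP_ARCH_TYPES = {"v65", "v66", "v68", "v69", "v73"}


def check_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for package in config.values():
        for op in package.get("Operators", []):
            op_type = op.get("type", "")
            for core in op.get("core_types", []):
                if core not in CORE_TYPES:
                    problems.append(f"{op_type}: unknown core type {core}")
            for arch in op.get("dsp_arch_types", []):
                if arch not in DSP_ARCH_TYPES:
                    problems.append(f"{op_type}: unknown DSP arch {arch}")
            if not op.get("inputs") or not op.get("outputs"):
                problems.append(f"{op_type}: needs at least one input and output")
            params = op.get("scalar_params", []) + op.get("tensor_params", [])
            for param in params:
                if not param.get("name"):
                    problems.append(f"{op_type}: parameter without a name")
            for tensor in op.get("inputs", []) + op.get("outputs", []) + params:
                if "per_core_data_types" in tensor:
                    types = list(tensor["per_core_data_types"].values())
                elif "data_type" in tensor:
                    types = [tensor["data_type"]]
                else:
                    problems.append(f"{op_type}: tensor without a data type")
                    continue
                for data_type in types:
                    if data_type not in DATA_TYPES:
                        problems.append(f"{op_type}: unknown data type {data_type}")
    return problems


print(check_config(json.loads(CONFIG_TEXT)))
```

Because json.loads is strict, a check like this also surfaces syntax problems such as trailing commas before the config reaches a converter.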

Creating a UDO config

The user should aim to fill out the fields described in the config above to adequately describe a UDO. In some cases, the information required in this config can be obtained easily from framework documentation about the operation. However, there may be subtle caveats, so the user is encouraged to ensure that inputs, outputs and params are properly categorized and described. One such caveat is that inputs can be mis-categorized as parameters, and vice versa, if the config is written according to documentation alone. In this scenario, a useful tip is to visualize the model using an open source tool such as Netron (found here: https://github.com/lutzroeder/netron) to assist with crafting the UDO config correctly.

Once an adequately descriptive config has been created, it can be used as an argument to the framework converters as described in Preparing a model with UDO.

Notes:

  1. More than one UDO package can be defined in a single config file. Users should note that the package name specified here must match the package name used in creating the corresponding package.

  2. Each input, output and tensor parameter is categorized as the same kind of tensor object, meaning that all the fields are shared. The names of inputs and outputs are not required since the config is a generic description of an operation. An operator must have at least one input and output.

  3. In the case of the parameters, the name field is always required.

  4. The tensor layout is a convention to indicate the arrangement of data within the tensor. Therefore a tensor layout of NHWC means that the data is organized in (batch x height x width x channel), where channel is the fastest changing dimension. Note this is the default arrangement for Qualcomm® Neural Processing SDK, and that may have implications on a model containing a UDO if other tensor layouts are selected. Notably, if a tensor layout of NCHW is selected, then the data and/or tensor parameters may need to be reshaped to the Qualcomm® Neural Processing SDK default to maintain dimensional understanding. Should the user encounter this scenario, they may notice the introduction of intermediary permute layers prior to the UDO layer which will ultimately feed the tensors in question. These caveats should be visible as either converter warnings, debug messages or through outputs of the visualization tools described in Tools. For more details on tensor layouts, the user can consult the section: Input Image Formatting of the documentation.

  5. For the CPU, GPU and DSP core types, an arbitrary number of operators can be defined per UDO package. However, the provided skeleton code is tailored to define only one operation per package. One subtle distinction is that the generated DSP V65 or DSP V66 implementation source code expects one operation per implementation library, whereas in the CPU, GPU, and DSP V68 or later cases a library may contain an arbitrary number of operations.

  6. The data-type of a tensor determines both how the data contained in the tensor will be stored in the DLC, and the type of memory handed over by Qualcomm® Neural Processing SDK during runtime execution. While tensors get stored within the DLC in the exact data-type specified by the UDO definition, there may be runtime restrictions on the type of memory users can expect to receive depending on the chosen core-type. Users should visit the following section: Compiling a UDO package for more details.

  7. Operations with an unknown number of inputs/outputs are currently not supported on the DSP and HTP backends.

  8. Constraints are purely a descriptive field and do not require mathematical expression. Constraints are currently not enforced.

  9. Outputs are not mandatory, but cannot have default values and are assumed to be NULL if not provided.
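The layout convention in Note 4 can be made concrete with a small, SDK-independent Python sketch. It reorders a flat NCHW buffer into NHWC order, where channel is the fastest changing dimension, which is the kind of rearrangement a permute layer performs. The function names are hypothetical and the code is meant only to illustrate the index arithmetic, not how the SDK implements the permutation.

```python
def nchw_to_nhwc_shape(shape_nchw):
    """Reorder a 4-element NCHW shape into NHWC order."""
    n, c, h, w = shape_nchw
    return (n, h, w, c)


def nchw_to_nhwc(data, shape_nchw):
    """Return a copy of a flat NCHW buffer rearranged into NHWC order."""
    num_n, num_c, num_h, num_w = shape_nchw
    out = [None] * len(data)
    for n in range(num_n):
        for c in range(num_c):
            for h in range(num_h):
                for w in range(num_w):
                    # Row-major offset in the source (NCHW) buffer.
                    src = ((n * num_c + c) * num_h + h) * num_w + w
                    # Row-major offset in the destination (NHWC) buffer.
                    dst = ((n * num_h + h) * num_w + w) * num_c + c
                    out[dst] = data[src]
    return out


# One batch, two channels, one row, two columns.
# NCHW order stores channel 0 first: [c0w0, c0w1, c1w0, c1w1].
nchw = [0, 1, 10, 11]
print(nchw_to_nhwc_shape((1, 2, 1, 2)))
print(nchw_to_nhwc(nchw, (1, 2, 1, 2)))
```

In the NHWC result, the two channel values for each pixel sit next to each other, which is what "channel is the fastest changing dimension" means in practice.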