QNN Operation Definitions

This document defines operations supported by QNN. Per-backend specific information is referenced as needed.

ArgbToRgb

Transforms ARGB or RGBA input to RGB. Refer to the input_order and reverse_output parameters below for control of the input/output channel order.

Inputs

in[0]

Input tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: [b,h,w,4]

Parameters

input_order

Controls the channel order of the input tensor. If set to QNN_OP_ARGB_TO_RGB_INPUT_ORDER_ARGB, the input order is ARGB; if set to QNN_OP_ARGB_TO_RGB_INPUT_ORDER_RGBA, the input order is RGBA.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • ARGB = 0,

    • RGBA = 1

reverse_output

Controls the channel order of the output tensor. When set to false, the order of the input tensor is maintained in the output tensor. When set to true, the order of the last 3 channels of the input tensor is reversed in the output tensor.

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

Outputs

out[0]

Output tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: [b,h,w,3]

  • Constraints:

    • Datatype: Same datatype as in[0]
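
As an illustrative sketch (not QNN API code), the per-pixel behavior can be expressed in plain Python; the function name and list-based pixel representation are assumptions for illustration:

```python
def argb_to_rgb(pixel, input_order="ARGB", reverse_output=False):
    """Convert one 4-channel pixel to 3 channels.

    pixel: list of 4 channel values, ordered per input_order.
    input_order: "ARGB" (alpha first) or "RGBA" (alpha last).
    reverse_output: if True, reverse the last 3 channels in the output.
    """
    # Drop the alpha channel according to the input order.
    rgb = pixel[1:] if input_order == "ARGB" else pixel[:3]
    # reverse_output reverses the order of the remaining 3 channels.
    return rgb[::-1] if reverse_output else rgb
```

The same selection and reversal apply elementwise across the full [b,h,w,4] tensor.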

Argmax

Returns the index of the largest element along an axis.

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: n-dimensional

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

Axis along which to reduce.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be in range [0, n-1]

keep_dims

If true, the output retains the reduced dimension with size 1, so it has the same number of dimensions as the input tensor; if false, the reduced dimension is removed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32, backend specific

  • Shape: m-dimensional, where m = n if keep_dims is true and m = n - 1 otherwise.
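
The axis and keep_dims semantics can be sketched in plain Python for a rank-2 input (the function name and nested-list representation are illustrative assumptions; ties resolve to the lowest index here):

```python
def argmax2d(x, axis, keep_dims=False):
    """Index of the largest element of a rank-2 nested list along axis 0 or 1."""
    if axis == 0:
        # Reduce down each column.
        res = [max(range(len(x)), key=lambda r: x[r][c]) for c in range(len(x[0]))]
        return [res] if keep_dims else res
    # axis == 1: reduce across each row.
    res = [max(range(len(row)), key=lambda c: row[c]) for row in x]
    return [[v] for v in res] if keep_dims else res
```

With keep_dims the reduced axis is retained with size 1, matching the m = n output-rank case.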

Argmin

Returns the index of the smallest element along an axis.

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: n-dimensional

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

Axis along which to reduce.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be in range [0, n-1]

keep_dims

If true, the output retains the reduced dimension with size 1, so it has the same number of dimensions as the input tensor; if false, the reduced dimension is removed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32, backend specific

  • Shape: m-dimensional, where m = n if keep_dims is true and m = n - 1 otherwise.

AxisAlignedBboxTransform

Transform axis-aligned bounding box proposals into refined bounding boxes using bounding box regression deltas for each class.

Notes:

  • Bounding boxes are aligned to the image coordinate system (i.e. not rotated).

  • Axis-aligned bounding boxes are defined by the upper-left corner (x1, y1) and lower-right corner (x2, y2).

  • Valid bounding boxes are such that x1 <= x2 and y1 <= y2.

  • Resulting bounding boxes are clipped against the edges of the image.

  • The number of regions of interest (num_rois) is in the range [0, N] where N is the maximum set by the underlying tensor.

Inputs

in[0]

Bounding box proposal locations

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, 4] each line with format [x1, y1, x2, y2]

in[1]

Bounding box deltas for each region of interest and each class

Bounding box deltas are organized in the following order [dx, dy, dw, dh] where:

  • dx and dy are the relative correction factors for the center position of the bounding box

  • dw and dh are the log-scale relative correction factors for the width and height of the bounding box

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, num_classes * 4]

in[2]

Batch index of each bounding box. Boxes with the same batch index are grouped together.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_rois]

in[3]

Specifies image size. Image size is the same for all images in the batch.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batches, 2] with format [image_height, image_width] per batch

Parameters

weights

Weights applied to each of the bounding boxes deltas in in[1].

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: [4] with format [wx, wy, ww, wh]

  • Default: [1.0, 1.0, 1.0, 1.0]

Outputs

out[0]

Coordinates of refined bounding boxes

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, num_classes * 4] with format [x1, y1, x2, y2]

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

batch_splits : Specifies the number of RoIs/boxes belonging to the corresponding image in the batch. Note that the values must sum to num_rois.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [batches]
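
The delta decoding described above follows the usual box-regression convention; a plain-Python sketch for a single box (the function name and the clipping helper shown are illustrative assumptions):

```python
import math

def apply_box_deltas(box, deltas, weights=(1.0, 1.0, 1.0, 1.0), image_hw=None):
    """Refine one axis-aligned box [x1, y1, x2, y2] with deltas [dx, dy, dw, dh]."""
    x1, y1, x2, y2 = box
    dx, dy, dw, dh = deltas
    wx, wy, ww, wh = weights
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    # dx, dy correct the center; dw, dh are log-scale width/height corrections.
    pcx = cx + (dx / wx) * w
    pcy = cy + (dy / wy) * h
    pw = math.exp(dw / ww) * w
    ph = math.exp(dh / wh) * h
    out = [pcx - 0.5 * pw, pcy - 0.5 * ph, pcx + 0.5 * pw, pcy + 0.5 * ph]
    if image_hw is not None:
        ih, iw = image_hw
        # Clip the refined box against the image edges.
        out = [min(max(out[0], 0.0), iw), min(max(out[1], 0.0), ih),
               min(max(out[2], 0.0), iw), min(max(out[3], 0.0), ih)]
    return out
```

Zero deltas with unit weights leave the box unchanged, and out-of-image results are clipped as noted above.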

Batchnorm

Normalizes the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.

See Batchnorm backend definition for supported datatypes and constraints per backend.

Inputs

in[0]

Input tensor from the previous operator

  • Mandatory: true

  • Data type: backend specific

  • Shape: n-dimensional; note that the last dimension of the input is the channel: […, channel].

  • Constraints:

    • Shape: Rank > 0

in[1]

Weights

  • Mandatory: true

  • Data type: backend specific

  • Shape: [channel]

in[2]

Biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel]

  • Default: {0,..,0}

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: n-dimensional

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

    • Must have same data format as in[0] (e.g. both sparse or both dense)
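
At inference time this transformation reduces to a per-channel scale and shift; a sketch of how learned statistics can be folded into the weights (in[1]) and biases (in[2]) — the function names and epsilon parameter are illustrative assumptions:

```python
import math

def fold_batchnorm(gamma, beta, mean, variance, epsilon=1e-5):
    """Fold per-channel statistics into weight/bias pairs: y = x * w + b."""
    weights = [g / math.sqrt(v + epsilon) for g, v in zip(gamma, variance)]
    biases = [b - m * w for b, m, w in zip(beta, mean, weights)]
    return weights, biases

def batchnorm(x_channels, weights, biases):
    """Apply the folded transform to one [..., channel] slice."""
    return [x * w + b for x, w, b in zip(x_channels, weights, biases)]
```

Folding with gamma=2, beta=1, mean=3, variance=4 yields weight 1 and bias -2, so an input of 5 normalizes to 3, the same as (5 - 3) / 2 * 2 + 1.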

BatchPermutation

Generates a batch permutation of the input in[0]. The output out[0] has the same shape as in[0], except for the batch dimension, which matches the shape of in[1]. Data is re-ordered according to the indices provided.

Example of batch permutation on a 3-D tensor with batch size 4:

input = [
[[1, 5], [3, 4]],
[[4, 3], [5, 2]],
[[2, 2], [6, 0]],
[[0, 0], [1, 2]]
]

indices = [2, 0, 1, 3]

output = [
[[2, 2], [6, 0]],
[[1, 5], [3, 4]],
[[4, 3], [5, 2]],
[[0, 0], [1, 2]]
]

Inputs

in[0]

Input tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: tensor of rank N where dim[0] equals batch

  • Constraints:

    • Shape: Rank > 0

in[1]

indices : indices of the batches to permute. Valid index values are in the range [0, batch - 1]; out-of-range values are ignored.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M], where M <= batch

  • Constraints:

    • Value: Valid index values must be unique.

Parameters

None

Outputs

out[0]

Output permuted tensor. Note that out[0] is ordered according to the provided indices that fall in the range [0, batch - 1], and is then padded with zeros.

  • Mandatory: true

  • Data type: backend specific

  • Shape: tensor of rank N where shape is the same as in[0] with the exception of dim[0] which equals M.
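
The worked example above can be reproduced with a small plain-Python sketch (list-of-lists tensors and the function name are illustrative assumptions):

```python
def batch_permutation(x, indices):
    """Reorder the batches of x by the valid indices, padding the tail with zeros."""
    batch = len(x)
    picked = [x[i] for i in indices if 0 <= i < batch]  # invalid indices ignored

    def zeros_like(t):
        # Zero-filled structure with the same nesting as one batch element.
        return [zeros_like(e) for e in t] if isinstance(t, list) else 0

    # Pad so the output batch dimension equals len(indices), i.e. M.
    while len(picked) < len(indices):
        picked.append(zeros_like(x[0]))
    return picked
```

With the document's example input and indices [2, 0, 1, 3] this reproduces the example output exactly; an out-of-range index is dropped and its slot is zero-padded.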

BatchToSpace

A type of tensor realignment operation that rearranges data from the batch dimension into blocks of spatial data, followed by cropping.

The op moves blocks of data of size (block_size[0] * block_size[1]) from the batch dimension of the input tensor into the spatial dimensions of the output tensor followed by cropping along the spatial dimensions.

Inputs

in[0]

Input Activation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [batch, height, width, depth]

  • Constraints:

    • Shape: batch must be divisible by (block_size[0] * block_size[1])

Parameters

block_size

Vector that represents block size along the height and width dimensions respectively.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [block_height, block_width]

  • Constraints:

    • Value: Elements must be >=1

crops

Crop region that specifies how many elements to crop from the intermediate result across the spatial dimensions.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[crop_top, crop_bottom], [crop_left, crop_right]]

  • Default: [[0, 0], [0, 0]]

Outputs

out[0]

Output Activation.

Permuted output tensor with new spatial dimensions [output_height, output_width] defined by

output_height = (height * block_size[0]) - crop_top - crop_bottom
output_width = (width * block_size[1]) - crop_left - crop_right
  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [batch / (block_size[0] * block_size[1]), output_height, output_width, depth]

  • Constraints:

    • Datatype: Same datatype as in[0]
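
A plain-Python sketch of the rearrangement and cropping (the function name and nested-list layout are illustrative assumptions; the block-offset decomposition of the batch index follows the common BatchToSpace convention):

```python
def batch_to_space(x, block_size, crops=((0, 0), (0, 0))):
    """x: nested list of shape [batch, height, width, depth]."""
    bh, bw = block_size
    batch, h, w = len(x), len(x[0]), len(x[0][0])
    nb = batch // (bh * bw)          # output batch size
    H, W = h * bh, w * bw            # spatial size before cropping
    out = [[[None] * W for _ in range(H)] for _ in range(nb)]
    for b in range(batch):
        # Decompose the input batch index into block offsets (i, j) and output batch n.
        i, j, n = b // (bw * nb), (b // nb) % bw, b % nb
        for hh in range(h):
            for ww_ in range(w):
                out[n][hh * bh + i][ww_ * bw + j] = x[b][hh][ww_]
    (ct, cb), (cl, cr) = crops
    return [[row[cl:W - cr] for row in img[ct:H - cb]] for img in out]
```

For batch 4, spatial 1x1, block (2,2), the four batch elements tile into one 2x2 image, and the output dimensions match the output_height/output_width formulas above.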

BboxTransform

Transform bounding box proposals into refined bounding boxes using bounding box regression deltas for each class.

Notes:

  • Bounding boxes can be rotated.

  • Resulting bounding boxes are clipped against the edges of the image.

  • The number of regions of interest (num_rois) is in the range [0, N] where N is the maximum set by the underlying tensor.

Inputs

in[0]

Bounding box proposal locations.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, 5] each line with format [center_x, center_y, width, height, angle]

in[1]

Bounding box deltas for each region of interest and each class.

Bounding box deltas are organized in the following format [dx, dy, dw, dh, da].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, num_classes * 5]

in[2]

Specifies image size. Image size is same for all images in a batch.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batches, 3] with format [image_height, image_width, image_scale]

in[3]

Batch index of each bounding box.

Batches with the same index are grouped together.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_rois]

  • Default: {0,..,0}

Parameters

weights

Weights applied to each of the bounding boxes deltas in in[1] in the form of (wx, wy, ww, wh).

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: [4]

apply_scale

Set QNN_OP_BBOX_TRANSFORM_APPLY_SCALE to true to transform the boxes to the scaled image space after applying the bounding box deltas, or to false to leave the scale unapplied.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

angle_bounds

Limits the bounding box angle to be within the range [angle_bound_low, angle_bound_high].

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [2] with format [angle_bound_low, angle_bound_high]

  • Default: parameter not used unless set

angle_clip_threshold

Implements:

angle = (angle < max(angle_clip_threshold, 0.0)) ? (0.0) : (angle)

Set to negative to disable.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: -1.0

Outputs

out[0]

Coordinates of refined bounding boxes.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, num_classes * 5]

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: same as in[1]

out[1]

Number of RoIs per batch.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batches]

BoxWithNmsLimit

Greedily selects a subset of bounding boxes in descending order of score.

This op applies the NMS algorithm to each class. In each iteration, the box with the maximum score is selected and removed from the pending set. The scores of the remaining boxes are lowered according to their intersection-over-union (IoU) overlap with the previously selected boxes and the specified NMS kernel method. Any boxes with a score less than a threshold are removed from the pending set.

Three NMS kernels are supported:

Hard: score_new = score_old * (1 if IoU < threshold else 0)
Linear: score_new = score_old * (1 if IoU < threshold else 1 - IoU)
Gaussian: score_new = score_old * exp(- IoU^2 / sigma)

Axis-aligned bounding boxes are represented by their upper-left corner coordinate (x1,y1) and lower-right corner coordinate (x2,y2). A valid bounding box satisfies x1 <= x2 and y1 <= y2.
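
The three kernel updates can be written directly; a plain-Python sketch (the function name and the threshold/sigma defaults are illustrative assumptions):

```python
import math

def update_score(score, iou, method, iou_threshold=0.5, sigma=0.5):
    """Lower a pending box's score based on its IoU with a selected box."""
    if method == "hard":      # keep as-is below the threshold, else drop to 0
        return score if iou < iou_threshold else 0.0
    if method == "linear":    # keep as-is below the threshold, else scale by 1 - IoU
        return score if iou < iou_threshold else score * (1.0 - iou)
    if method == "gaussian":  # always decay by exp(-IoU^2 / sigma)
        return score * math.exp(-(iou ** 2) / sigma)
    raise ValueError("unknown NMS kernel method")
```

The hard kernel is classic NMS; linear and gaussian are the soft-NMS variants, and boxes whose updated score falls below nms_score_threshold are dropped.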

Inputs

in[0]

Bounding box proposals. Elements can be understood as 4-tuples of bounding box coordinates given in the form (x1,y1,x2,y2). Boxes pertaining to a given batch element are grouped consecutively.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, num_classes*4]

in[1]

Bounding box scores. The element at position [roi, class] can be understood as the score for the bounding box at the same position in in[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, num_classes]

in[2]

Bounding box batch indices. Specifies the batch index of each box. Boxes pertaining to a given batch element are grouped consecutively.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_rois]

in[3]

batch_splits : Specifies the number of RoIs/boxes belonging to the corresponding image in the batch. Note that the values must sum to num_rois.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [batch]

Parameters

nms_kernel_method

Determines the NMS kernel method, options are 0:hard, 1:linear, 2:gaussian.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Values:

    • hard = 0,

    • linear = 1,

    • gaussian = 2

nms_score_threshold

Boxes with scores lower than the threshold are dropped during the score updating phase in soft NMS.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

score_threshold

Boxes with scores lower than the threshold are filtered before sending to the NMS algorithm.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

pre_nms_limit

Specifies the maximum number of boxes per image that will be sent to NMS. Set to a negative value to apply no limit.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Default: -1

iou_threshold

Specifies the IoU threshold in hard and linear NMS kernel. This parameter is ignored if gaussian kernel is selected.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

sigma

Specifies the sigma value in gaussian NMS kernel. This parameter is ignored if the gaussian kernel is not selected.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

Outputs

out[0]

Output boxes. Each element can be understood as a 4-tuple with the same meaning as in[0]. Boxes are grouped by batch, but order within each batch is not guaranteed.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_output_rois, 4]

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

Output box scores. Gives the score for the box in the corresponding position in out[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_output_rois]

  • Constraints:

    • Datatype: Same datatype as in[1]

out[2]

Output box classes. Gives the class index (with respect to num_classes in in[0]) with the maximum score for the box in the corresponding position in out[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_output_rois]

out[3]

Output box batch indices : Gives the batch index for the box in the corresponding position in out[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_output_rois]

out[4]

Output batch splits : Specifies the number of RoIs/boxes belonging to the corresponding image in the batch after applying NMS. Note that the values must sum to num_output_rois.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [batch]

out[5]

keeps : contains the indices of the selected boxes after performing NMS. The values of the indices are in the order of the boxes in out[0] and correspond to index position of the selected boxes in in[0].

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_output_rois]

out[6]

keeps size : contains the number of selected boxes per class after NMS has been applied.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_classes]

Buffer

Accumulates input frames across inferences into a buffer of size buffer_size and outputs the buffer with the collected frames. When the buffer is full, the oldest existing frames are removed to make space for incoming new frames. The number of frames to remove is determined by stride. The remaining frames are shifted in the buffer to maintain the order in which they were received.

Inputs

in[0]

input activation. Note that Shape(in[0])[buffer_dim] determines the number of frames in the input.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

reset : Determines if the buffer should be reset. When set to true all frames in the buffer are removed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: 0D containing scalar value

  • Default: 0

Parameters

buffer_size

Determines the number of frames that a buffer can store.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be evenly divisible by Shape(in[0])[buffer_dim].

buffer_dim

Determines the dimension that frames are accumulated on.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be in range [0, N-1]

stride

Determines the number of frames to remove from the buffer when the buffer is full to make space for the new incoming frames. The oldest existing frames which reside at the beginning of the buffer are removed. After removal the remaining frames are kept in the order they were received.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

  • Constraints:

    • Value: Must be in range [Shape(in[0])[buffer_dim], buffer_size].

    • Value: Must be evenly divisible by Shape(in[0])[buffer_dim].

mode

Determines blocking behavior. How the buffer is populated differs between the modes when the buffer is not full. When the buffer is full, eviction and population behavior is the same for all modes. BLOCKING: 0, NON_BLOCKING_LEFT: 1, NON_BLOCKING_RIGHT: 2.

When mode is set to BLOCKING: Execution is stopped on the existing branch of the graph if the buffer is not full. The buffer is populated from the beginning to the end. For example, an empty buffer with 3 slots (0,1,2) will be populated from slot 0 to slot 2.

When mode is set to NON_BLOCKING_LEFT: The existing branch of the graph always executes regardless of whether the buffer is full. The buffer is populated from the beginning to the end. For example, an empty buffer with 3 slots (0,1,2) will be populated from slot 0 to slot 2.

When mode is set to NON_BLOCKING_RIGHT: The existing branch of the graph always executes regardless of whether the buffer is full. The buffer is populated from the end. For example, in an empty buffer with 3 slots (0,1,2), the first incoming frame is placed at slot 2. When the next frame arrives, the previous frame at slot 2 moves to slot 1 and the new frame is placed at slot 2.

When the buffer is full, the number of frames removed is determined by stride. Eviction behavior is the same for all modes: the oldest existing frames are removed from the beginning of the buffer. Population behavior is also the same for all modes when the buffer is full. For example, a fully populated buffer with 3 slots (0,1,2) and a stride value of 2 will have the frames at slots 0 and 1 removed, and the frame at slot 2 will move to slot 0. The incoming frame is then placed at slot 1, and the next incoming frame at slot 2. When the buffer is full again, the same process is repeated.

Note that with NON_BLOCKING_LEFT and NON_BLOCKING_RIGHT modes, the output is zero-filled if the buffer is not completely full.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • BLOCKING = 0,

    • NON_BLOCKING_LEFT = 1,

    • NON_BLOCKING_RIGHT = 2

buffer_padding

Determines the number of frames to pad with 0s in the buffer initially or after a reset.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Same shape as in[0] except where dim[buffer_dim] is equal to buffer_size.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same rank as in[0]
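
The full-buffer eviction example described under mode (3 slots, stride 2) can be reproduced with a minimal plain-Python sketch of a BLOCKING-style buffer accepting one frame per call (names are illustrative assumptions):

```python
def buffer_push(buf, frame, buffer_size, stride=1):
    """Append one frame, evicting the `stride` oldest frames when full."""
    if len(buf) >= buffer_size:
        buf = buf[stride:]      # oldest frames live at the front of the buffer
    return buf + [frame]        # remaining frames keep their arrival order
```

With buffer_size 3 and stride 2, pushing into a full buffer [f0, f1, f2] removes f0 and f1, shifts f2 to slot 0, and places the new frame at slot 1, as described above.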

Cast

Casts tensor data type to a new data type. This operation ignores quantization parameters specified with Qnn_QuantizeParams_t for tensors of fixed point data types, e.g. it treats a QNN_DATATYPE_UFIXED_POINT_8 tensor data type as a tensor of QNN_DATATYPE_UINT_8 data type.

Refer to Cast backend definition for support of dynamic dimensions for each backend. Backends do not support dynamic dimensions unless stated otherwise.

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.

ChannelShuffle

This operation shuffles the channels of the input tensor by dividing the channel dimension into num_groups groups and reorganizing the channels so that channels with the same index in each group are grouped together.

Along the channel dimension, the output is calculated using this formula:

output_channel[i * num_groups + g] = input_channel[g * group_size + i]

where

num_channels = shape(in[0])[channel]
group_size = num_channels / num_groups
g is a group index : [0, num_groups-1]
i is index within the group : [0, group_size-1]

num_channels must be evenly divisible by num_groups. Setting num_groups = num_channels results in a no-op.

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

num_groups

Number of groups to divide the channel dimension into; must be <= num_channels.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

axis

Axis on which channel shuffle will be performed

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: N-1

  • Constraints:

    • Value: Must be in range [0,N-1]

Outputs

out[0]

output tensor with shuffled channels

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same rank as in[0]
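
The channel-index formula above can be checked with a plain-Python sketch over a single channel vector (the function name is an illustrative assumption):

```python
def channel_shuffle(channels, num_groups):
    """Reorder channels per output[i * num_groups + g] = input[g * group_size + i]."""
    num_channels = len(channels)
    assert num_channels % num_groups == 0, "num_channels must divide evenly"
    group_size = num_channels // num_groups
    out = [None] * num_channels
    for g in range(num_groups):
        for i in range(group_size):
            out[i * num_groups + g] = channels[g * group_size + i]
    return out
```

Six channels in two groups interleave as [0, 3, 1, 4, 2, 5], and num_groups = num_channels reproduces the input, matching the no-op note above.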

Col2Im

Rearranges batched columns, each representing a snapshot over a kernel of an image, back into an image. The overlapping values in the resulting image are summed.

Inputs

in[0]

Input tensor of batched columns

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, channel * height_kernel * width_kernel, L], where L is the number of columns

  • Constraints:

    • Shape: The value of L must equal \(\prod_i \text{floor}\left(\frac{spatial\_dimensions[i] + 2 * padding[i] - dilation[i] * (kernel\_size[i] - 1) - 1}{stride[i]} + 1\right)\), with spatial_dimensions being the height and width of the output image.

Parameters

kernel_size

The size of the sliding block for retrieving the image.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_kernel, width_kernel]

stride

Defines stride for 2D spatial (i.e. height and width) axes of the output image.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Paddings along the height and width dimensions of the output image.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[pad_top, pad_bottom], [pad_left, pad_right]]

dilation

Dilation value along each spatial axis (i.e. height and width) of the kernel for retrieving the image.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_dilation, width_dilation]

  • Default: [1, 1]

  • Constraints:

    • Value: Dilations must be > 0

Outputs

out[0]

Output tensor describing the shape of the resulting image

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]
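
The shape constraint on L can be evaluated directly; a plain-Python sketch assuming symmetric per-axis padding, as the 2 * padding[i] term in the formula implies (the function name is an illustrative assumption):

```python
def num_columns(spatial_dims, kernel_size, stride, padding, dilation=(1, 1)):
    """Number of columns L implied by the output image's spatial dimensions."""
    L = 1
    for i in range(2):  # i = 0 is height, i = 1 is width
        span = spatial_dims[i] + 2 * padding[i] - dilation[i] * (kernel_size[i] - 1) - 1
        L *= span // stride[i] + 1
    return L
```

For a 4x4 output image, a 2x2 kernel, stride 1, and no padding, the kernel fits at 3x3 positions, so L = 9.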

CollectRpnProposals

Collect RoIs and their scores, merge predictions across multiple FPN levels, and retain the top scoring RoIs. Note that RoI elements can be understood as 5-tuples with format (batch_idx, x1, y1, x2, y2) where batch_idx specifies the batch index of each RoI, (x1,y1) represents the upper-left corner coordinate, and (x2,y2) represents the lower-right corner coordinate.

Inputs

in[0]

RoIs : RPN proposals for FPN level rpn_min_level.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, 5] each line with format [batch_idx, x1, y1, x2, y2]

in[1..4]

RoIs : RPN proposals for FPN levels [rpn_min_level + 1, rpn_max_level]. Note that each input can have a different shape, since num_rois_i can vary across in[1..4].

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: false

  • Data type: backend specific

  • Shape: [num_rois_i, 5] each line with format [batch_idx, x1, y1, x2, y2]

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Number: Number of in[1..4] provided must equal rpn_max_level - rpn_min_level

in[5]

RoI probabilities : RPN scores for FPN level rpn_min_level.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois]

in[6..9]

RoI probabilities : RPN scores for FPN levels [rpn_min_level + 1, rpn_max_level]. Note that each input can have a different shape, since num_rois_i can vary across in[6..9], but it must match the corresponding RoIs input in[1..4] for the same FPN level.

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: false

  • Data type: backend specific

  • Shape: [num_rois_i]

  • Constraints:

    • Datatype: Same datatype as in[5]

    • Shape: shape(in[i])[0] must equal shape(in[i+5])[0] for i=1..4

    • Number: Number of in[6..9] provided must equal rpn_max_level - rpn_min_level

Parameters

rpn_min_level

Sets the minimum FPN level to support RPN transform operations on multiple FPN levels

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 2

  • Constraints:

    • Value: Must be in range [2,6]

rpn_max_level

Sets the maximum FPN level to support RPN transform operations on multiple FPN levels.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 6

  • Constraints:

    • Value: Must be in range [2,6]

    • Value: Must be >= rpn_min_level

post_nms_top

Sets a maximum number of proposals. The proposals with the lowest scores will be dropped to achieve this limit.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 2000

Outputs

out[0]

RoIs : Top proposals limited to post_nms_top total.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_proposals, 5] each line with format [batch_idx, x1, y1, x2, y2], where total_num_rois is the sum of all num_rois from in[0..4] and num_proposals = min(total_num_rois, post_nms_top).
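
Merging levels and retaining the top-scoring RoIs can be sketched in plain Python (the function name is an illustrative assumption; the tie-breaking order among equal scores is not specified by this op):

```python
def collect_rpn_proposals(roi_levels, score_levels, post_nms_top=2000):
    """Concatenate RoIs/scores across FPN levels and keep the post_nms_top best."""
    rois = [r for level in roi_levels for r in level]
    scores = [s for level in score_levels for s in level]
    # Rank all merged proposals by score, highest first.
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return [rois[k] for k in order[:post_nms_top]]
```

Proposals beyond the post_nms_top limit, i.e. those with the lowest scores, are dropped.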

CombinedNms

Performs Non Max Suppression (NMS) on the inputs per batch, across all classes. Boxes that have a high intersection-over-union (IoU) overlap with previously selected boxes are pruned. Additionally, bounding boxes with scores less than score_threshold are pruned. NMS can be either class-specific or class-agnostic depending on the shape of in[0]. In the class-specific case, NMS is performed independently for each class; in the class-agnostic case, NMS ignores all class labels and compares all bounding boxes.

Inputs

in[0]

Bounding boxes: Elements can be understood as 4-tuples of bounding box coordinates given in the form (y1,x1,y2,x2), where (y1, x1) and (y2, x2) represent a diagonal pair of corners. Coordinates can be provided as normalized or absolute. Note that if q is equal to num_classes, class-specific boxes are used; if q is equal to 1, the same boxes are used for all classes.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_boxes, q, 4]

  • Constraints:

    • Shape: q must be equal to 1 or num_classes

in[1]

Bounding box scores. The element at position [batch, box, class] is the score corresponding to class for the bounding box at position [batch, box] in in[0] (at class index class when q = num_classes, or at index 0 when q = 1).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_boxes, num_classes]

Parameters

max_boxes_per_class

Maximum number of boxes that can be selected by NMS per class.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

max_total_boxes

Maximum number of boxes to be retained over all classes after applying NMS.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

iou_threshold

Represents the threshold used by NMS algorithm to determine whether boxes overlap too much.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be in range [0, 1]

score_threshold

Boxes with scores lower than the threshold are filtered out by the NMS algorithm.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

pad_per_class

If true, the output boxes, scores, and classes are padded to the size of min(max_boxes_per_class * num_classes, max_total_boxes). Otherwise, they are padded to the size of max_total_boxes.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

clip_boxes

If true, the coordinates of selected output boxes in out[0] are clipped to [0, 1]. Otherwise, no clipping is done.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Selected Output boxes. Each element can be understood as a 4-tuple with the same meaning as in[0]. Note that the number of valid boxes is specified by out[3].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, max_num_boxes, 4], where max_num_boxes = min(max_boxes_per_class * num_classes, max_total_boxes) if pad_per_class is true. Otherwise, max_num_boxes = max_total_boxes.

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

Selected Output box scores. Gives the score for the box in the corresponding position in out[0]. Note that the number of valid scores is specified by out[3].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, max_num_boxes], where max_num_boxes = min(max_boxes_per_class * num_classes, max_total_boxes) if pad_per_class is true. Otherwise, max_num_boxes = max_total_boxes.

  • Constraints:

    • Datatype: Same datatype as in[1]

out[2]

Selected Output classes. Gives the class label for the box in the corresponding position in out[0]. Note that the number of valid classes is specified by out[3].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, max_num_boxes], where max_num_boxes = min(max_boxes_per_class * num_classes, max_total_boxes) if pad_per_class is true. Otherwise, max_num_boxes = max_total_boxes.

out[3]

Number of valid boxes per batch element that remain after NMS. Only the top entries of the output boxes, scores, and classes are valid. All other entries are zero padding.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [batch]

  • Constraints:

    • Value: <= max_num_boxes
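
The box-overlap test at the core of NMS, and the output padding size used by out[0..2], can be sketched in Python. This is illustrative only: it assumes corner-format (y1, x1, y2, x2) boxes, and the helper names are hypothetical; the backend kernel's exact arithmetic may differ.

```python
def iou(a, b):
    """Intersection-over-union of two corner-format boxes (y1, x1, y2, x2).

    Hypothetical helper showing the overlap test NMS compares against
    iou_threshold.
    """
    y1, x1 = max(a[0], b[0]), max(a[1], b[1])
    y2, x2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, y2 - y1) * max(0.0, x2 - x1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def max_num_boxes(max_boxes_per_class, num_classes, max_total_boxes, pad_per_class):
    # Output padding size as defined for out[0..2] above.
    if pad_per_class:
        return min(max_boxes_per_class * num_classes, max_total_boxes)
    return max_total_boxes
```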

Concat

Concatenates two or more input tensors along a provided axis.

Refer to Concat backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0..m]

Input tensors, where m >= 1.

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Number: m >= 1

    • Shape: Rank > 0

    • Shape: shape(in[0..m])[i] must be the same for all inputs, except at shape(in[0..m])[axis], which is permitted to vary across inputs.

    • Dynamic Shape: For any input, if shape(in[0..m])[i] is dynamic then shape(in[0..m])[i] must be dynamic across all inputs, except at the axis dimension.

Parameters

axis

Axis on which to concatenate input tensors

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: N-1

  • Constraints:

    • Value: Must be in range [0,N-1]

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Same rank as in[0]

    • Shape: Same shape as in[0], except shape(out[0])[axis] = sum(i, shape(in[i])[axis])

    • Dynamic Shape: If a dynamic dimension exists at shape(in[0..m])[axis] for any input then shape(out[0])[axis] must be dynamic. For all other dimensions, if shape(in[0..m])[i] is dynamic across all inputs then shape(out[0])[i] must be dynamic.
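
The output-shape rule above can be checked with a small helper (`concat_shape` is a hypothetical name used here for illustration; axis defaults to N-1, matching the parameter default):

```python
def concat_shape(shapes, axis=None):
    """Compute the out[0] shape for Concat from a list of input shapes,
    all of the same rank. Dimensions must match except at `axis`."""
    rank = len(shapes[0])
    if axis is None:
        axis = rank - 1  # default: N-1
    for s in shapes:
        assert len(s) == rank
        for i in range(rank):
            assert i == axis or s[i] == shapes[0][i]
    out = list(shapes[0])
    out[axis] = sum(s[axis] for s in shapes)
    return out
```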

ConstantOfShape

Generates an output tensor with the given shape and value.

References:

Inputs

in[0]

Input tensor : a 1D tensor specifying the shape of the expected output tensor.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [N]

Parameters

value

The value to fill the output tensor with.

  • Mandatory: true

  • Data type: backend specific

  • Shape: scalar

Outputs

out[0]

Output tensor with shape specified by in[0] and value specified by value.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as value.
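
A minimal sketch of the semantics, using nested lists in place of tensors (in[0] supplies the shape, the `value` parameter supplies the fill):

```python
def constant_of_shape(shape, value):
    """Build a nested-list tensor of the given shape filled with `value`.
    Illustrative only; not the backend implementation."""
    if not shape:
        return value
    return [constant_of_shape(shape[1:], value) for _ in range(shape[0])]
```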

Conv1d

Performs 1D convolution: dot-product of a set of 1D filters with input activation, producing output activation.

Application of the filter moves according to the specified stride. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported. For regular convolution, group is 1. A group field greater than 1 implies a grouped convolution where a group of different filters is applied to each input channel group and the result is concatenated together.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, width, channel_in]

  • Constraints:

    • Shape: channel_in must be evenly divisible by group

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_width, channel_in / group, channel_out]

  • Constraints:

    • Shape: channel_out must be evenly divisible by group

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Default: {0,..,0}

Parameters

stride

Defines stride for 1D spatial axes of in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 1D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [width_pad_before, width_pad_after]

group

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

dilation

Dilation parameter for width dimension.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

  • Constraints:

    • Value: Must be > 0

Outputs

out[0]

The output 1D spatial dimension is a function of the filter size, stride, and pad_amount.

dilated_filter_width = (shape(in[1])[width] - 1) * dilation + 1
width_out = floor((pad_amount[0] + shape(in[0])[width] + pad_amount[1] - dilated_filter_width) / stride + 1)
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, width_out, channel_out]
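
The width_out formula above can be evaluated directly (the function name is illustrative; integer floor division implements the floor):

```python
def conv1d_out_width(width, filter_width, stride, pad_amount, dilation=1):
    """width_out per the Conv1d output formula; pad_amount is
    [width_pad_before, width_pad_after]."""
    dilated = (filter_width - 1) * dilation + 1
    return (pad_amount[0] + width + pad_amount[1] - dilated) // stride + 1
```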

Conv2d

Performs 2D convolution: dot-product of a set of 2D filters with input activation, producing output activation.

Application of the filter moves according to the specified strides. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported. For regular convolution, group is 1. Group field greater than 1 implies a grouped convolution where a group of different filters is applied to each input channel group and the result is concatenated together.

Note that channel_out and channel_in must be evenly divisible by group.

Refer to Conv2d backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel_in]

  • Dynamic Shape: All dimensions can be dynamic.

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_height, filter_width, channel_in / group, channel_out]

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: shape(in[0])[channel_in] and shape(in[1])[channel_in/group] must both be dynamic or both static.

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Dynamic Shape: All dimensions can be dynamic.

  • Default: {0,..,0}

  • Constraints:

    • Dynamic Shape: shape(in[1])[channel_out] and shape(in[2])[channel_out] must both be dynamic or both static.

Parameters

stride

Defines stride for 2D spatial axes of in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 2D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

group

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

dilation

Dilation parameter for height and width dimensions.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [height_dilation, width_dilation]

  • Default: [1, 1]

  • Constraints:

    • Value: Dilations must be > 0

reuse_sparse_indices

Only for sparse input and output tensors. If true, the resulting convolution re-uses the input indices for the output indices. Convolutions are only computed for the specified elements.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

The output 2D spatial dimensions are functions of the filter size, stride, and pad_amount.

dilated_filter_height = (shape(in[1])[filter_height] - 1) * dilation[0] + 1
dilated_filter_width = (shape(in[1])[filter_width] - 1) * dilation[1] + 1
height_out = floor((pad_amount[0,0] + shape(in[0])[height] + pad_amount[0,1] - dilated_filter_height) / stride[0] + 1)
width_out = floor((pad_amount[1,0] + shape(in[0])[width] + pad_amount[1,1] - dilated_filter_width) / stride[1] + 1)
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height_out, width_out, channel_out]

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: If shape(in[0])[batch] is dynamic, shape(out[0])[batch] must also be dynamic. If shape(in[0])[height] or shape(in[1])[filter_height] are dynamic, then shape(out[0])[height_out] must be dynamic. If shape(in[0])[width] or shape(in[1])[filter_width] are dynamic, then shape(out[0])[width_out] must be dynamic. If in[2] is not provided and shape(in[1])[channel_out] is dynamic or if in[2] is provided and both shape(in[1])[channel_out] and shape(in[2])[channel_out] are dynamic, then shape(out[0])[channel_out] must also be dynamic.

    • Must have same data format as in[0] (e.g. both sparse or both dense)
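
Since height_out and width_out apply the same recurrence per axis, a single helper covers both (and generalizes to the Conv1d and Conv3d formulas). A sketch, with a hypothetical function name:

```python
def conv_out_dims(in_dims, filter_dims, strides, pads, dilations):
    """Per-axis spatial output sizes for convolution.
    pads is [[before, after], ...], one pair per spatial axis."""
    out = []
    for d, f, s, (pb, pa), dil in zip(in_dims, filter_dims, strides, pads, dilations):
        dilated = (f - 1) * dil + 1          # dilated filter extent
        out.append((pb + d + pa - dilated) // s + 1)
    return out
```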

Conv3d

Performs 3D convolution: a spatial convolution over volumes using a set of 3D filters with input activation, producing output activation.

Application of the filter moves according to the specified strides. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported. For regular convolution, group is 1. Group field greater than 1 implies a grouped convolution where a group of different filters is applied to each input channel group and the result is concatenated together.

Note that channel_out and channel_in must be evenly divisible by group.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth, height, width, channel_in]

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_depth, filter_height, filter_width, channel_in / group, channel_out]

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Default: {0,..,0}

Parameters

stride

Defines stride for 3D spatial axes of in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [depth_stride, height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 3D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3,2] with format [[depth_pad_before, depth_pad_after], [height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

group

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

dilation

Dilation parameter for depth, height and width dimensions.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [depth_dilation, height_dilation, width_dilation]

  • Default: [1, 1, 1]

  • Constraints:

    • Value: Dilations must be > 0

reuse_sparse_indices

Only for sparse input and output tensors. If true, the resulting convolution re-uses the input indices for the output indices. Convolutions are only computed for the specified elements.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

The output 3D spatial dimensions depth, height, and width are functions of the filter size, stride, and pad_amount.

dilated_filter_depth = (shape(in[1])[depth] - 1) * dilation[0] + 1
dilated_filter_height = (shape(in[1])[height] - 1) * dilation[1] + 1
dilated_filter_width = (shape(in[1])[width] - 1) * dilation[2] + 1
depth_out = floor((pad_amount[0,0] + shape(in[0])[depth] + pad_amount[0,1] - dilated_filter_depth) / stride[0] + 1)
height_out = floor((pad_amount[1,0] + shape(in[0])[height] + pad_amount[1,1] - dilated_filter_height) / stride[1] + 1)
width_out = floor((pad_amount[2,0] + shape(in[0])[width] + pad_amount[2,1] - dilated_filter_width) / stride[2] + 1)
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth_out, height_out, width_out, channel_out]

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Must have same data format as in[0] (e.g. both sparse or both dense)

Convert

This operation converts the input activation tensor to the output activation tensor according to the corresponding tensor data types. Unlike the Cast operation, quantization parameters as specified with Qnn_QuantizeParams_t are obeyed for fixed-point data type conversions. The operation also provides optional support for data type changes at runtime.

Refer to Convert backend definition for support of dynamic dimensions for each backend. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

dynamic_input_data

Set QNN_OP_CONVERT_PARAM_DYNAMIC_INPUT_DATA to true to indicate that the in[0] data type and associated buffer can change between op execute invocations. This means the client is allowed to change the data type and associated buffer of adequate size before the QnnGraph_execute() call, subject to constraints and backend support.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: true is valid/allowed only for tensors of type QNN_TENSOR_TYPE_APP_WRITE or QNN_TENSOR_TYPE_APP_READWRITE.

dynamic_output_data

Set QNN_OP_CONVERT_PARAM_DYNAMIC_OUTPUT_DATA to true to indicate that the out[0] data type and associated buffer can change between op execute invocations. This means the client is allowed to change the data type and associated buffer of adequate size before the QnnGraph_execute() call, subject to constraints and backend support.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: true is valid/allowed only for tensors of type QNN_TENSOR_TYPE_APP_READ or QNN_TENSOR_TYPE_APP_READWRITE.

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.
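
For fixed-point types, Convert amounts to a dequantize-requantize round trip. The sketch below uses the affine convention stated in this document's Dequantize definition, output = (input - offset) * scale; the rounding mode and saturation behavior are backend specific, so treat this only as an illustration:

```python
def convert_fixed_point(q_in, in_scale, in_offset, out_scale, out_offset, qmin, qmax):
    """Requantize a stored value from one fixed-point encoding to another.
    Assumes per-tensor scale/offset and round-to-nearest; both are
    assumptions, not guarantees of any backend."""
    real = (q_in - in_offset) * in_scale          # dequantize with input params
    q_out = round(real / out_scale) + out_offset  # requantize with output params
    return max(qmin, min(qmax, q_out))            # clamp to output type range
```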

Correlation1D

Performs a depth-wise one-dimensional correlation along the width axis as shown:

\[n \in [0,\mbox{batch})\]
\[h \in [0,\mbox{height})\]
\[w \in [0,\mbox{width})\]
\[d \in [0,\mbox{depth})\]
\[w' = w + d - (\mbox{displacement} - \mbox{shift})\]
\[\mbox{When } w' < 0 \mbox{ or } w' \ge \mbox{ width}: \mbox{out[0]}[n,h,w,d] = 0\]
\[\mbox{Otherwise}: \mbox{out[0]}[n,h,w,d] = \frac{\sum^{\mbox{depth}-1}_{d'=0}\mbox{in[0]}[n,h,w,d']\cdot\mbox{in[1]}[n,h,w',d']}{\mbox{depth}}\]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, depth]

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, depth]

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

displacement

Maximum searching pixels in the horizontal direction

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

shift

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, displacement*2 + 1]

  • Constraints:

    • Datatype: Same datatype as in[0]
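
A direct transcription of the formulas above for nested-list tensors of shape [batch][height][width][depth] (a reference sketch, not an optimized kernel; the function name is illustrative):

```python
def correlation_1d(in0, in1, displacement, shift):
    """Depth-wise 1D correlation along width, per the definition above.
    Output depth is displacement*2 + 1."""
    batch, height, width = len(in0), len(in0[0]), len(in0[0][0])
    depth = len(in0[0][0][0])
    out_depth = 2 * displacement + 1
    out = [[[[0.0] * out_depth for _ in range(width)]
            for _ in range(height)] for _ in range(batch)]
    for n in range(batch):
        for h in range(height):
            for w in range(width):
                for d in range(out_depth):
                    wp = w + d - (displacement - shift)
                    if 0 <= wp < width:  # out-of-range w' yields 0
                        acc = sum(in0[n][h][w][dp] * in1[n][h][wp][dp]
                                  for dp in range(depth))
                        out[n][h][w][d] = acc / depth
    return out
```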

CreateSparse

Creates a sparse tensor from indices and values tensors.

Unspecified elements are considered to have a zero value. For quantized data types, this also implies that the offset is zero.

Note that sparse tensors can be partially sparse. For example, we accommodate cases where spatial dimensions are sparse, but the channel dimension is dense (e.g. RGB). We manage this with the definition of K, which represents the number of sparse dimensions. Note that all K sparse dimensions must be the outermost (slowest changing) dimensions. The N-K dense dimensions are the innermost (fastest changing) dimensions.

References:

Inputs

in[0]

indices

Each element of the in[0] indices tensor gives the position, in the equivalent dense tensor, of the corresponding element of the in[1] values tensor.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32, QNN_DATATYPE_INT_32

  • Shape: [M, K] where 0 < K <= N. M may be dynamically sized and K is fixed.

in[1]

values

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N-K+1 with shape \([M,D_{K},...,D_{N-1}]\)

Parameters

None

Outputs

out[0]

output

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N with shape \([D_0,...,D_{N-1}]\)

  • Constraints:

    • Datatype: Same datatype as in[1]

    • Must be a sparse tensor.
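
The relationship between the (indices, values) pair and the equivalent dense tensor can be illustrated by densifying. The sketch below handles the K sparse outer dimensions as index rows and zero-fills unspecified elements, as the definition states; the helper name is hypothetical:

```python
def densify(indices, values, dense_shape):
    """Expand COO-style (indices, values) into a dense nested-list tensor.
    indices is [M][K] over the K outermost sparse dims; values[m] carries
    the remaining dense-dimension data (or a scalar when K == N)."""
    def zeros(shape):
        return 0.0 if not shape else [zeros(shape[1:]) for _ in range(shape[0])]
    out = zeros(dense_shape)
    for idx, val in zip(indices, values):
        cell = out
        for i in idx[:-1]:
            cell = cell[i]
        cell[idx[-1]] = val
    return out
```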

CropAndResize

Extracts crops from the input batch of images and resizes them to a common specified output size. This operation may not preserve the aspect ratio of the input crops. The resizing is corner-aligned.

References:

Inputs

in[0]

Input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,height,width,channel]

in[1]

Crop boxes : Each box defines the crop zone in one of the images in the input batch. Elements may be interpreted as 4-tuples of (y1,x1,y2,x2) representing normalized crop coordinates. A normalized coordinate value y is mapped to the image coordinate y * (image_height - 1), so that the [0, 1] interval of normalized image height maps to [0, image_height - 1] in image height coordinates. The condition (y1 > y2) is permitted, in which case the sampled crop is an up-down flipped version of the original image. The width dimension is treated similarly. Specifying coordinates outside of the range [0, 1] results in extrapolation of input image values.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_boxes,4]

in[2]

Batch index in the input tensor to which each box corresponds. Indices in in[2] indicate the images in the input batch that will be cropped using box coordinates in the same position in in[1].

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32

  • Shape: [num_boxes]

  • Constraints:

    • Value: in range [0, batch-1]

Parameters

resize_dims

The dimensions [resize_height, resize_width] to which input images are cropped and resized.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2]

interpolation_mode

Determines the interpolation method. Supported values are 0: BILINEAR, 1: NEAREST_NEIGHBOR.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • BILINEAR = 0,

    • NEAREST_NEIGHBOR = 1

extrapolation_value

Value used for extrapolation during the resize operation when applicable.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_boxes,resize_height,resize_width,channel].

  • Constraints:

    • Datatype: Same datatype as in[0]
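
The corner-aligned coordinate mapping can be sketched for one axis. The exact sampling grid below is an assumption inferred from the in[1] description (normalized y maps to y * (image_height - 1), and resize_height points span [y1, y2] inclusive); backends may sample differently:

```python
def crop_sample_positions(y1, y2, resize_height, image_height):
    """Row sampling positions (in image coordinates) for one crop box edge
    pair. y1 > y2 yields a flipped crop. Illustrative assumption only."""
    if resize_height == 1:
        return [0.5 * (y1 + y2) * (image_height - 1)]
    start = y1 * (image_height - 1)
    step = (y2 - y1) * (image_height - 1) / (resize_height - 1)
    return [start + i * step for i in range(resize_height)]
```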

CumulativeSum

Performs cumulative sum of the input elements along the given axis. By default, this op performs the sum inclusively meaning the first element of the input is identical to the first element of the output. An exclusive sum can be performed by setting the exclusive parameter to true. It can also perform summation in the opposite direction of the axis by setting the reverse parameter to true.

// Default case
When QNN_OP_CUMULATIVE_SUM_PARAM_EXCLUSIVE = 0 AND QNN_OP_CUMULATIVE_SUM_PARAM_REVERSE = 0 :
output[0] = input[0] is always true (for i = 0)
output[i] = input[0] + input[1] + ... + input[i] (for i > 0)

// Example
input[axis] = [1, 2, 3], exclusive = 0, reverse = 0
output[axis] = [1, 3, 6]

input[axis] = [1, 2, 3], exclusive = 1, reverse = 0
output[axis] = [0, 1, 3]

input[axis] = [1, 2, 3], exclusive = 0, reverse = 1
output[axis] = [6, 5, 3]

input[axis] = [1, 2, 3], exclusive = 1, reverse = 1
output[axis] = [5, 3, 0]

References:

Inputs

in[0]

Input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

Dimension index, starts at 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be in range [0, rank(in[0]) - 1]

exclusive

If QNN_OP_CUMULATIVE_SUM_PARAM_EXCLUSIVE is set to true an exclusive sum is performed where the i-th output element is the sum of the first (i - 1) elements.

When QNN_OP_CUMULATIVE_SUM_PARAM_EXCLUSIVE = 1 AND QNN_OP_CUMULATIVE_SUM_PARAM_REVERSE = 0 :
output[0] = 0 is always true (for i = 0)
output[i] = input[0] + input[1] + ... + input[i - 1] (for i > 0)
  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

reverse

If QNN_OP_CUMULATIVE_SUM_PARAM_REVERSE is set to true the sum is performed in the reverse direction.

When QNN_OP_CUMULATIVE_SUM_PARAM_EXCLUSIVE = 0 AND QNN_OP_CUMULATIVE_SUM_PARAM_REVERSE = 1 :
output[K] = input[K] is always true (for i = K, where K = input.dim[axis] - 1)
output[i] = input[K] + input[K - 1] + ... + input[i + 1] + input[i] (for 0 <= i < K)

When QNN_OP_CUMULATIVE_SUM_PARAM_EXCLUSIVE = 1 AND QNN_OP_CUMULATIVE_SUM_PARAM_REVERSE = 1 :
output[K] = 0 is always true (for i = K, where K = input.dim[axis] - 1)
output[i] = input[K] + input[K - 1] + ... + input[i + 2] + input[i + 1] (for 0 <= i < K)
  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output data

  • Mandatory: true

  • Data type: backend specific

  • Shape: Same shape as in[0]

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]
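
A 1-D reference sketch reproducing the exclusive/reverse examples given above (illustrative; operates on a plain list along a single axis):

```python
def cumulative_sum(x, exclusive=False, reverse=False):
    """Cumulative sum with the exclusive and reverse variants defined
    by the CumulativeSum parameters."""
    seq = list(reversed(x)) if reverse else list(x)
    out, total = [], 0
    for v in seq:
        if exclusive:
            out.append(total)  # i-th output excludes the i-th input
            total += v
        else:
            total += v
            out.append(total)
    return list(reversed(out)) if reverse else out
```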

DepthToSpace

A type of tensor realignment operation that rearranges depth data into blocks of spatial data.

The op moves blocks of data of size (block_size[0] * block_size[1]) from the depth dimension of the input tensor into the spatial dimensions of the output tensor.

References:

Inputs

in[0]

Input Activation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [batch, height, width, depth]

  • Constraints:

    • Shape: depth must be divisible by (block_size[0] * block_size[1])

Parameters

block_size

Vector that represents block size along the height and width dimensions respectively.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [block_height, block_width]

  • Constraints:

    • Value: Elements must be >=1

mode

Specifies the order in which elements of in[0] are rearranged. If QNN_OP_DEPTH_TO_SPACE_PARAM_MODE is set to QNN_OP_DEPTH_TO_SPACE_MODE_DCR then elements along the depth dimension are rearranged in the order of depth, column, and then row; if set to QNN_OP_DEPTH_TO_SPACE_MODE_CRD elements along the depth dimension are rearranged in the order of column, row, and then depth.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • DCR = 0,

    • CRD = 1

Outputs

out[0]

Output Activation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [batch, (height * block_size[0]), (width * block_size[1]), depth / (block_size[0] * block_size[1])]

  • Constraints:

    • Datatype: Same datatype as in[0]
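
The output-shape rule, including the divisibility constraint on depth, as a small helper (name is illustrative):

```python
def depth_to_space_shape(shape, block_size):
    """out[0] shape for DepthToSpace; shape is [batch, height, width, depth],
    block_size is [block_height, block_width]."""
    b, h, w, d = shape
    bh, bw = block_size
    assert d % (bh * bw) == 0, "depth must be divisible by block_height * block_width"
    return [b, h * bh, w * bw, d // (bh * bw)]
```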

DepthWiseConv1d

Performs depthwise 1D convolution: dot-product of a set of 1D filters with input activation, producing output activation. Depthwise 1D convolution applies a different filter to each input channel group, then concatenates the results together. Depthwise 1D convolution is functionally equivalent to Conv1d where the ‘group’ parameter value == channel_in and channel_out is a multiple of channel_in, i.e. channel_out % channel_in == 0. Application of the filter moves according to the specified stride. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported.

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, width, channel_in]

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_width, 1, channel_out]

  • Constraints:

    • Shape: channel_out must be divisible by channel_in

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Default: {0,..,0}

Parameters

stride

Defines stride for 1D spatial axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 1D spatial axes in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [width_pad_before, width_pad_after]

dilation

Dilation parameter for width dimension of in[0]. If set to d > 1, there will be d-1 skipped cells between each filter element on corresponding dimension.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

  • Constraints:

    • Value: Must be > 0

Outputs

out[0]

output activation : The output 1D spatial dimensions are functions of the filter_size, stride, and pad_amount.

dilated_filter_width = (shape(in[1])[width] - 1) * dilation + 1
width_out = floor((pad_amount[0] + shape(in[0])[width] + pad_amount[1] - dilated_filter_width) / stride + 1)
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, width_out, channel_out]

  • Constraints:

    • Datatype: Same datatype as in[0]

DepthWiseConv2d

Performs depthwise 2D convolution: dot-product of a set of 2D filters with input activation, producing output activation. Depthwise 2D convolution applies a different filter to each input channel group, then concatenates the results together. Depthwise 2D convolution is functionally equivalent to Conv2d where the ‘group’ parameter value == channel_in and channel_out is a multiple of channel_in, i.e. channel_out % channel_in == 0. Application of the filter moves according to the specified strides. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported.

Refer to DepthWiseConv2d backend definition for supported data type and layouts for each backend.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel_in]

  • Dynamic Shape: All dimensions can be dynamic.

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_height, filter_width, 1, channel_out]

  • Dynamic Shape: All dimensions can be dynamic with the exception of shape(in[1])[2].

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Dynamic Shape: All dimensions can be dynamic.

  • Default: {0,..,0}

  • Constraints:

    • Dynamic Shape: shape(in[1])[channel_out] and shape(in[2])[channel_out] must both be dynamic or both static.

Parameters

stride

Defines stride for 2D spatial axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_stride, width_stride]

pad_amount

Pad amount to be added to the beginning and end part of 2D spatial axes in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

dilation

Dilation parameter for height and width dimensions. If set to d > 1, there will be d-1 skipped cells between each filter element on corresponding dimension.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [height_dilation, width_dilation]

  • Default: [1, 1]

  • Constraints:

    • Value: Dilations must be > 0

Outputs

out[0]

output activation

The output 2D spatial dimensions are functions of the filter size, stride, and pad_amount.

dilated_filter_height = (shape(in[1])[height] - 1) * dilation[0] + 1
dilated_filter_width = (shape(in[1])[width] - 1) * dilation[1] + 1
height_out = floor((pad_amount[0,0] + shape(in[0])[height] + pad_amount[0,1] - dilated_filter_height) / stride[0] + 1)
width_out = floor((pad_amount[1,0] + shape(in[0])[width] + pad_amount[1,1] - dilated_filter_width) / stride[1] + 1)
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height_out, width_out, channel_out]

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: If shape(in[0])[batch] is dynamic, shape(out[0])[batch] must also be dynamic. If shape(in[0])[height] or shape(in[1])[filter_height] are dynamic, then shape(out[0])[height_out] must be dynamic. If shape(in[0])[width] or shape(in[1])[filter_width] are dynamic, then shape(out[0])[width_out] must be dynamic. If in[2] is not provided and shape(in[1])[channel_out] is dynamic or if in[2] is provided and both shape(in[1])[channel_out] and shape(in[2])[channel_out] are dynamic, then shape(out[0])[channel_out] must also be dynamic.

Dequantize

Dequantizes the input tensor. Note that scale and offset are determined from in[0].

Implements:

output = (input - offset) * scale.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_SFIXED_POINT_4, QNN_DATATYPE_UFIXED_POINT_4, QNN_DATATYPE_SFIXED_POINT_8, QNN_DATATYPE_UFIXED_POINT_8, QNN_DATATYPE_SFIXED_POINT_16, QNN_DATATYPE_UFIXED_POINT_16, QNN_DATATYPE_SFIXED_POINT_32, QNN_DATATYPE_UFIXED_POINT_32, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Same shape as in[0]
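
The formula is a one-liner over the stored values (per-tensor scale and offset assumed for this sketch):

```python
def dequantize(q, scale, offset):
    """Element-wise Dequantize as defined above:
    output = (input - offset) * scale."""
    return [(v - offset) * scale for v in q]
```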

DetectionOutput

Decodes a set of bounding boxes from a set of pre-defined anchors, then filters boxes using non-max-suppression (NMS).

References:

Inputs

in[0]

Scores for each class at each pre-defined anchor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_anchors, num_classes]

in[1]

Input box locations. Elements may be interpreted as [ctr_y, ctr_x, h, w] where ctr_y and ctr_x give the center position of the box, and h and w are the height and width of the box. The number of input boxes is computed as follows:

num_boxes = num_anchors when QNN_OP_DETECTION_OUTPUT_SHARE_LOCATION is true
num_boxes = num_anchors * num_classes when QNN_OP_DETECTION_OUTPUT_SHARE_LOCATION is false
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_boxes, 4]

in[2]

Anchor positions. Elements may be interpreted as [ctr_y, ctr_x, h, w] where ctr_y and ctr_x are the center position, and h and w are the height and width of the anchor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_anchors * batch, 4]

Parameters

delta_scaling_factors

Multiplicative scaling factors applied to each of the bounding boxes in in[1] in the form of (dy, dx, dh, dw), where dy and dx are linear-scale shifts with respect to width and height, and dh and dw are log-scale scaling factors with respect to the width and height of the boxes.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: [4]
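The exact decode arithmetic is backend specific, but the conventional SSD-style interpretation is that delta_scaling_factors divide the raw box deltas before they are applied to an anchor. A minimal sketch under that assumption (the function name and the default scale values are illustrative, not part of the QNN API):

```python
import math

def decode_box(delta, anchor, scale=(10.0, 10.0, 5.0, 5.0)):
    # delta and anchor are (ctr_y, ctr_x, h, w); scale is (dy, dx, dh, dw)
    ay, ax, ah, aw = anchor
    ty, tx, th, tw = (d / s for d, s in zip(delta, scale))
    ctr_y = ty * ah + ay        # linear-scale center shift, relative to anchor size
    ctr_x = tx * aw + ax
    h = math.exp(th) * ah       # log-scale size adjustment
    w = math.exp(tw) * aw
    # convert to corner format (y1, x1, y2, x2)
    return (ctr_y - h / 2, ctr_x - w / 2, ctr_y + h / 2, ctr_x + w / 2)
```

With zero deltas, the decoded box is simply the anchor expressed in corner format.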

confidence_threshold

Boxes with scores lower than this threshold are filtered prior to the application of NMS.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

iou_threshold

IoU threshold for the NMS algorithm.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

nms_type

Specifies which variant of NMS to use. Set QNN_OP_DETECTION_OUTPUT_NMS_TYPE_REGULAR for regular multi-class NMS, or QNN_OP_DETECTION_OUTPUT_NMS_TYPE_FAST for a faster variant that limits the number of classes to which NMS is applied.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • FAST = 0,

    • REGULAR = 1

background_class_idx

The index in num_classes of the “background” class. This class will be ignored by NMS.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

use_bg_in_nms

Controls whether the background class is included when computing NMS. Set QNN_OP_DETECTION_OUTPUT_USE_BG_IN_NMS to true to include the background class, or to false to ignore it.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

output_background

Set QNN_OP_DETECTION_OUTPUT_OUTPUT_BACKGROUND to true to include the background class in the output, or to false to exclude it.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

share_location

Set QNN_OP_DETECTION_OUTPUT_SHARE_LOCATION to true to indicate that all classes share a common set of initial bounding boxes, or to false to indicate that each class uses its own initial bounding boxes.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

nms_eta

Adaptation factor for the NMS threshold. This factor is applied when nms_type is set to QNN_OP_DETECTION_OUTPUT_NMS_TYPE_REGULAR.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1.0

detection_limit

Parameter that specifies:

(i) the maximum number of classes per detection when nms_type is set to QNN_OP_DETECTION_OUTPUT_NMS_TYPE_FAST.

(ii) the maximum number of detections when applying NMS for each single class when nms_type is set to QNN_OP_DETECTION_OUTPUT_NMS_TYPE_REGULAR.

This parameter is ignored if set to the default value. It is similar to ‘nms_topK’ found in training frameworks like Caffe, which use nms_type set to QNN_OP_DETECTION_OUTPUT_NMS_TYPE_REGULAR.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Default: -1

Outputs

out[0]

Detection scores.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, max_num_detections]. max_num_detections can be expressed by maxDimensions in Qnn_Tensor_t.

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

Detection locations. Elements specify the coordinates of the output bounding boxes and can be interpreted as 4-tuples with format (y1,x1,y2,x2) representing the upper-left and lower-right corner coordinates. A valid bounding box should satisfy x1 <= x2 and y1 <= y2. Note that quantized data types may produce output box coordinates that are out of range.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: [batch, max_num_detections, 4]. max_num_detections can be expressed by maxDimensions in Qnn_Tensor_t.

out[2]

Detection labels. Gives the class label for each detection.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [batch, max_num_detections]

out[3]

Valid number of detections per batch.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [batch]
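The filtering step described above is the standard greedy NMS: boxes are visited in descending score order, and a box is kept only if its IoU with every already-kept box is at or below iou_threshold. A minimal sketch with boxes in (y1,x1,y2,x2) corner format (`iou` and `nms` are illustrative helpers, not QNN API):

```python
def iou(a, b):
    # intersection-over-union of two (y1, x1, y2, x2) boxes
    y1, x1 = max(a[0], b[0]), max(a[1], b[1])
    y2, x2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, y2 - y1) * max(0.0, x2 - x1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter) if inter else 0.0

def nms(boxes, scores, iou_threshold):
    # greedy NMS: keep a box only if it does not overlap a kept box too much
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Here two nearly identical boxes collapse to the higher-scoring one, while a disjoint box survives.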

DistributeFpnProposals

Distribute RPN proposals to their appropriate FPN levels for Faster RCNN. Note that RoI elements can be understood as 5-tuples with format (batch_idx, x1, y1, x2, y2) where batch_idx specifies the batch index of each RoI, (x1,y1) represents the upper-left corner coordinate, and (x2,y2) represents the lower-right corner coordinate.

Inputs

in[0]

RoIs : RPN proposals.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_proposals, 5] each line with format [batch_idx, x1, y1, x2, y2] or [num_proposals, 4] each line with format [x1, y1, x2, y2]

Parameters

roi_min_level

Sets the minimum FPN level to support RoI transform operations on multiple FPN levels.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 2

  • Constraints:

    • Value: Must be in range [2,5]

roi_max_level

Sets the maximum FPN level to support RoI transform operations on multiple FPN levels.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 5

  • Constraints:

    • Value: Must be in range [2,5]

    • Value: Must be >= roi_min_level

roi_canonical_scale

Scaling factor used to compute which FPN level each RoI in a set of RoIs should map to.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 224

roi_canonical_level

Value to offset the computed FPN level for each RoI.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 4
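These four parameters together determine which FPN level each RoI maps to. The mapping is not spelled out here, but the standard FPN heuristic (Lin et al., "Feature Pyramid Networks for Object Detection") that these parameter names and defaults mirror is a reasonable mental model; the sketch below is illustrative, not the QNN definition:

```python
import math

def fpn_level(roi, min_level=2, max_level=5,
              canonical_scale=224, canonical_level=4):
    # roi as (x1, y1, x2, y2); level grows with log2 of the box scale
    x1, y1, x2, y2 = roi
    scale = math.sqrt(max(0.0, (x2 - x1) * (y2 - y1)))
    level = math.floor(canonical_level + math.log2(scale / canonical_scale + 1e-8))
    return min(max_level, max(min_level, level))   # clamp to [min_level, max_level]
```

A 224x224 RoI lands on the canonical level; halving the side drops one level, and very large RoIs clamp at roi_max_level.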

Outputs

out[0]

rois_idx_restore : indices to restore the RoIs to their original order from in[0]. Indices are with respect to the concatenation of the RoIs FPN outputs, in order from FPN level roi_min_level to FPN level roi_max_level. For invalid RoIs with degenerate coordinates (x2 - x1 = 0 and y2 - y1 = 0), the corresponding index value is set to -1.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32

  • Shape: [num_proposals]

  • Constraints:

    • Value: Must be in range [0, num_proposals * (roi_max_level - roi_min_level + 1) - 1]

out[1]

RoIs FPN : RoIs mapped to FPN level roi_min_level.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois, 5] each line with format [batch_idx, x1, y1, x2, y2] or [num_rois, 4] each line with format [x1, y1, x2, y2]

  • Constraints:

    • Shape: 0 <= num_rois <= num_proposals

    • Shape: shape(out[1])[1] must equal shape(in[0])[1]

out[2..4]

RoIs FPN : RoIs mapped to FPN levels [roi_min_level + 1, roi_max_level]. Note that each output may have a different shape since num_roi_i can vary for out[2..4].

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: false

  • Data type: backend specific

  • Shape: [num_rois_i, 5] each line with format [batch_idx, x1, y1, x2, y2] or [num_rois_i, 4] each line with format [x1, y1, x2, y2]

  • Constraints:

    • Shape: 0 <= num_rois_i <= num_proposals

    • Datatype: Same datatype as out[1]

    • Number: Number of out[2..4] provided must equal roi_max_level - roi_min_level

    • Value: The sum of num_rois across all RoIs FPN outputs must equal num_proposals

    • Shape: shape(out[2..4])[1] must equal shape(in[0])[1]

ElementWiseAbs

Computes absolute value of the input element-wise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseAdd

Adds two tensors element-wise. The output is the sum of input tensors.

out[0] = in[0] + in[1]

Refer to ElementWiseAdd backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[1])[i] is dynamic then shape(in[0])[i] must be dynamic or must be compatible for broadcasting.

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Rank > 0

    • Must have same data format as in[0] (e.g. both sparse or both dense)

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(in[1])[i] must be dynamic or must be compatible for broadcasting.

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Must have same data format as in[0] (e.g. both sparse or both dense)

    • Dynamic Shape: For each dimension, if both shape(in[0])[i] and shape(in[1])[i] are dynamic or if either shape(in[0])[i] or shape(in[1])[i] is dynamic and the other has a compatible dimension for broadcasting then shape(out[0])[i] must be dynamic.
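The rank rule above (output rank = max(N,M)) follows NumPy-style broadcasting: shapes are aligned on trailing dimensions, and a dimension of 1 stretches to match the other operand. A small illustrative helper, not part of QNN:

```python
def broadcast_shape(a, b):
    # left-pad the shorter shape with 1s, then merge dimension by dimension
    a, b = tuple(a), tuple(b)
    out = []
    for x, y in zip((1,) * (len(b) - len(a)) + a,
                    (1,) * (len(a) - len(b)) + b):
        if x != y and 1 not in (x, y):
            raise ValueError("incompatible shapes for broadcasting")
        out.append(max(x, y))   # the non-1 dimension wins
    return tuple(out)
```

For example, shapes (4,1,3) and (2,3) broadcast to (4,2,3), a rank-3 result as the rule predicts.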

ElementWiseAnd

Logical ANDs two tensors element-wise:

out[0] = in[0] && in[1]

where non-zero values are treated as true and zero as false.

References:

Inputs

in[0]

1st input tensor

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank N

in[1]

2nd input tensor

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)

ElementWiseAsin

Computes the arcsine of the input element-wise. Note that arcsine behavior is undefined for input values outside the range [-1, 1].

\[\mbox{out[0]} = \arcsin{(\mbox{in[0]})}\]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseAtan

Computes the arctangent of the input element-wise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseBinary

Computes the specified binary operation on the input element-wise. Note that operations can be classified as being Numerical, Comparison, or Logical.

Available element-wise operations:

ADD - Adds two tensors element-wise. The output is the sum of input tensors.

\[\mbox{out[0]} = \mbox{in[0]} + \mbox{in[1]}\]

References:

AND - Logical ANDs two tensors element-wise.

\[\mbox{out[0]} = \mbox{in[0]}\ \&\&\ \mbox{in[1]}\]

References:

DIVIDE - Divides two tensors element-wise. The output is the result of dividing the first input tensor by the second. The result is undefined for any 0 value in the second input tensor.

\[\mbox{out[0]} = \frac{\mbox{in[0]}}{\mbox{in[1]}}\]

References:

EQUAL - Computes the equal logical operation element-wise on the two input tensors.

\[\begin{split}\mbox{out[0]} = \begin{cases} \text{true}\ & \text{if}\ \mbox{in[0]} == \mbox{in[1]}, \\ \text{false}\ & \text{otherwise.} \end{cases}\end{split}\]

References:

FLOOR_DIV - Divides two tensors element-wise and floors the result, rounding toward negative infinity. The result is undefined for any 0 value in the second input tensor. For example, -2.3 rounds to -3 and 2.3 rounds to 2.

\[\mbox{out[0]} = \text{floor}(\frac{\mbox{in[0]}}{\mbox{in[1]}})\]

References:

FMOD - Performs element-wise binary modulus. The sign of the remainder is the same as that of the dividend (in[0]). Note that behavior is undefined when elements from both in[0] and in[1] are 0.

\[\mbox{out[0]} = \mbox{in[0]}\ \%\ \mbox{in[1]}\]

References:

GREATER - Computes the greater than logical operation element-wise on the input tensors.

\[\begin{split}\mbox{out[0]} = \begin{cases} \text{true}\ & \text{if}\ \mbox{in[0]} > \mbox{in[1]}, \\ \text{false}\ & \text{otherwise.} \end{cases}\end{split}\]

References:

GREATER_EQUAL - Computes the greater than or equal logical operation element-wise on the input tensors.

\[\begin{split}\mbox{out[0]} = \begin{cases} \text{true}\ & \text{if}\ \mbox{in[0]}\ >=\ \mbox{in[1]}, \\ \text{false}\ & \text{otherwise.} \end{cases}\end{split}\]

References:

LESS - Computes the less than logical operation element-wise on the input tensors.

\[\begin{split}\mbox{out[0]} = \begin{cases} \text{true}\ & \text{if}\ \mbox{in[0]} < \mbox{in[1]}, \\ \text{false}\ & \text{otherwise.} \end{cases}\end{split}\]

References:

LESS_EQUAL - Computes the less than or equal logical operation element-wise on the input tensors.

\[\begin{split}\mbox{out[0]} = \begin{cases} \text{true}\ & \text{if}\ \mbox{in[0]}\ <=\ \mbox{in[1]}, \\ \text{false}\ & \text{otherwise.} \end{cases}\end{split}\]

References:

MAXIMUM - Element-wise maximum of two tensors.

\[\mbox{out[0]} = \text{max}(\mbox{in[0]}, \mbox{in[1]})\]

References:

MINIMUM - Element-wise minimum of two tensors.

\[\mbox{out[0]} = \text{min}(\mbox{in[0]}, \mbox{in[1]})\]

References:

MOD - Performs element-wise binary modulus. The sign of the remainder is the same as that of the divisor (in[1]). Note that behavior is undefined when elements from both in[0] and in[1] are 0. This operation does not support floating point data types; use FMOD instead.

\[\mbox{out[0]} = \mbox{in[0]}\ \%\ \mbox{in[1]}\]

References:

MULTIPLY - Multiplies two tensors element-wise. The output is the result of multiplying the first input tensor by the second.

\[\mbox{out[0]} = \mbox{in[0]} \times \mbox{in[1]}\]

References:

NOT_EQUAL - Computes the not equal logical operation element-wise on the input tensors.

\[\begin{split}\mbox{out[0]} = \begin{cases} \text{true}\ & \text{if}\ \mbox{in[0]}\ !=\ \mbox{in[1]}, \\ \text{false}\ & \text{otherwise.} \end{cases}\end{split}\]

References:

OR - Logical ORs two tensors element-wise where non-zero values are treated as true and zero as false.

\[\mbox{out[0]} = \mbox{in[0]}\ \vert \vert\ \mbox{in[1]}\]

References:

POWER - Given base and exponent in input tensors, computes the (base^exponent) element-wise.

\[\mbox{out[0]} = (\mbox{in[0]})^{\mbox{in[1]}}\]

References:

SQUARED_DIFFERENCE - Computes the element-wise difference between two tensors by subtracting the second tensor from the first, then squares the result element-wise.

\[\mbox{out[0]} = (\mbox{in[0]} - \mbox{in[1]}) \times (\mbox{in[0]} - \mbox{in[1]})\]

References:

SUBTRACT - Subtracts two tensors element-wise. The output is the result of subtracting the second input tensor from the first.

\[\mbox{out[0]} = \mbox{in[0]} - \mbox{in[1]}\]

References:

XOR - Logical XORs two tensors element-wise where non-zero values are treated as true and zero as false.

\[\mbox{out[0]} = \mbox{in[0]}\ \text{^}\ \mbox{in[1]}\]

References:

Refer to ElementWiseBinary backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Note that the data type supported depends on the classification of the operation selected and can be categorized into one of the following:

  • Numerical: BACKEND_SPECIFIC

  • Comparison: BACKEND_SPECIFIC

  • Logical: BACKEND_SPECIFIC, QNN_DATATYPE_BOOL_8

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[1])[i] is dynamic then shape(in[0])[i] must be dynamic or must be compatible for broadcasting.

in[1]

Note that the data type supported depends on the classification of the operation selected and can be categorized into one of the following:

  • Numerical: BACKEND_SPECIFIC

  • Comparison: BACKEND_SPECIFIC

  • Logical: BACKEND_SPECIFIC, QNN_DATATYPE_BOOL_8

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0] for Numerical and Comparison operations with the exception of POWER.

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(in[1])[i] must be dynamic or must be compatible for broadcasting.

Parameters

operation

Specifies the binary element-wise operation to use.

Operations can be classified as one of the following:

  • Numerical operations: ADD, DIVIDE, FMOD, FLOOR_DIV, MAXIMUM, MINIMUM, MOD, MULTIPLY, POWER, SQUARED_DIFFERENCE, SUBTRACT

  • Comparison operations: EQUAL, GREATER, GREATER_EQUAL, LESS, LESS_EQUAL, NOT_EQUAL

  • Logical operations: AND, OR, XOR

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Values:

    • ADD = 0,

    • AND = 1,

    • DIVIDE = 2,

    • EQUAL = 3,

    • FLOOR_DIV = 4,

    • FMOD = 5,

    • GREATER = 6,

    • GREATER_EQUAL = 7,

    • LESS = 8,

    • LESS_EQUAL = 9,

    • MAXIMUM = 10,

    • MINIMUM = 11,

    • MOD = 12,

    • MULTIPLY = 13,

    • NOT_EQUAL = 14,

    • OR = 15,

    • POWER = 16,

    • SQUARED_DIFFERENCE = 17,

    • SUBTRACT = 18,

    • XOR = 19

Outputs

out[0]

Note that the data type supported depends on the classification of the operation selected and can be categorized into one of the following:

  • Numerical: BACKEND_SPECIFIC

  • Comparison: BACKEND_SPECIFIC, QNN_DATATYPE_BOOL_8

  • Logical: BACKEND_SPECIFIC, QNN_DATATYPE_BOOL_8

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of Rank = max(N,M)

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0] for Numerical operations

    • Dynamic Shape: For each dimension, if both shape(in[0])[i] and shape(in[1])[i] are dynamic or if either shape(in[0])[i] or shape(in[1])[i] is dynamic and the other has a compatible dimension for broadcasting then shape(out[0])[i] must be dynamic.
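The operation parameter selects the arithmetic at graph-build time, so conceptually it behaves like a dispatch table keyed by the enum values above. An illustrative Python sketch covering a few of the numerical entries (the mapping is for exposition only, not a QNN implementation):

```python
import operator

# dispatch a few of the enum values to their Python equivalents
OPS = {
    0:  operator.add,                      # ADD
    2:  operator.truediv,                  # DIVIDE
    10: max,                               # MAXIMUM
    17: lambda a, b: (a - b) * (a - b),    # SQUARED_DIFFERENCE
    18: operator.sub,                      # SUBTRACT
}

def apply_binary(op_id, a, b):
    # element-wise application over two equal-length sequences
    return [OPS[op_id](x, y) for x, y in zip(a, b)]
```

For instance, SQUARED_DIFFERENCE (17) over [1, 2] and [3, 5] yields [4, 9].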

ElementWiseCeil

Computes the element-wise ceiling of the input, returning the smallest integer not less than each element.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseCos

Computes the cosine of the input element-wise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseDivide

Divides two tensors element-wise. The output is the result of dividing the first input tensor by the second. The result is undefined for any 0 value in the second input tensor.

out[0] = in[0] / in[1]

Refer to ElementWiseDivide backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[1])[i] is dynamic then shape(in[0])[i] must be dynamic or must be compatible for broadcasting.

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(in[1])[i] must be dynamic or must be compatible for broadcasting.

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if both shape(in[0])[i] and shape(in[1])[i] are dynamic or if either shape(in[0])[i] or shape(in[1])[i] is dynamic and the other has a compatible dimension for broadcasting then shape(out[0])[i] must be dynamic.

ElementWiseEqual

Computes the equal logical operation elementwise on the two input tensors.

out[0] = in[0] == in[1]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)

ElementWiseExp

Computes exponential of input element-wise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseFloor

Computes elementwise floor on the input.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseFloorDiv

Divides two tensors element-wise and floors the result, rounding toward negative infinity. The result is undefined for any 0 value in the second input tensor. For example, -2.3 rounds to -3 and 2.3 rounds to 2.

out[0] = floor(in[0] / in[1])

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]
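The rounding rule matters for negative quotients: FLOOR_DIV rounds toward negative infinity, not toward zero. Python's floor division happens to share this convention, which makes it easy to check:

```python
import math

# FLOOR_DIV semantics: round the quotient toward negative infinity
print(math.floor(-7 / 3))   # -3
print(-7 // 3)              # -3, Python's // uses the same convention
print(int(-7 / 3))          # -2, C-style truncation toward zero, for contrast
```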

ElementWiseFmod

Performs element-wise binary modulus. The sign of the output elements matches that of the corresponding elements in in[0] (the dividend). Note that behavior is undefined when elements from both in[0] and in[1] are 0.

\[\mbox{out[0]} = \mbox{in[0]} \ \% \ \mbox{in[1]}\]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]
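The dividend-sign convention described above matches C's fmod and Python's math.fmod, which can be used to sanity-check expected results:

```python
import math

# FMOD: the remainder takes the sign of the dividend (in[0])
print(math.fmod(-7, 3))   # -1.0
print(math.fmod(7, -3))   # 1.0
```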

ElementWiseGreater

Computes the greater than logical operation elementwise on the input tensors.

out[0] = in[0] > in[1]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)

ElementWiseGreaterEqual

Computes the greater than or equal logical operation elementwise on the input tensors.

out[0] = in[0] >= in[1]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)

ElementWiseLess

Computes the less than logical operation elementwise on the input tensors.

out[0] = in[0] < in[1]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)

ElementWiseLessEqual

Computes the less than or equal logical operation elementwise on the input tensors. Tensors must have the same rank, and tensor dimensions must be compatible. Two dimensions are compatible when they are equal or when the in[1] dimension is 1 (i.e., in[1] is broadcast across in[0]).

out[0] = in[0] <= in[1]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)

ElementWiseLog

Computes logarithm of input element-wise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseMaximum

Element-wise maximum of two tensors.

out[0] = max(in[0], in[1])

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]

ElementWiseMinimum

Element-wise minimum of two tensors.

out[0] = min(in[0], in[1])

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]

ElementWiseMod

Performs element-wise binary modulus. The sign of the output elements matches that of the corresponding elements in in[1] (the divisor). Note that behavior is undefined when elements from both in[0] and in[1] are 0.

\[\mbox{out[0]} = \mbox{in[0]} \ \% \ \mbox{in[1]}\]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Does not support floating point data types

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]
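The divisor-sign convention described above matches Python's % operator (and contrasts with FMOD, which follows the dividend):

```python
# MOD: the remainder takes the sign of the divisor (in[1])
print(-7 % 3)   # 2
print(7 % -3)   # -2
```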

ElementWiseMultiply

Multiplies two tensors element-wise. The output is the result of multiplying the first input tensor by the second.

out[0] = in[0] * in[1]

Refer to ElementWiseMultiply backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[1])[i] is dynamic then shape(in[0])[i] must be dynamic or must be compatible for broadcasting.

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(in[1])[i] must be dynamic or must be compatible for broadcasting.

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if both shape(in[0])[i] and shape(in[1])[i] are dynamic or if either shape(in[0])[i] or shape(in[1])[i] is dynamic and the other has a compatible dimension for broadcasting then shape(out[0])[i] must be dynamic.

ElementWiseNeg

Computes numerical negative of input element-wise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseNeuron

Computes the specified operation on the input element-wise.

Available element-wise operations:

Elu - Computes the Exponential Linear Unit operation.

\[\mbox{out[0]} = \max(0, \mbox{in[0]}) + \min(0, \alpha \times (e^{\mbox{in[0]}} - 1))\]

References:

Gelu - Computes the Gaussian error linear unit operation.

\[\mbox{out[0]} = \frac{\mbox{in[0]}}{2} \times (1 + \text{erf} (\frac{\mbox{in[0]}}{\sqrt{2}}))\]

References:

HardSigmoid - Computes the HardSigmoid function element-wise on the input.

\[\mbox{out[0]} = \max(0, \min(1, \alpha \times \mbox{in[0]} + \beta))\]

References:

HardSwish - Computes the HardSwish operation.

\[\mbox{out[0]} = \frac{\mbox{in[0]} \times \max(0, \min(6, \mbox{in[0]} + 3))}{6}\]

References:

Relu - Computes the Rectified Linear Unit operation.

\[\mbox{out[0]} = \max(0, \mbox{in[0]})\]

References:

ReluMinMax - Computes the Rectified Linear Unit Min Max operation.

\[\mbox{out[0]} = \min(\text{max_value}, \max(\text{min_value}, \mbox{in[0]}))\]

where min_value <= max_value.

References:

Sigmoid - Computes the sigmoid activation function element-wise.

\[\mbox{out[0]} = \frac{1}{1 + e^{(-1 \times \mbox{in[0]})}}\]

References:

Softplus - Computes the softplus function to the input tensor element-wise. Note that when \((\mbox{in[0]} \times \beta)\) > threshold the implementation reverts to a linear function to preserve numerical stability.

\[\begin{split}\mbox{out[0]} = \begin{cases} \mbox{in[0]} & \text{if}\ \mbox{in[0]} \times \beta > \text{threshold}, \\ \frac{1}{\beta} \times \ln{(e^{(\beta \times \mbox{in[0]})} + 1)} & \text{otherwise} \end{cases}\end{split}\]

References:

Tanh - Computes the hyperbolic tangent function element-wise.

\[\mbox{out[0]} = \tanh{(\mbox{in[0]})}\]

References:
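Several of the formulas above can be checked with a few lines of scalar Python. This sketch is illustrative only; the Softplus threshold default used here is an assumption, since the op lists no threshold parameter:

```python
import math

def hard_sigmoid(x, alpha=0.2, beta=0.5):
    # out = max(0, min(1, alpha * x + beta))
    return max(0.0, min(1.0, alpha * x + beta))

def hard_swish(x):
    # out = x * max(0, min(6, x + 3)) / 6
    return x * max(0.0, min(6.0, x + 3.0)) / 6.0

def relu_min_max(x, min_value, max_value):
    # out = min(max_value, max(min_value, x))
    return min(max_value, max(min_value, x))

def softplus(x, beta=1.0, threshold=20.0):
    # reverts to the identity when x * beta exceeds the threshold,
    # preserving numerical stability as described above
    if x * beta > threshold:
        return x
    return math.log1p(math.exp(beta * x)) / beta
```

With the listed defaults, hard_sigmoid(0) gives 0.5 and softplus(0) gives ln 2, matching the formulas.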

Refer to ElementWiseNeuron backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

operation

Specifies the element-wise operation to use.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Values:

    • ELU = 0,

    • GELU = 1,

    • HARD_SIGMOID = 2,

    • HARD_SWISH = 3,

    • RELU = 4,

    • RELU_MIN_MAX = 5,

    • SIGMOID = 6,

    • SOFTPLUS = 7,

    • TANH = 8

alpha (\(\alpha\))

The alpha (\(\alpha\)) value for Elu and HardSigmoid function.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default:

    • 1.0 for QNN_OP_ELEMENT_WISE_NEURON_OPERATION_ELU

    • 0.2 for QNN_OP_ELEMENT_WISE_NEURON_OPERATION_HARD_SIGMOID

  • Constraints:

    • Must have operation set to ELU or HARD_SIGMOID for this parameter to be valid.

beta (\(\beta\))

The beta (\(\beta\)) value for HardSigmoid and Softplus functions.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default:

    • 0.5 for QNN_OP_ELEMENT_WISE_NEURON_OPERATION_HARD_SIGMOID

    • 1.0 for QNN_OP_ELEMENT_WISE_NEURON_OPERATION_SOFTPLUS

  • Constraints:

    • Must have operation set to HARD_SIGMOID or SOFTPLUS for this parameter to be valid.

    • Value: beta provided for SOFTPLUS must be > 0.

min_value

The minimum value in operation RELU_MIN_MAX

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: None

  • Constraints:

    • Must have operation set to RELU_MIN_MAX for this parameter to be valid.

    • Must be provided when operation is set to RELU_MIN_MAX.

    • Value: min_value must be <= max_value

max_value

The maximum value in operation RELU_MIN_MAX

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: None

  • Constraints:

    • Must have operation set to RELU_MIN_MAX for this parameter to be valid.

    • Must be provided when operation is set to RELU_MIN_MAX.

threshold

Values above the threshold revert to a linear function. Note that this parameter is disabled by default or when set to a negative value.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: -1.0

  • Constraints:

    • Must have operation set to SOFTPLUS for this parameter to be valid.

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.
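The formulas above can be sketched in plain Python, one element at a time. This is an illustrative reference under the default parameter values listed below, not the QNN implementation:

```python
import math

# Reference sketch of the ElementWiseNeuron operations; each function maps
# one input element to one output element.
def elu(x, alpha=1.0):
    return max(0.0, x) + min(0.0, alpha * (math.exp(x) - 1.0))

def gelu(x):
    return x / 2.0 * (1.0 + math.erf(x / math.sqrt(2.0)))

def hard_sigmoid(x, alpha=0.2, beta=0.5):
    return max(0.0, min(1.0, alpha * x + beta))

def hard_swish(x):
    return x * max(0.0, min(6.0, x + 3.0)) / 6.0

def relu_min_max(x, min_value, max_value):
    return min(max_value, max(min_value, x))

def softplus(x, beta=1.0, threshold=-1.0):
    # A negative threshold disables the linear cutoff; otherwise revert
    # to a linear function once x * beta exceeds the threshold.
    if threshold >= 0.0 and x * beta > threshold:
        return x
    return math.log(math.exp(beta * x) + 1.0) / beta
```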

ElementWiseNot

Logical NOT of a tensor element-wise:

out[0] = 1 if in[0] == 0
out[0] = 0 otherwise

where non-zero values are treated as true and zero as false.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseNotEqual

Computes the not-equal logical operation element-wise on the input tensors.

out[0] = in[0] != in[1]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)
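The rank-max(N,M) output shape follows from standard right-aligned broadcasting of the two input shapes. A minimal sketch of that shape rule (an illustration, not part of the QNN API):

```python
def broadcast_shape(a, b):
    # Right-align the two shapes and take the larger extent per axis;
    # a size-1 axis stretches to match the other operand.
    out = []
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and 1 not in (da, db):
            raise ValueError("shapes not broadcastable")
        out.append(max(da, db))
    return list(reversed(out))
```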

ElementWiseOr

Logical ORs two tensors element-wise:

out[0] = in[0] || in[1]

where non-zero values are treated as true and zero as false.

References:

Inputs

in[0]

1st input tensor

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank N

in[1]

2nd input tensor

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: A Tensor of rank M

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)

ElementWisePower

Given base and exponent in input tensors, computes the (base^exponent) element-wise.

out[0] = in[0] ^ in[1]

Refer to ElementWisePower backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Tensor specifying the base.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[1])[i] is dynamic then shape(in[0])[i] must be dynamic or must be compatible for broadcasting.

in[1]

Tensor specifying the exponent.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(in[1])[i] must be dynamic or must be compatible for broadcasting.

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if both shape(in[0])[i] and shape(in[1])[i] are dynamic or if either shape(in[0])[i] or shape(in[1])[i] is dynamic and the other has a compatible dimension for broadcasting then shape(out[0])[i] must be dynamic.

ElementWiseRound

Computes elementwise rounding on the input.

The operation rounds the values in the input to the nearest integer.

Halves are rounded to the nearest even integer.

E.g: 2.5 is rounded to 2.0; -3.5 is rounded to -4.0.
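This round-half-to-even convention matches Python's built-in round() on floats, which makes a quick reference check possible:

```python
# Round-half-to-even ("banker's rounding"), as used by ElementWiseRound.
# Python 3's built-in round() follows the same convention on floats.
values = [2.5, -3.5, 1.5, 2.4]
rounded = [float(round(v)) for v in values]  # [2.0, -4.0, 2.0, 2.0]
```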

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseRsqrt

Computes the reciprocal of the square root of the input tensor element-wise. Negative elements are unsupported. If an element is negative, behavior is undefined.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseSelect

Given a boolean condition tensor, selects the value in the first input tensor (if true) or the value in the second input tensor (if false). Note that the three input tensors must either be of the same shape or be broadcastable to a common shape.

out[0] = in[0] ? in[1] : in[2]

References:

Inputs

in[0]

condition input

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Shape: Rank > 0

in[2]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank K

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N, M, K)
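For same-shape inputs the selection rule can be sketched directly; broadcasting to a common shape is omitted here for brevity (illustration only, not the QNN API):

```python
def select(cond, a, b):
    # Element-wise select on equally shaped flat lists; non-zero condition
    # values are treated as true.
    return [x if c else y for c, x, y in zip(cond, a, b)]
```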

ElementWiseSign

Computes the sign of the input tensor element-wise.

\[\begin{split}\mbox{out[0]} = \text{Sign}(\mbox{in[0]}) = \begin{cases} 1 & \text{if}\ \mbox{in[0]} > 0, \\ 0 & \text{if}\ \mbox{in[0]} = 0, \\ -1 & \text{otherwise} \end{cases}\end{split}\]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseSin

Computes the sine of the input element-wise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseSoftplus

Applies the softplus function to the input tensor element-wise to produce an output tensor.

\[\mbox{out[0]} = \frac{1}{\beta} * \ln{(e^{\beta * \mbox{in[0]}} + 1)}\]

Note that when in[0] * \(\beta\) > threshold the implementation reverts to a linear function to preserve numerical stability.
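A minimal one-element sketch of this formulation, assuming a non-negative threshold enables the linear cutoff (see the threshold parameter below); this is an illustration, not the QNN implementation:

```python
import math

def softplus(x, beta=1.0, threshold=-1.0):
    # A negative threshold disables the cutoff. Otherwise, once x * beta
    # exceeds the threshold, revert to linear for numerical stability.
    if threshold >= 0.0 and x * beta > threshold:
        return x
    return math.log(math.exp(beta * x) + 1.0) / beta
```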

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

beta (\(\beta\))

The beta (\(\beta\)) value for the Softplus formulation.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1.0

  • Constraints:

    • Value: beta must be > 0

threshold

Values above the threshold revert to a linear function. Note that this parameter is disabled by default or when set to a negative value.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: -1.0

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ElementWiseSquaredDifference

Computes the element-wise difference between two tensors by subtracting the second tensor from the first and squaring the result.

out[0] = (in[0] - in[1]) * (in[0] - in[1])

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]

ElementWiseSquareRoot

Computes the square root of the input tensor element-wise. Negative elements are unsupported. If an element is negative, behavior is undefined.

Refer to ElementWiseSquareRoot backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.

ElementWiseSubtract

Subtracts two tensors element-wise. The output is the result of subtracting the second input tensor from the first.

out[0] = in[0] - in[1]

Refer to ElementWiseSubtract backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[1])[i] is dynamic then shape(in[0])[i] must be dynamic or must be compatible for broadcasting.

in[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(in[1])[i] must be dynamic or must be compatible for broadcasting.

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = max(N,M)

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if both shape(in[0])[i] and shape(in[1])[i] are dynamic or if either shape(in[0])[i] or shape(in[1])[i] is dynamic and the other has a compatible dimension for broadcasting then shape(out[0])[i] must be dynamic.

ElementWiseUnary

Computes the specified unary operation on the input element-wise. Note that operations are classified as either Numerical or Logical.

Available element-wise operations:

ABS - Computes the absolute value element-wise.

\[\mbox{out[0]} = \lvert \mbox{in[0]} \rvert\]

References:

ASIN - Computes the arcsine element-wise. Note that if elements are outside the range [-1, 1] behavior is undefined.

\[\mbox{out[0]} = \arcsin{(\mbox{in[0]})}\]

References:

ATAN - Computes the arctangent element-wise.

\[\mbox{out[0]} = \arctan{(\mbox{in[0]})}\]

References:

CEIL - Applies ceil() on the input element-wise. E.g. ceil([-1.5, 1.2, 2.0]) will output [-1.0, 2.0, 2.0].

\[\mbox{out[0]} = \text{ceil}(\mbox{in[0]})\]

References:

COS - Computes the cosine of the input element-wise.

\[\mbox{out[0]} = \cos{(\mbox{in[0]})}\]

References:

EXP - Computes the exponential of the input element-wise.

\[\mbox{out[0]} = e^{(\mbox{in[0]})}\]

References:

FLOOR - Applies floor() on the input element-wise. E.g. floor([-1.5, 1.2, 2.0]) will output [-2.0, 1.0, 2.0].

\[\mbox{out[0]} = \text{floor}(\mbox{in[0]})\]

References:

LOG - Computes the logarithm of the input element-wise. Note that if an element is 0 behavior is undefined.

\[\mbox{out[0]} = \log{(\mbox{in[0]})}\]

References:

NEG - Computes the numerical negative of the input element-wise.

\[\mbox{out[0]} = -1 \times \mbox{in[0]}\]

References:

NOT - Applies logical NOT on the input element-wise.

\[\begin{split}\mbox{out[0]} = \text{Not}(\mbox{in[0]}) = \begin{cases} 1 & \text{if}\ \mbox{in[0]} = 0, \\ 0 & \text{otherwise.} \end{cases}\end{split}\]

References:

RECIPROCAL - Computes the reciprocal of the input element-wise. Note that if an element is 0 behavior is undefined.

\[\mbox{out[0]} = \frac{1}{\mbox{in[0]}}\]

References:

ROUND - Rounds to the nearest integer element-wise. Note that elements at halves are rounded to the nearest even number. E.g. round([2.5, -4.5, 1.5]) will output [2.0, -4.0, 2.0].

\[\mbox{out[0]} = \text{round}(\mbox{in[0]})\]

References:

RSQRT - Computes the reciprocal of the square root element-wise. Negative elements are unsupported. If an element is negative or 0, behavior is undefined.

\[\mbox{out[0]} = \frac{1}{\sqrt{\mbox{in[0]}}}\]

References:

SIGN - Computes the sign of the input element-wise.

\[\begin{split}\mbox{out[0]} = \text{sign}(\mbox{in[0]}) = \begin{cases} 1 & \text{if}\ \mbox{in[0]} > 0, \\ 0 & \text{if}\ \mbox{in[0]} = 0, \\ -1 & \text{otherwise.} \end{cases}\end{split}\]

References:

SIN - Computes the sine of the input element-wise.

\[\mbox{out[0]} = \sin{(\mbox{in[0]})}\]

References:

SQRT - Computes the square root of the input element-wise. Negative elements are unsupported. If an element is negative, behavior is undefined.

\[\mbox{out[0]} = \sqrt{\mbox{in[0]}}\]

References:

Refer to ElementWiseUnary backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Note that the supported data type depends on the classification of the selected operation and falls into one of the following categories:

  • Numerical: BACKEND_SPECIFIC

  • Logical: BACKEND_SPECIFIC, QNN_DATATYPE_BOOL_8

See operation parameter for classification per operation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

operation

Specifies the unary element-wise operation to use.

Operations can be classified as one of the following:

  • Numerical operations: ABS, ASIN, ATAN, CEIL, COS, EXP, FLOOR, LOG, NEG, RECIPROCAL, ROUND, RSQRT, SIGN, SIN, and SQRT.

  • Logical operations: NOT.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Values:

    • ABS = 0,

    • ASIN = 1,

    • ATAN = 2,

    • CEIL = 3,

    • COS = 4,

    • EXP = 5,

    • FLOOR = 6,

    • LOG = 7,

    • NEG = 8,

    • NOT = 9,

    • RECIPROCAL = 10,

    • ROUND = 11,

    • RSQRT = 12,

    • SIGN = 13,

    • SIN = 14,

    • SQRT = 15

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.
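The enum-to-function mapping above can be sketched in plain Python for the scalar case (an illustration, not the QNN API):

```python
import math

# Sketch of the ElementWiseUnary operations keyed by the enum values
# listed above.
UNARY_OPS = {
    0: abs,                            # ABS
    1: math.asin,                      # ASIN
    2: math.atan,                      # ATAN
    3: math.ceil,                      # CEIL
    4: math.cos,                       # COS
    5: math.exp,                       # EXP
    6: math.floor,                     # FLOOR
    7: math.log,                       # LOG
    8: lambda x: -x,                   # NEG
    9: lambda x: 1 if x == 0 else 0,   # NOT (logical)
    10: lambda x: 1.0 / x,             # RECIPROCAL
    11: round,                         # ROUND (half to even)
    12: lambda x: 1.0 / math.sqrt(x),  # RSQRT
    13: lambda x: (x > 0) - (x < 0),   # SIGN
    14: math.sin,                      # SIN
    15: math.sqrt,                     # SQRT
}

def unary(operation, values):
    fn = UNARY_OPS[operation]
    return [fn(v) for v in values]
```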

ElementWiseXor

Computes the logical XOR operation element-wise on two input tensors:

out[0] = in[0] ^ in[1]

where non-zero values are treated as true and zero as false.

References:

Inputs

in[0]

1st input tensor

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank N

in[1]

2nd input tensor

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: A Tensor of rank M

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8, backend specific

  • Shape: a tensor of rank = max(N,M)
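For same-shape inputs the XOR rule reduces to comparing the truth values of corresponding elements (sketch only; broadcasting is omitted):

```python
def logical_xor(a, b):
    # Element-wise logical XOR; non-zero values are treated as true and the
    # result is reported as 0/1, mirroring QNN_DATATYPE_BOOL_8 semantics.
    return [int(bool(x) != bool(y)) for x, y in zip(a, b)]
```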

Elu

The Exponential Linear Unit operation computes:

output = max(0, input) + min(0, alpha * (exp(input) - 1))
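One element at a time, this is (reference sketch, not the QNN implementation):

```python
import math

def elu(x, alpha=1.0):
    # One-element Elu per the formula above.
    return max(0.0, x) + min(0.0, alpha * (math.exp(x) - 1.0))
```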

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

alpha

Alpha parameter

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1.0

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ExpandDims

This operation inserts a dimension of size 1 at the dimension index axis or the dimension indices axes of the in[0] tensor. The number of elements in the output tensor remains the same as in the input tensor. This functionality can also be achieved using the Reshape operation.

Refer to ExpandDims backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank N > 0

Parameters

axis

Scalar index specifying the dimension to insert a value of 1.

Both scalar axis and its tensor counterpart axes are optional, but at least one of them must be provided. However, if axes is non-empty, the value of axis is ignored.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: Must be in range [0,M)

axes

Indices specifying the dimensions to insert a value of 1.

Both scalar axis and its tensor counterpart axes are optional, but at least one of them must be provided. When both are specified, values of axes take precedence over scalar axis.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [K]

  • Default: {0,..,0}

  • Constraints:

    • Value: Must be in range [0, M)

    • Value: Must be unique

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M, where M = N + 1 if axis is used. Otherwise, M = N + K.

  • Dynamic Shape: All dimensions other than the inserted dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0] other than the expanded dimensions

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then the corresponding dimension of out[0] after being expanded must be dynamic.
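One plausible reading of the axes semantics, where each index refers to a position in the output shape, can be sketched as a shape transformation (illustration only, not the QNN implementation):

```python
def expand_dims(shape, axes):
    # Compute the output shape of ExpandDims: insert a size-1 dimension at
    # each index in `axes`, with indices interpreted against the output
    # shape of rank M = N + len(axes).
    out = list(shape)
    for a in sorted(axes):
        out.insert(a, 1)
    return out
```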

ExtractGlimpse

Extracts a set of windows called glimpses from the input tensor. If the window only partially overlaps the input, the non-overlapping areas will be filled with noise.

References:

Inputs

in[0]

Input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel]

in[1]

Glimpse offsets

  • If normalized and centered, the offsets point to the center of the glimpse. Otherwise, the offsets point to the upper-left of the glimpse.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, 2] each batch with format [y, x]

Parameters

size

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [2] with format [glimpse_height, glimpse_width]

centered

If true, offset coordinates are relative to the center of image. Otherwise, offset coordinates are relative to the upper-left corner of the image.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

normalized

If not normalized and not centered, the offsets are treated as the number of pixels from the upper-left of the image.

  • (0.0,0.0) corresponds to the image upper-left corner.

  • (height - 1.0,width - 1.0) corresponds to the image bottom-right corner.

If not normalized and centered, the offsets are treated as the number of pixels from the center of the image.

  • (0.0,0.0) corresponds to the image center.

  • (height / 2.0,width / 2.0) corresponds to the image bottom-right corner.

If normalized and not centered, normalized coordinates are in the range [0.0,1.0] and correspond to the minimum and maximum of each height and width.

  • (0.0,0.0) corresponds to the image upper-left corner.

  • (0.5,0.5) corresponds to the image center.

  • and (1.0,1.0) corresponds to the image bottom-right corner.

If normalized and centered, normalized coordinates are in the range [-1.0,1.0] where

  • (-1.0,-1.0) corresponds to the image upper-left corner.

  • (0.0,0.0) corresponds to the image center.

  • and (1.0,1.0) corresponds to the image bottom-right corner.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

noise

Indicates the noise distribution.

  • 0: Uniform

  • 1: Gaussian

  • 2: Zeroes

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • Uniform = 0,

    • Gaussian = 1,

    • Zeroes = 2

Outputs

out[0]

Output data

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, glimpse_height, glimpse_width, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]

ExtractPatches

Extracts patches from the input image in[0]. Patches have shape size and are spaced stride apart in the input image. All extracted patches are flattened and stacked in the channel_out dimension of out[0].

References:

Inputs

in[0]

Input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel_in]

Parameters

size

The size of the extracted patches.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [height_size, width_size]

  • Constraints:

    • Value: Sizes must be > 0

stride

Determines how far apart the centers of two consecutive patches are in the image.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

rate

Specifies how far two consecutive patch samples are in the input.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [height_rate, width_rate]

  • Constraints:

    • Value: Rates must be > 0

padding

When set to QNN_OP_EXTRACT_PATCHES_PADDING_VALID: Only patches which are fully contained in the input image are included.

When set to QNN_OP_EXTRACT_PATCHES_PADDING_SAME: All patches with starting points inside the input are included and areas outside the input default to zero.

Note that padding has no effect on the size of each patch and only determines how many patches are extracted.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Values:

    • VALID = 0,

    • SAME = 1

Outputs

out[0]

Output patches : contains image patches of size height_size * width_size * channel_in flattened in the channel_out dimension. Spatial dimensions height_out and width_out are functions of size, stride, rate, and padding.

// When padding is set to QNN_OP_EXTRACT_PATCHES_PADDING_VALID
dilated_height = size[0] + (size[0] - 1) * (rate[0] - 1)
dilated_width = size[1] + (size[1] - 1) * (rate[1] - 1)
shape(out[0])[height_out] = floor((shape(in[0])[height] - dilated_height) / stride[0] + 1)
shape(out[0])[width_out] = floor((shape(in[0])[width] - dilated_width) / stride[1] + 1)

// When padding is set to QNN_OP_EXTRACT_PATCHES_PADDING_SAME
shape(out[0])[height_out] = ceil(shape(in[0])[height] / stride[0])
shape(out[0])[width_out] = ceil(shape(in[0])[width] / stride[1])

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height_out, width_out, channel_out], where channel_out = height_size * width_size * channel_in.

  • Constraints:

    • Datatype: Same datatype as in[0]
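The output-shape formulas above can be collected into a small helper (a sketch mirroring the stated formulas, not the QNN API):

```python
import math

def extract_patches_out_shape(in_shape, size, stride, rate, padding):
    # Spatial output dims of ExtractPatches per the formulas above.
    # padding: 0 = VALID, 1 = SAME (QNN_OP_EXTRACT_PATCHES_PADDING_*).
    batch, height, width, channel_in = in_shape
    if padding == 0:  # VALID: only fully contained patches
        dilated_h = size[0] + (size[0] - 1) * (rate[0] - 1)
        dilated_w = size[1] + (size[1] - 1) * (rate[1] - 1)
        h_out = (height - dilated_h) // stride[0] + 1
        w_out = (width - dilated_w) // stride[1] + 1
    else:  # SAME: every in-bounds starting point
        h_out = math.ceil(height / stride[0])
        w_out = math.ceil(width / stride[1])
    return [batch, h_out, w_out, size[0] * size[1] * channel_in]
```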

FullyConnected

The FullyConnected operation connects all input elements with each output element through weights and biases. The weights tensor has shape [m, n], where n is the number of input elements and m is the number of units in the weights, output, and optional biases. The input activation must be reshapable to [batch, n] (see the Reshape operation definition) and the operation computes mathematically:

outputVector = ( inputAsVector * weightsMatrix ) + biasesVector

for one batch of input, where * denotes matrix multiply operation.
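For a [batch, n] input, this is a plain matrix multiply against the transposed [m, n] weights plus the bias (reference sketch, not the QNN implementation):

```python
def fully_connected(inputs, weights, biases=None):
    # inputs: [batch, n]; weights: [m, n]; biases: [m] (defaults to zeros).
    # out[b][j] = sum_i inputs[b][i] * weights[j][i] + biases[j]
    m = len(weights)
    if biases is None:
        biases = [0.0] * m
    return [
        [sum(x * w for x, w in zip(row, weights[j])) + biases[j] for j in range(m)]
        for row in inputs
    ]
```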

Refer to FullyConnected backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [n] or Rank >= 2 reshapable to [batch, n]

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

    • Shape: Rank >= 2 must be reshapable to [batch, n]

in[1]

weights

  • Mandatory: true

  • Data type: backend specific

  • Shape: [m, n]

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: if rank(in[0]) = 1, then shape(in[0])[n] and shape(in[1])[n] must be both dynamic or both static. Otherwise if rank(in[0]) > 1, shape(in[1])[n] must be static.

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [m]

  • Dynamic Shape: All dimensions can be dynamic.

  • Default: [0]

  • Constraints:

    • Dynamic Shape: shape(in[1])[m] and shape(in[2])[m] must be both dynamic or both static.

Parameters

keep_dims

If true, the rank of in[0] and out[0] will remain the same, and all but the last dimension will be equal in shape.

For dimensions to be preserved, the product of the batch dimensions of in[0] (all but the last dimension) must be equal to batch, defined by in[0] above. This is because:

total # of outputs = (total # of batches) * m

Since the total # of outputs and m are the same regardless of keep_dims, the total # of batches must remain the same as well.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: If the rank of in[0] is 1: [m]. If the rank of in[0] is > 1: [batch, m], unless keep_dims is true, then […, m] where … is all but the last dimension of in[0]

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: If rank(in[0]) > 1 and keep_dims is set to false and any dimension for in[0] is dynamic then shape(out[0])[batch] must be dynamic. Otherwise, if rank(in[0]) > 1 and keep_dims is set to true, for each dimension with the exemption of the last dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.

    • Dynamic Shape: If shape(in[1])[m] is dynamic then shape(out[0])[m] must be dynamic.

Gather

Gather input data from the specified axis and indices.

Note that for indices generated by other operations (e.g. NonZero), a value of -1 may be provided to indicate an index for the Gather operation to skip/ignore.

Refer to Gather backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Input activation, also known as table

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

in[1]

Indices in in[0] to extract based on axis.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32, backend specific

  • Shape: a tensor of rank k

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Value: Indices must be in range [0, shape(in[0])[axis] - 1]

    • Shape: Rank > 0

Parameters

axis

axis : The axis to gather on.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be in range [0, n-1]

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = n + k - 1

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[1])[i] is dynamic then shape(out[0])[axis + i] must be dynamic. For all other dimensions except the axis dimension, if shape(in[0])[i] is dynamic then the corresponding dimension of out[0] must be dynamic.
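The axis-based lookup and the resulting rank n + k - 1 can be sketched on nested lists (illustration only; the -1 skip value is not handled here):

```python
def gather(data, indices, axis=0):
    # Sketch of Gather: dimension `axis` of `data` is replaced by the shape
    # of `indices`, picking slices of `data` by index.
    if axis == 0:
        if isinstance(indices, int):
            return data[indices]
        return [gather(data, i, 0) for i in indices]
    return [gather(row, indices, axis - 1) for row in data]
```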

GatherElements

Gather input data from the specified axis and indices, where each index value pertains to a single value from the input data.

The following example demonstrates how the output is produced in a 3D case (n = 3):

output[i][j][k] = input[index[i][j][k]][j][k] (if axis = 0)
output[i][j][k] = input[i][index[i][j][k]][k] (if axis = 1)
output[i][j][k] = input[i][j][index[i][j][k]] (if axis = 2)

Note that for indices generated by other operations (e.g. NonZero), a value of -1 may be provided to indicate an index for the GatherElements operation to skip/ignore.

References:

Inputs

in[0]

Input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n

  • Constraints:

    • Shape: Rank > 0

in[1]

Index tensor : contains indices in in[0] to extract based on axis.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32

  • Shape: a tensor of rank n

  • Constraints:

    • Value: Indices must be in range [0, shape(in[0])[axis] - 1]

Parameters

axis

axis : The axis of in[0] to gather on.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: Must be in range [0, n-1]

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[1]
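The per-element rule above can be transcribed directly for the 2D case (illustration only, not the QNN implementation):

```python
def gather_elements_2d(data, index, axis):
    # Direct transcription of the GatherElements rule for rank 2:
    # every output element pulls exactly one input element.
    rows, cols = len(index), len(index[0])
    if axis == 0:
        return [[data[index[i][j]][j] for j in range(cols)] for i in range(rows)]
    return [[data[i][index[i][j]] for j in range(cols)] for i in range(rows)]
```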

GatherNd

Gather slices of input data from the specified indices. This is similar to the Gather operation, where indices define slices into the first dimension of the input data. In GatherNd, indices define slices into the first N dimensions of the input data, where N = Shape(in[1])[k-1].

Note that for indices generated by other operations (e.g. NonZero), a value of -1 is permitted to indicate an index that the GatherNd operation should skip/ignore.
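The slice-gathering behavior can be sketched in NumPy for the batch_dims = 0 case (names are illustrative, not part of the QNN API):

```python
import numpy as np

def gather_nd(data, indices):
    # Assumes batch_dims == 0. The last dimension of `indices` holds a
    # (possibly partial) coordinate into the leading axes of `data`;
    # partial coordinates gather whole row slices.
    out_shape = indices.shape[:-1] + data.shape[indices.shape[-1]:]
    flat = indices.reshape(-1, indices.shape[-1])
    gathered = np.stack([data[tuple(ix)] for ix in flat])
    return gathered.reshape(out_shape)

table = np.arange(8).reshape(2, 2, 2)   # rank n = 3
idx = np.array([[0, 1], [1, 0]])        # rank k = 2, last dim = 2 < n
out = gather_nd(table, idx)             # gathers row slices, shape (2, 2)
# out == [[2, 3], [4, 5]]
```

The output rank matches the formula below: n + k - Shape(in[1])[k-1] - batch_dims - 1 = 3 + 2 - 2 - 0 - 1 = 2.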

References:

Inputs

in[0]

Input activation, also known as table

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n

  • Constraints:

    • Shape: Rank > 0

in[1]

Indices in in[0] to extract. Note that when Shape(in[1])[k-1] is equal to n, the indices provided index the full rank of in[0] and gather scalars. Otherwise, when Shape(in[1])[k-1] is less than n, the indices provided refer to row slices of in[0] to gather.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32, backend specific

  • Shape: a tensor of rank k

  • Constraints:

    • Shape: Shape(in[1])[k-1] must be > 0 and <= (n - batch_dims)

    • Value: Indices must be non-negative and within the range of the corresponding dimension of in[0]

    • Shape: Rank > 0

Parameters

batch_dims

The number of batch dimensions. Note that the gather starts indexing from dimension batch_dims of in[0], i.e. in[0][batch_dims:].

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: batch_dims < min(n, k)

    • Shape: Shape(in[0])[:batch_dims] must equal Shape(in[1])[:batch_dims]

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank = n + k - Shape(in[1])[k-1] - batch_dims - 1, where n is the rank of in[0] and k is the rank of in[1].

  • Constraints:

    • Datatype: Same datatype as in[0]

Gelu

The Gaussian error linear unit operation computes:

out[0] = in[0]/2 * (1 + erf(in[0] / sqrt(2)))
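The formula can be evaluated directly using the exact erf form; a minimal scalar sketch:

```python
import math

def gelu(x):
    # Exact erf form: x/2 * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# gelu(0.0) is exactly 0; large positive inputs pass through almost
# unchanged, large negative inputs are squashed toward 0.
```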

References:

Inputs

in[0]

Input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

Output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

GenerateProposals

Generates bounding-box proposals from an input feature-map by applying a transform to a set of predefined bounding-box anchors. The number of proposals is then limited by applying hard non-max suppression (NMS).

References:

Inputs

in[0]

Input feature map

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,height,width,num_anchors]

in[1]

Transform tensor. Elements can be considered 4-tuples of (dx,dy,dw,dh), as described in the reference.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,height,width,num_anchors*4]

in[2]

Anchor tensor. Elements can be considered 4-tuples of [x1,y1,x2,y2] coordinates in the original image.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_anchors,4]

  • Constraints:

    • Datatype: Same datatype as in[0]

in[3]

Image sizes tensor. Elements should be interpreted as [image_height,image_width] for each image in the batch.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,2]

  • Constraints:

    • Datatype: Same datatype as in[2]

Parameters

img_size_ratio

Gives the ratio between the original image and the feature map, in the form [height_ratio, width_ratio]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: [2] with elements [height_ratio, width_ratio]

min_size

Sets a minimum size for boxes before applying NMS. Boxes with width or height smaller than this value will be filtered out.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0

pre_nms_limit

Sets a maximum number of boxes before applying NMS. The boxes with the lowest scores will be dropped to achieve this limit.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: No limit is applied

post_nms_limit

Sets a maximum number of boxes after applying NMS. The boxes with the lowest scores will be dropped to achieve this limit.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: No limit is applied

iou_threshold

IoU threshold for the NMS operation

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

bbox_xform_clip

When set to true, the height and width transform values (dh, dw) are clipped to an upper bound, limiting the bounding-box height and width after transformation. Otherwise, no clipping is done.

bbox_xform_clip_value = log(1000.0 / 16.0)
dh = min(dh, bbox_xform_clip_value)
dw = min(dw, bbox_xform_clip_value)
  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Gives the score for each bounding box. Boxes corresponding to a given input batch element are grouped contiguously.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_boxes]. Max and current dimension value for num_boxes may differ, and current value will be updated by the backend.

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

Gives the position for each bounding box as a 4-tuple, (x1,y1,x2,y2). Positions in this output correspond to the position of the same box in out[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_boxes,4]. Max and current dimension value for num_boxes may differ, and will be updated by the backend.

  • Constraints:

    • Datatype: Same datatype as in[3]

out[2]

Gives the batch index of each bounding box. Positions in this output correspond to the position of the same box in out[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [num_boxes]. Max and current dimension value for num_boxes may differ, and will be updated by the backend.

GetSparseIndices

Gets the M specified indices from a sparse tensor.

See the CreateSparse op definition for a description of K and partially sparse tensors and a description of the indices tensor.

References:

Inputs

in[0]

input

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Must be a sparse tensor.

Parameters

None

Outputs

out[0]

indices

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32, QNN_DATATYPE_INT_32

  • Shape: [M, K] where 0 < K <= N. M may be dynamically sized and K is fixed.

GetSparseValues

Gets the M specified values from a sparse tensor.

See the CreateSparse op definition for a description of K and partially sparse tensors and a description of the indices tensor.

References:

Inputs

in[0]

input

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N with shape \([D_0,...,D_{N-1}]\)

  • Constraints:

    • Must be a sparse tensor.

Parameters

None

Outputs

out[0]

values

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N-K+1 with shape \([M,D_{K},...,D_{N-1}]\)

  • Constraints:

    • Datatype: Same datatype as in[0]

GridSample

Computes the output using input values and pixel locations from the grid. For each output location, the grid specifies input pixel locations x and y for the 4D case, and x, y, and z for the 5D case, which are used to interpolate the output value.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D of shape: [batch, height, width, channel] or 5D of shape: [batch, depth, height, width, channel]

  • Constraints:

    • Shape: Rank(in[0]) must equal 4 or 5

in[1]

Grid : specifies the sampling pixel locations normalized by the spatial dimensions of in[0]. Therefore, most values should be in the range [-1, 1]. For example, x = -1, y = -1 corresponds to the left-top pixel of in[0] and x = 1, y = 1 to the right-bottom pixel of in[0]. Note that if values are outside the range [-1, 1], the corresponding outputs are handled using the method specified by padding_mode.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D of shape: [batch, height_out, width_out, 2] or 5D of shape: [batch, depth_out, height_out, width_out, 3]

  • Constraints:

    • Shape: Same rank as in[0]

Parameters

align_corners

If true, the maximum and minimum (1 and -1) are considered as referring to the center points of in[0] corner pixels. Otherwise, the maximum and minimum refer to the corner points of in[0] corner pixels. Note that pixels of in[0] are considered as squares rather than points.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

mode

Determines the interpolation method. Supported values are 0: BILINEAR, 1: NEAREST. Note when inputs are 5D and mode is set to QNN_OP_GRID_SAMPLE_MODE_BILINEAR the interpolation mode used internally will be trilinear.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • BILINEAR = 0,

    • NEAREST = 1

padding_mode

Determines how the outputs are handled when grid has values outside the range [-1 ,1].

When padding_mode == QNN_OP_GRID_SAMPLE_PADDING_MODE_ZEROS: use 0 for out-of-bound grid locations.

When padding_mode == QNN_OP_GRID_SAMPLE_PADDING_MODE_BORDER: use border values for out-of-bound grid locations.

When padding_mode == QNN_OP_GRID_SAMPLE_PADDING_MODE_REFLECTION: use values reflected by the border for out-of-bound locations. Note for locations far away from the border, it will keep reflecting until the location is in bounds of [-1, 1]. For example, given a pixel location x = -3.5 this will be reflected by the border at -1 and now becomes x’ = 1.5. Now this new pixel location will be reflected by the border at 1 and becomes x’’ = 0.5.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • ZEROS = 0,

    • BORDER = 1,

    • REFLECTION = 2
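The REFLECTION behavior described above can be sketched as a coordinate fold (a simplified scalar helper, not part of the QNN API):

```python
def reflect(coord, low=-1.0, high=1.0):
    # Repeatedly fold an out-of-range coordinate back across the
    # nearest border until it lands inside [low, high].
    while coord < low or coord > high:
        if coord < low:
            coord = low + (low - coord)
        else:
            coord = high - (coord - high)
    return coord

# Matches the worked example: -3.5 reflects to 1.5, then to 0.5.
```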

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D of shape: [batch, height_out, width_out, channel] or 5D of shape: [batch, depth_out, height_out, width_out, channel]

  • Constraints:

    • Shape: Same rank as in[0]

GroupNorm

Applies group normalization to the input tensor. The operation divides the channels into groups and computes within each group the mean (\(\mu\)) and variance (\(\sigma^2\)) for normalization.

The values in the output tensor are computed as

\[\mbox{out[0]}[b,h,w,c] = \frac{(\mbox{in[0]}[b,h,w,c] - \mu)}{\sqrt{\sigma^2 + \epsilon}} * \gamma + \beta\]

where gamma (\(\gamma\)), beta (\(\beta\)) and epsilon (\(\epsilon\)) are parameters.
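For a 4D [batch, height, width, channel] input, the computation can be sketched in NumPy (the function name and epsilon value are illustrative):

```python
import numpy as np

def group_norm(x, gamma, beta, group, eps=1e-5):
    # x: [batch, height, width, channel]; statistics are computed per
    # (batch, group) over the spatial dims and the group's channels.
    b, h, w, c = x.shape
    g = x.reshape(b, h, w, group, c // group)
    mu = g.mean(axis=(1, 2, 4), keepdims=True)
    var = g.var(axis=(1, 2, 4), keepdims=True)
    normed = ((g - mu) / np.sqrt(var + eps)).reshape(b, h, w, c)
    return normed * gamma + beta
```

With group equal to channel this reduces to instance normalization; with group equal to 1 it normalizes over all channels jointly.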

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N, note that the last dimension in the input is the channel i.e. [.., channel].

  • Constraints:

    • Shape: Rank > 0

    • Shape: channel must be evenly divisible by group

in[1]

gamma (\(\gamma\))

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [channel]

  • Default: {1,..,1}

in[2]

beta (\(\beta\))

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [channel]

  • Default: {0,..,0}

Parameters

epsilon (\(\epsilon\))

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

group

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

Gru

Performs a single Gated Recurrent Unit (GRU) layer.

The GRU operation is described by the following equations:

\[\begin{split}{\Large r_{t}} &= {\Large \sigma(W_{xr} x_{t} + W_{hr} h_{t-1} + b_{xr} + b_{hr})} \\ \\ {\Large z_{t}} &= {\Large \sigma(W_{xz} x_{t} + W_{hz} h_{t-1} + b_{xz} + b_{hz})} \\ \\ {\Large n_{t}} &= \begin{cases} {\Large \phi(W_{xn} x_{t} + (r_{t} \odot h_{t-1}) W_{hn} + b_{xn} + b_{hn})} & {\Large \text{if linear_before_reset = 0;}}, \\ \\ {\Large \phi(W_{xn} x_{t} + (r_{t} \odot (W_{hn} h_{t-1} + b_{hn})) + b_{xn})} & {\Large \text{otherwise.}} \\ \end{cases} \\ \\ {\Large h_{t}} &= {\Large (1 - z_{t}) \odot n_{t} + z_{t} \odot h_{t-1}} \\ \\\end{split}\]

where

\(x_{t}\) is the input,

\(r_{t}\) is the reset gate,

\(z_{t}\) is the update gate,

\(n_{t}\) is the new gate,

\(h_{t}\) is the output state,

\(\sigma\) is the logistic sigmoid function,

\(\phi\) is the tanh activation function,

\(\odot\) is the element-wise product of two vectors.

\(W_{xr}\) is the input-to-reset weight matrix,

\(W_{hr}\) is the recurrent-to-reset weight matrix,

\(b_{xr}\) is the input-reset gate bias,

\(b_{hr}\) is the recurrent-reset gate bias,

\(W_{xz}\) is the input-to-update weight matrix,

\(W_{hz}\) is the recurrent-to-update weight matrix,

\(b_{xz}\) is the input-update gate bias,

\(b_{hz}\) is the recurrent-update gate bias,

\(W_{xn}\) is the input-to-new weight matrix,

\(W_{hn}\) is the recurrent-to-new weight matrix,

\(b_{xn}\) is the input-new gate bias,

\(b_{hn}\) is the recurrent-new gate bias.

The output state is maintained internal to the operation across multiple time-steps and inferences. This internal state can be reset at each inference by reading from in[13]. The decision whether or not to reset the internal state is made based on the value of the reset signal in[14]. If the value of this input is non-zero the internal state is reset from the output state input, or set to all zero values if the optional state input is not connected.
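A single time-step of the equations above can be sketched in NumPy (the dict-of-matrices layout is illustrative, not the QNN tensor packing):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W, linear_before_reset=0):
    # One GRU time-step. W holds the weight matrices and biases named
    # as in the text (Wxr, Whr, bxr, ...); x is the input, h the
    # previous output state.
    r = sigmoid(W["Wxr"] @ x + W["Whr"] @ h + W["bxr"] + W["bhr"])
    z = sigmoid(W["Wxz"] @ x + W["Whz"] @ h + W["bxz"] + W["bhz"])
    if linear_before_reset == 0:
        # Reset is applied to h before the linear transformation.
        n = np.tanh(W["Wxn"] @ x + W["Whn"] @ (r * h) + W["bxn"] + W["bhn"])
    else:
        # Linear transformation is applied to h before the reset gate.
        n = np.tanh(W["Wxn"] @ x + r * (W["Whn"] @ h + W["bhn"]) + W["bxn"])
    return (1.0 - z) * n + z * h
```

With all weights and biases zero, r and z are 0.5 and n is 0, so the new state is half the previous state, which is a quick sanity check on the gating arithmetic.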

References:

Inputs

in[0]

X : The input \(x_{t}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 3D of shape [seq_length, batch_size, input_size] if time_major equals true or 3D of shape [batch_size, seq_length, input_size] if time_major equals false.

in[1]

input-to-update weights \(W_{xz}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [hidden_size, input_size]

in[2]

input-to-reset weights \(W_{xr}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [hidden_size, input_size]

in[3]

input-to-new weights \(W_{xn}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [hidden_size, input_size]

in[4]

recurrent-to-update weights \(W_{hz}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [hidden_size, hidden_size]

in[5]

recurrent-to-reset weights \(W_{hr}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [hidden_size, hidden_size]

in[6]

recurrent-to-new weights \(W_{hn}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [hidden_size, hidden_size]

in[7]

input-to-update gate bias \(b_{xz}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: [hidden_size]

  • Default: {0,..,0}

in[8]

input-to-reset gate bias \(b_{xr}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: [hidden_size]

  • Default: {0,..,0}

in[9]

input-to-new gate bias \(b_{xn}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: [hidden_size]

  • Default: {0,..,0}

in[10]

recurrent-to-update gate bias \(b_{hz}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: [hidden_size]

  • Default: {0,..,0}

in[11]

recurrent-to-reset gate bias \(b_{hr}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: [hidden_size]

  • Default: {0,..,0}

in[12]

recurrent-to-new gate bias \(b_{hn}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: [hidden_size]

  • Default: {0,..,0}

in[13]

initial_h : Initial value of the output state (\(h_{t}\)). When not specified it is assumed to be 0.

  • Mandatory: false

  • Data type: backend specific

  • Shape: [1, batch_size, hidden_size]

  • Default: {0,..,0}

in[14]

reset : Determines if the internal state should be reset. When set to true, the internal state is reset from the input in[13] if it is provided; otherwise it is set to all zero values.

Note that reset is used to indicate the reset of the internal state at the beginning of an inference pass across all batch elements at time-step 0.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: 0D containing scalar value

  • Default: 1

Parameters

direction

Specifies if the RNN is 0 : FORWARD or 1 : REVERSE.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • FORWARD = 0,

    • REVERSE = 1

linear_before_reset

During the computation of the new gate output, if linear_before_reset == 0 the previous output state is multiplied by the output of the reset gate before the linear transformation \(W_{hn}\) is applied. Otherwise, the linear transformation is applied to the previous output state before multiplying by the output of the reset gate.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

time_major

Determines the dimension order of the 3D main input and output. When time_major is true, the 1st dimension of in[0] and out[0] corresponds to seq_length dimension while the 2nd dimension is batch. When time_major is false, the two dimensions are reversed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

Outputs

out[0]

\(Y\) : A tensor that concatenates all the intermediate output values of the output state (\(h_{t}\)).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 3D of shape [seq_length, batch_size, hidden_size] if time_major equals true or 3D of shape [batch_size, seq_length, hidden_size] if time_major equals false.

out[1]

\(Y_{h}\) : the final output state (\(h_{t}\)) of the input sequence.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [1, batch_size, hidden_size]

HadamardTransform

Performs the Hadamard Transform on the last dimension of the input tensor.

\[\mbox{out[0]} = \text{HadamardTransform}(\mbox{in[0]}) * \mbox{scale}\]
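The transform over the last dimension can be sketched as an unnormalized fast Walsh-Hadamard transform (assuming the last dimension is a power of two; the function name is illustrative):

```python
import numpy as np

def hadamard_transform(x, scale=1.0):
    # Unnormalized fast Walsh-Hadamard transform over the last
    # dimension; `scale` is the optional multiplicative factor from
    # the parameter list.
    x = np.asarray(x, dtype=np.float64).copy()
    n = x.shape[-1]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a = x[..., j].copy()
                b = x[..., j + h].copy()
                x[..., j] = a + b
                x[..., j + h] = a - b
        h *= 2
    return x * scale
```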

Inputs

in[0]

Input tensor to be transformed.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

Parameters

scale

Optional scaling factor applied to the output. This is a multiplicative factor applied after the transform.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1.0

Outputs

out[0]

Output tensor after applying the Hadamard Transform and scale.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

HardSwish

The hard swish operation computes:

out[0] = in[0] * max(0, min(6, in[0] + 3)) / 6
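A direct scalar evaluation of the formula:

```python
def hard_swish(x):
    # Piecewise-linear gate: 0 below -3, identity above 3,
    # x * (x + 3) / 6 in between.
    return x * max(0.0, min(6.0, x + 3.0)) / 6.0

# hard_swish(-3.0) == 0.0 and hard_swish(3.0) == 3.0 mark the two
# ends of the transition region.
```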

References:

Inputs

in[0]

input activation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

HeatMapMaxKeyPoint

Localizes the maximum keypoints from heatmaps.

This operation approximates the maximum keypoint scores and indices after bicubic upscaling by using a Taylor expansion up to the quadratic term.

A bounding box is represented by its upper-left corner coordinate (x1,y1) and lower-right corner coordinate (x2,y2) in the original image. A valid bounding box should satisfy x1 <= x2 and y1 <= y2.

References:

Inputs

in[0]

Tensor representing the heatmaps.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D represented as [num_boxes, heatmap_height, heatmap_width, num_keypoints]

in[1]

Bounding boxes

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D represented as [num_boxes, 4], each with format [x1, y1, x2, y2] representing bounding-box coordinates.

Parameters

None

Outputs

out[0]

Keypoint scores

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_boxes, num_keypoints]

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

Keypoint locations

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_boxes, num_keypoints, 2], the second dimension organized as [keypoint_x, keypoint_y]

  • Constraints:

    • Datatype: Same datatype as in[1]

If

If conditional operation.

References:

Inputs

in[0]

Condition indicating which branch to execute.

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape:

Parameters

then_graph

Name of the subgraph to execute when the condition is true.

  • Mandatory: true

  • Data type: QNN_DATATYPE_STRING

  • Shape: scalar

else_graph

Name of the subgraph to execute when the condition is false.

  • Mandatory: false

  • Data type: QNN_DATATYPE_STRING

  • Shape: scalar

  • Default: parameter not used unless set

Outputs

None

Im2Col

Extracts sliding kernel-sized blocks from a batched, multi-channel input tensor and arranges them into batched columns, one for each iteration of the kernel.

References:

Inputs

in[0]

Input tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel]

Parameters

kernel_size

The size of the sliding block over all channels of the input.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_kernel, width_kernel]

stride

Defines stride for 2D spatial (i.e. height and width) axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Paddings along the height and width dimensions of the input tensor.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[pad_top, pad_bottom], [pad_left, pad_right]]

dilation

Dilation value along each spatial axis (i.e. height and width) of the input.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_dilation, width_dilation]

  • Default: [1, 1]

  • Constraints:

    • Value: Dilations must be > 0

Outputs

out[0]

Output tensor containing the rearranged data

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, channel * height_kernel * width_kernel, L], where L is the number of columns made as a result of the sliding kernel.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: The value of L must be \(\prod_i(\text{floor}(\frac{spatial\_dimensions[i] + 2 * padding[i]- dilation[i] * (kernel\_size[i] -1) - 1}{stride[i]} + 1))\), with spatial_dimensions being the height and width of the input image
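The value of L can be computed directly from the constraint formula; a sketch assuming symmetric padding per axis (function name illustrative):

```python
import math

def im2col_columns(spatial, pad, dilation, kernel, stride):
    # Number of columns L produced by the sliding kernel, per the
    # constraint formula; all arguments are [height, width] pairs and
    # `pad` is the symmetric per-axis padding.
    L = 1
    for i in range(2):
        L *= math.floor(
            (spatial[i] + 2 * pad[i] - dilation[i] * (kernel[i] - 1) - 1)
            / stride[i] + 1)
    return L

# A 4x4 input with a 2x2 kernel, stride 1, no padding yields 3 * 3 = 9
# columns.
```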

ImageProjectionTransform

Applies a projective transform to the image.

This operator produces an output that is the result of transforming the input image based on the following definition, along with necessary interpolation:

For a 3x3 transform matrix expressed as below: [a0, a1, a2, b0, b1, b2, c0, c1, 1]

the operation maps the output point (x, y) to a transformed input point (x’, y’) expressed by (x’, y’) = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k), where k = c0 x + c1 y + 1
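The point mapping above can be evaluated directly (helper name illustrative):

```python
def project_point(matrix, x, y):
    # Maps an output point (x, y) to the input point (x', y') using
    # the 3x3 transform [a0, a1, a2, b0, b1, b2, c0, c1, 1].
    a0, a1, a2, b0, b1, b2, c0, c1 = matrix
    k = c0 * x + c1 * y + 1.0
    return (a0 * x + a1 * y + a2) / k, (b0 * x + b1 * y + b2) / k

# The identity transform [1, 0, 0, 0, 1, 0, 0, 0] maps every point to
# itself; [1, 0, 5, 0, 1, -2, 0, 0] is a pure translation.
```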

References:

Inputs

in[0]

Input image.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D of shape [batch,height,width,depth]

in[1]

Projective transform matrix.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: a tensor containing the eight non-unity elements of the 3x3 matrix expressed above.

Parameters

interpolation_mode

Determines the interpolation method. Supported values are 0: BILINEAR, 1: NEAREST_NEIGHBOR. Default: BILINEAR

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • BILINEAR = 0,

    • NEAREST_NEIGHBOR = 1

Outputs

out[0]

Transformed image of the same shape and size as the input.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D of shape [batch,height,width,depth]

InstanceNorm

Applies instance normalization to the input tensor.

If mode is MU_SIGMA, region ACROSS_SPATIAL, and normalize_variance is true, values in the output tensor are computed as

\[\mbox{out[0]}[b,h,w,c] = \frac{(\mbox{in[0]}[b,h,w,c] - \mu_{b,c})*\gamma}{\sqrt{\sigma_{b,c}^2 + \epsilon}} + \beta\]

else if mode is MU_SIGMA, region ACROSS_SPATIAL, and normalize_variance is false, values in the output tensor are computed as

\[\mbox{out[0]}[b,h,w,c] = \mbox{in[0]}[b,h,w,c] - \mu_{b,c}\]

where gamma (\(\gamma\)), beta (\(\beta\)), and epsilon (\(\epsilon\)) are parameters and the mean (\(\mu_{b,c}\)) is

\[\mu_{b,c} = \frac{\sum_{h,w}\mbox{in[0]}[b,h,w,c]}{HW}\]

where H and W are

\[H = \mbox{shape}(\mbox{in[0]})[h]\]

and

\[W = \mbox{shape}(\mbox{in[0]})[w]\]

and the variance (\(\sigma_{b,c}^2\)) is

\[\sigma_{b,c}^2 = \frac{\sum_{h,w}(\mbox{in[0]}[b,h,w,c] - \mu_{b,c})^2}{HW} .\]

If mode is RMS with region ACROSS_ALL, values in the output tensor are computed as

\[\mbox{out[0]}[b,h,w,c] = \frac{\mbox{in[0]}[b,h,w,c] *\gamma}{\sqrt{\mbox{RMS}_{b}}} + \beta\]

where \(\mbox{RMS}_{b}\) is

\[\mbox{RMS}_{b} = \sum_{h,w,c}(\mbox{in[0]}[b,h,w,c])^2 + \epsilon.\]

If mode is RMS with region ACROSS_CHANNEL, values in the output tensor are computed as

\[\mbox{out[0]}[b,h,w,c] = \frac{\mbox{in[0]}[b,h,w,c] *\gamma}{\sqrt{\mbox{RMS}_{b,h,w}}} + \beta\]

where \(\mbox{RMS}_{b,h,w}\) is

\[\mbox{RMS}_{b,h,w} = \sum_{c}(\mbox{in[0]}[b,h,w,c])^2 + \epsilon.\]

and gamma (\(\gamma\)), beta (\(\beta\)), and epsilon (\(\epsilon\)) are parameters
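The MU_SIGMA / ACROSS_SPATIAL case with normalize_variance true can be sketched in NumPy for a 4D input (function name illustrative):

```python
import numpy as np

def instance_norm_mu_sigma(x, gamma, beta, eps=1e-12):
    # Per-(batch, channel) statistics over the spatial dims of a
    # [batch, height, width, channel] input, matching the mu_{b,c}
    # and sigma_{b,c}^2 definitions above.
    mu = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mu) * gamma / np.sqrt(var + eps) + beta
```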

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: n-dimensional, note that the last dimension in the input is the channel, […, channel].

  • Constraints:

    • Shape: rank n > 2

in[1]

gamma (\(\gamma\)).

Is applied element-wise across the channel dimension of the mean-subtracted input activation in[0]. This op supports Unidirectional broadcasting from in[1] to in[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: 1D of shape [channel]

in[2]

beta (\(\beta\)).

Is applied element-wise across the channel dimension of the normalized input activation in[0]. This op supports Unidirectional broadcasting from in[2] to in[0].

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [channel]

  • Default: {0,..,0}

Parameters

epsilon (\(\epsilon\))

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1e-12

mode

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • MU_SIGMA = 0,

    • RMS = 1

normalize_variance

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

region

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • ACROSS_SPATIAL = 0,

    • ACROSS_CHANNEL = 1,

    • ACROSS_ALL = 2

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: n-dimensional

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

IsInf

Applies a logical IsInf to the input elementwise, returning true where infinity values are detected and false otherwise.

\[\begin{split}\mbox{out[0]} = \text{isInf}(\mbox{in[0]}) = \begin{cases} 1 & \text{if(}\ \mbox{in[0]} = \text{inf AND detect_positive = 1)}, \\ 1 & \text{if(}\ \mbox{in[0]} = \text{-inf AND detect_negative = 1)}, \\ 0 & \text{otherwise.} \end{cases}\end{split}\]
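The case definition maps directly onto NumPy's sign-specific infinity tests; a sketch (function name illustrative):

```python
import numpy as np

def is_inf(x, detect_negative=True, detect_positive=True):
    # Elementwise infinity test with per-sign selection, matching the
    # detect_negative / detect_positive parameters below.
    x = np.asarray(x)
    out = np.zeros(x.shape, dtype=bool)
    if detect_positive:
        out |= np.isposinf(x)
    if detect_negative:
        out |= np.isneginf(x)
    return out
```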

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16

  • Shape: a tensor of rank N

Parameters

detect_negative

Determines if negative infinity is mapped to true.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

detect_positive

Determines if positive infinity is mapped to true.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Same shape as in[0]

IsNan

Applies a logical IsNan to the input elementwise, returning true where NaN values are detected and false otherwise.

\[\mbox{out[0]} = \text{isnan}(\mbox{in[0]})\]

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, QNN_DATATYPE_FLOAT_16

  • Shape: a tensor of rank N

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Same shape as in[0]

L2Norm

Applies an L2 normalization on a tensor along the specified axis. For a 1-D tensor with axis = 0, this computes

\[\mbox{out[0]} = \frac{\mbox{in[0]}}{\mbox{max}(||\mbox{in[0]}||_2,\epsilon)}\]

For in[0] with more dimensions, this independently normalizes each 1-D slice along the dimension axis.
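The per-slice computation can be sketched in NumPy (function name illustrative):

```python
import numpy as np

def l2_norm(x, axis=0, eps=1e-12):
    # Normalizes each 1-D slice along `axis` by its L2 norm, bounded
    # below by eps so all-zero slices stay zero instead of dividing
    # by zero.
    x = np.asarray(x, dtype=np.float64)
    norm = np.sqrt(np.sum(np.square(x), axis=axis, keepdims=True))
    return x / np.maximum(norm, eps)

# l2_norm([3.0, 4.0]) == [0.6, 0.8], since the L2 norm is 5.
```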

References:

Inputs

in[0]

Input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

Scalar index specifying dimension along which to normalize input. Both scalar axis and its tensor counterpart axes are optional, but at least one of them must be provided. However, if axes is non-empty, the value of axis is ignored.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: Must be in range [0,N)

axes

Dimensions along which to normalize input. Each element must be in range [0,N) and must be unique. Both scalar axis and its tensor counterpart axes are optional, but at least one of them must be provided. When both are specified, values of axes take precedence over scalar axis.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Default: {0,..,0}

  • Constraints:

    • Shape: 0 < M <=N

epsilon (\(\epsilon\))

Positive valued, L2 norm lower bound.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1e-12

Outputs

out[0]

Normalized Output

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

L2Pool2d

Performs a 2D L2-pooling on the input activation tensor. The values in the output tensor are computed as:

\[\mbox{out[0]}[b,i,j,c] = \sqrt{\sum_{di,dj}(\mbox{in[0]}[b, \mbox{stride}[0] * i + di, \mbox{stride}[1] * j + dj, c])^2}\]

Pooling is performed over the 2D spatial shape of the input activation tensor, i.e. over its [height, width] sub-shape.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel]

Parameters

filter_size

Defines the pool filter size for 2D spatial axes of in[0]. Number of elements to pool from = filter_size[0] * filter_size[1]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [filter_height, filter_width]

stride

Defines the pool stride size for 2D spatial axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end of the 2D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

Outputs

out[0]

output activation

The output 2D spatial dimensions are functions of the filter_size, stride, and pad_amount.

shape(out[0])[height_out] = floor((pad_amount[0,0] + shape(in[0])[height] + pad_amount[0,1] - filter_size[0]) / stride[0] + 1)
shape(out[0])[width_out] = floor((pad_amount[1,0] + shape(in[0])[width] + pad_amount[1,1] - filter_size[1]) / stride[1] + 1)
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height_out, width_out, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]
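The output spatial extents follow directly from the shape formulas above; a sketch (function name illustrative):

```python
import math

def l2_pool_out_dim(in_dim, pad_before, pad_after, filter_size, stride):
    # Output spatial extent for one axis:
    # floor((pad_before + in_dim + pad_after - filter_size) / stride + 1)
    return math.floor(
        (pad_before + in_dim + pad_after - filter_size) / stride + 1)

# A 224-wide input, no padding, a 2-wide filter and stride 2 yields
# 112 output columns.
```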

LayerNorm

Applies layer normalization to the input tensor along the specified axes. For an ND input tensor, with a 1D axes parameter of length M, the operation computes an ND mean tensor (\(\mu\)) and ND variance tensor (\(\sigma^2\)) with shapes moments_shape whose i-th dimension is given by

\[\mbox{shape}(\mbox{moments_shape})[\mbox{i}] = 1\]

if i is in axes and

\[\mbox{shape}(\mbox{moments_shape})[\mbox{i}] = \mbox{shape}(\mbox{in[0]})[\mbox{i}]\]

otherwise. The values in the output tensor are computed as

\[\mbox{out[0]}[b,h,w,c] = \frac{(\mbox{in[0]}[b,h,w,c] - \mu) * \gamma}{\sqrt{\sigma^2 + \epsilon}} + \beta\]

where gamma (\(\gamma\)), beta (\(\beta\)) and epsilon (\(\epsilon\)) are parameters.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

in[1]

gamma (\(\gamma\))

Is applied element-wise across the normalization axes of the mean-subtracted input activation in[0].

This op supports Unidirectional broadcasting from in[1] to in[0]. The i-th dimension of gamma (\(\gamma\)) is given by

\[\mbox{shape}(\gamma)[\mbox{i}] = \mbox{shape}(\mbox{in[0]})[\mbox{axes[i]}]\]
  • Mandatory: false

  • Data type: backend specific

  • Shape: a tensor of rank M, M <= size(axes)

  • Default: {1,..,1}

in[2]

beta (\(\beta\))

Is applied element-wise across the normalization axes of the normalized input activation in[0]. This op supports Unidirectional broadcasting from in[2] to in[0]. The i-th dimension of beta (\(\beta\)) is given by

\[\mbox{shape}(\beta)[\mbox{i}] = \mbox{shape}(\mbox{in[0]})[\mbox{axes[i]}]\]
  • Mandatory: false

  • Data type: backend specific

  • Shape: a tensor of rank M, M <= size(axes)

  • Default: {0,..,0}

Parameters

epsilon (\(\epsilon\))

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.001

axes

A list of dimensions along which to normalize. Each value must be in range [0,N-1] and must be unique.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]
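
The normalization above can be sketched with NumPy. This is a minimal illustration of the formula, assuming gamma and beta already have shapes that broadcast against in[0] along the normalization axes; it is not the QNN implementation.

```python
import numpy as np

def layer_norm(x, axes, gamma=None, beta=None, epsilon=1e-3):
    # Moments are reduced over `axes` with keepdims so they broadcast back,
    # matching the moments_shape definition above.
    mu = x.mean(axis=tuple(axes), keepdims=True)
    var = x.var(axis=tuple(axes), keepdims=True)
    out = (x - mu) / np.sqrt(var + epsilon)
    if gamma is not None:
        out = out * gamma
    if beta is not None:
        out = out + beta
    return out
```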

Logit

Computes the logit (aka the inverse sigmoid) of the input element-wise. When epsilon (\(\epsilon\)) is provided the input is clamped to a range of [\(\epsilon\), 1 - \(\epsilon\)]. Otherwise, input is not clamped.

Note the input domain of this function is (0, 1). Inputs outside this range may result in undefined behavior.

\[\begin{split}\mbox{out[0]} &= \text{ln(}\frac{\text{z}}{\text{1 - z}}\text{)} \\ \text{z} &=\begin{cases} \mbox{in[0]} & \text{if(epsilon ($\epsilon$) not provided)}, \\ \text{max(min(}\mbox{in[0]}\text{, 1 - $\epsilon$), $\epsilon$)} & \text{Otherwise}. \\ \end{cases} \\\end{split}\]

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

in[1]

epsilon (\(\epsilon\)) : Determines the input clamp bound [\(\epsilon\), 1 - \(\epsilon\)]

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: 1D of shape [1]

  • Default: None

  • Constraints:

    • Value: Must be <= 0.5

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Same shape as in[0]

    • Datatype: Same datatype as in[0]
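
The element-wise definition above can be sketched as follows; the function name is illustrative, and the clamping mirrors the optional epsilon input.

```python
import numpy as np

def logit(x, epsilon=None):
    # Optional clamp to [epsilon, 1 - epsilon] before the inverse sigmoid.
    if epsilon is not None:
        x = np.clip(x, epsilon, 1.0 - epsilon)
    return np.log(x / (1.0 - x))
```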

LogSoftmax

Computes the log softmax of the input activations:

out[0] = in[0] * beta - log(reduce_sum(exp(in[0] * beta), axis))

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: must be in range [0, N-1]

beta

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1.0

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension

  • Constraints:

    • Shape: Same shape as in[0]
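
A NumPy sketch of the formula above, with the usual max-subtraction for numerical stability (which cancels exactly and does not change the result):

```python
import numpy as np

def log_softmax(x, axis, beta=1.0):
    z = x * beta
    z = z - z.max(axis=axis, keepdims=True)  # stabilizer; mathematically a no-op
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))
```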

Lrn

Performs Local Response Normalization on input data over the local regions defined below.

A square sum is computed over a region of size 2R + 1, where R is the radius. Let

\[\begin{split}M_1(x) &= \mbox{max}(0, x - R) \newline \\ M_2(d,x) &= \mbox{min}(\mbox{shape}(\mbox{in[0]})[d], x + R). \newline \\\end{split}\]

Note that N is the rank of in[0]. For across channel LRN, the square sum, S, is computed using this formula:

\[S[n,...,c] = \sum_{j=M_1(c)}^{M_2(N-1,c)}(\mbox{in[0]}[n,...,j])^2.\]

For within channel LRN, the square sum, S, is computed over the spatial dimensions as

\[S[n,h,w,c] = \sum_{i=M_1(h)}^{M_2(1,h)}\sum_{j=M_1(w)}^{M_2(2,w)}(\mbox{in[0]}[n,i,j,c])^2.\]

The output is computed using

\[\mbox{out}[0] = \frac{\mbox{in}[0]}{(B + \alpha S)^\beta}.\]

where B is the bias. Note that some frameworks scale \(\alpha\) based upon a given size. The QNN definition does not scale \(\alpha\).

References:

Inputs

in[0]

Input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

    • Shape: Rank N == 4 for within channel LRN

Parameters

alpha (\(\alpha\))

Scaling parameter

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1

beta (\(\beta\))

Exponent value

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.5

bias

An offset usually positive to avoid division by zero

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1

radius

Radius of the normalization window

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

region

Set to QNN_OP_LRN_REGION_ACROSS_CHANNEL to perform across channel LRN. Set to QNN_OP_LRN_REGION_WITHIN_CHANNEL to perform within channel LRN.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • ACROSS_CHANNEL = 0,

    • WITHIN_CHANNEL = 1

Outputs

out[0]

Normalized Output

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Same shape as in[0]
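
The across-channel variant can be sketched as below. Note this follows the QNN convention stated above: alpha is applied as given and is not scaled by the window size. The function name is illustrative.

```python
import numpy as np

def lrn_across_channel(x, radius, alpha=1.0, beta=0.5, bias=1.0):
    # x: [batch, height, width, channel]; the square sum S runs over a
    # window of up to 2*radius + 1 channels, clipped at the tensor edges.
    c = x.shape[-1]
    out = np.empty_like(x)
    for ch in range(c):
        lo, hi = max(0, ch - radius), min(c, ch + radius + 1)
        s = (x[..., lo:hi] ** 2).sum(axis=-1)
        out[..., ch] = x[..., ch] / (bias + alpha * s) ** beta
    return out
```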

Lstm

Performs one or more time steps in a Long Short-Term Memory (LSTM) layer.

The LSTM operation is described by the following equations:

\[\begin{split}i_{t} &= \sigma(W_{xi}x_{t}+W_{hi}h_{t-1}+W_{ci}C_{t-1}+b_i) \\ f_{t} &= \sigma(W_{xf}x_{t}+W_{hf}h_{t-1}+W_{cf}C_{t-1}+b_f) \\ C_{t} &= \mbox{clip}(f_{t} \odot C_{t-1} + i_{t} \odot \phi(W_{xc}x_{t}+W_{hc}h_{t-1}+b_c),\ t_{cell}) \\ o_{t} &= \sigma(W_{xo}x_{t}+W_{ho}h_{t-1}+W_{co}C_{t}+b_o) \\ h_{t} &= \begin{cases} \mbox{clip}(W_{proj}(o_t \odot \phi(C_t))+b_{proj},\ t_{proj}) & \text{if there is a projection;} \\ o_{t} \odot \phi(C_{t}) & \text{otherwise.} \end{cases}\end{split}\]

where

\(x_{t}\) is the input,

\(i_{t}\) is the input gate,

\(f_{t}\) is the forget gate,

\(C_{t}\) is the cell state,

\(o_{t}\) is the output gate,

\(h_{t}\) is the output state,

\(\sigma\) is the logistic sigmoid function,

\(\phi\) is the tanh activation function,

\(W_{xi}\) is the input-to-input weight matrix,

\(W_{hi}\) is the recurrent-to-input weight matrix,

\(W_{ci}\) is the cell-to-input weight matrix,

\(b_i\) is the input gate bias,

\(W_{xf}\) is the input-to-forget weight matrix,

\(W_{hf}\) is the recurrent-to-forget weight matrix,

\(W_{cf}\) is the cell-to-forget weight matrix,

\(b_f\) is the forget gate bias,

\(W_{xc}\) is the input-to-cell weight matrix,

\(W_{hc}\) is the recurrent-to-cell weight matrix,

\(b_c\) is the cell bias,

\(W_{xo}\) is the input-to-output weight matrix,

\(W_{ho}\) is the recurrent-to-output weight matrix,

\(W_{co}\) is the cell-to-output weight matrix,

\(b_o\) is the output gate bias,

\(W_{proj}\) is the projection weight matrix,

\(b_{proj}\) is the projection bias,

\(t_{cell}\) is the threshold for clipping the cell state, and

\(t_{proj}\) is the threshold for clipping the projected output.

\(\odot\) is the element-wise product of two vectors.

The operation can be stateful or stateless depending on the dimensionality of the \(x_{t}\) input. If the input is 2D, the operation is stateless and a single time-step is executed. The initial output state and cell state are read from in[10] and in[11] respectively.

If the input is 3D, the 2nd dimension represents the number of time-steps to be executed if time_major is false. If time_major is true, then the 1st dimension of the input represents the time-steps and the 2nd dimension is the batch dimension. The output state and cell state are maintained internal to the operation across multiple time-steps and inferences. These internal states can be reset at each inference by reading from in[10] and in[11] respectively. The decision whether or not to reset the internal state is made based on the value of the reset signal in[24]. If the value of this input is non-zero the internal state is reset from the state inputs, or set to all zero values if the optional state inputs are not connected.

The operation has the following independently optional inputs:

  • The cell-to-input weights (\(W_{ci}\)), cell-to-forget weights (\(W_{cf}\)) and cell-to-output weights (\(W_{co}\)) either all have values or neither of them have values (i.e., all set to null). If they have values, the peephole optimization is used.

  • The input-to-input weights (\(W_{xi}\)), recurrent-to-input weights (\(W_{hi}\)) and input gate bias (\(b_i\)) either all have values, or none of them have values. If they have no values, coupling of input and forget gates (CIFG) is used, in which case the input gate (\(i_{t}\)) is calculated using the following equation instead:

\[{\Large i_{t} = 1 - f_{t}}\]

If peephole optimization is used and CIFG is not, the cell-to-input weights (\(W_{ci}\)) must be present. Otherwise, the cell-to-input weights must have no value.

  • The projection weights (\(W_{proj}\)) are required only for the recurrent projection layer, and should otherwise have no value.

  • The projection bias (\(b_{proj}\)) may (but is not required to) have a value if the recurrent projection layer exists, and should otherwise have no value.

  • The four layer normalization weights either all have values or none of them have values. Additionally, if CIFG is used, the input layer normalization weights tensor is omitted and the other layer normalization weights either all have values or none of them have values. Layer normalization is used when all the layer normalization weights are present.
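
The core recurrence can be sketched for one stateless time-step with NumPy. This sketch assumes no CIFG, no peephole, no projection, no layer normalization, and no clipping; the dict-based weight layout and names are illustrative only, not the QNN tensor layout.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, R, b):
    """One LSTM time-step.

    W: input weights  {gate: [num_units, input_size]}
    R: recurrent weights {gate: [num_units, num_units]}
    b: biases {gate: [num_units]} for gates 'i', 'f', 'c', 'o'.
    """
    i = sigmoid(W['i'] @ x + R['i'] @ h_prev + b['i'])   # input gate
    f = sigmoid(W['f'] @ x + R['f'] @ h_prev + b['f'])   # forget gate
    c = f * c_prev + i * np.tanh(W['c'] @ x + R['c'] @ h_prev + b['c'])
    o = sigmoid(W['o'] @ x + R['o'] @ h_prev + b['o'])   # output gate
    h = o * np.tanh(c)                                   # output state
    return h, c
```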

References:

Inputs

in[0]

The input \(x_{t}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, input_size] or 3D of shape [batch_size, time_steps, input_size] if time_major equals false or 3D of shape [time_steps, batch_size, input_size] if time_major equals true.

  • Constraints:

    • Shape: Rank(in[0]) must equal 2 or 3

in[1]

input-to-forget weights \(W_{xf}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [num_units, input_size]

in[2]

input-to-cell weights \(W_{xc}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [num_units, input_size]

in[3]

input-to-output weights \(W_{xo}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [num_units, input_size]

in[4]

recurrent-to-forget weights \(W_{hf}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [num_units, output_size]

in[5]

recurrent-to-cell weights \(W_{hc}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [num_units, output_size]

in[6]

recurrent-to-output weights \(W_{ho}\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [num_units, output_size]

in[7]

forget gate bias \(b_f\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[8]

cell bias \(b_c\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[9]

output gate bias \(b_o\).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[10]

output state (in) \(h_{t-1}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, output_size]

  • Default: {0,..,0}

in[11]

cell state (in) \(C_{t-1}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, num_units]

  • Default: {0,..,0}

in[12]

The input layer normalization weights. Used to rescale normalized inputs to activation at input gate.

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[13]

The forget layer normalization weights. Used to rescale normalized inputs to activation at forget gate.

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[14]

The cell layer normalization weights. Used to rescale normalized inputs to activation at cell gate.

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[15]

The output layer normalization weights. Used to rescale normalized inputs to activation at output gate.

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[16]

input-to-input weights \(W_{xi}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [num_units, input_size], where “num_units” corresponds to the number of cell units.

in[17]

recurrent-to-input weights \(W_{hi}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [num_units, output_size] where “output_size” corresponds to either the number of cell units (i.e., “num_units”), or the first dimension of the “projection_weights”, if defined.

in[18]

cell-to-input weights \(W_{ci}\). It is a diagonal matrix by definition, and is expressed as a 1D vector.

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[19]

cell-to-forget weights \(W_{cf}\). It is a diagonal matrix by definition, and is expressed as a 1D vector.

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[20]

cell-to-output weights \(W_{co}\). It is a diagonal matrix by definition, and is expressed as a 1D vector.

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[21]

input gate bias \(b_i\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [num_units]

in[22]

projection weights \(W_{proj}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [output_size, num_units]

in[23]

projection bias \(b_{proj}\).

  • Mandatory: false

  • Data type: backend specific

  • Shape: 1D of shape [output_size]

in[24]

reset : Determines if the internal state should be reset. When set to true the internal states are reset by the inputs in[10] and in[11] if they are provided, otherwise they are set to all zero values.

Note that reset is only applicable to a 3D input \(x_{t}\) and used to indicate the reset of the internal states at the beginning of an inference pass across all batch elements at time-step 0.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: 0D containing scalar value

  • Default: 1

Parameters

direction

The ‘direction’ of computation for the LSTM op. Used to achieve the functionality of Bi-directional LSTM by using a combination of individual LSTM ops configured in opposite directions.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • FORWARD = 0,

    • REVERSE = 1

cell_clip_threshold

The clipping threshold (\(t_{cell}\)) for the cell state, such that values are bound within [-cell_clip, cell_clip]. If set to 0.0 clipping is disabled.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

output_clip_threshold

The clipping threshold (\(t_{proj}\)) for the output from the projection layer, such that values are bound within [-proj_clip, proj_clip]. If set to 0.0 clipping is disabled.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

time_major

Determines the dimension order of the 3D main input and output. When equal to true, the 1st dimension of in[0] and out[0] corresponds to time_step dimension while the 2nd dimension is batch. When time_major is false, the two dimensions are reversed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

input_gate_qscale

The quantization scale of the intermediate result of matrix multiplication, i.e. the input to layer normalization at the input gate \(i_{t}\), realized by the following expression:

\((W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}C_{t-1})\)

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

forget_gate_qscale

The quantization scale of the intermediate result of matrix multiplication, i.e. the input to layer normalization at the forget gate \(f_{t}\), realized by the following expression:

\((W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}C_{t-1})\)

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

cell_gate_qscale

The quantization scale of the intermediate result of matrix multiplication, i.e. the input to layer normalization at the cell gate \(C_{t}\), realized by the following expression:

\((W_{xc}x_{t} + W_{hc}h_{t-1})\)

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

output_gate_qscale

The quantization scale of the intermediate result of matrix multiplication, i.e. the input to layer normalization at the output gate \(o_{t}\), realized by the following expression:

\((W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}C_{t})\)

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

hidden_state_offset

The quantization offset of the hidden state \(h_{t}\), i.e. input to projection realized by the following expression: \((o_t \odot \phi(C_t))\)

  • Mandatory: false

  • Data type: backend specific

  • Shape: scalar

  • Default: 0.0

hidden_state_qscale

The quantization scale of the hidden state \(h_{t}\), i.e. input to projection realized by the following expression: \((o_t \odot \phi(C_t))\)

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

Outputs

out[0]

output state (out) (\(h_{t}\)).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, output_size] or 3D of shape [batch_size, time_steps, output_size] if time_major equals false, or 3D of shape [time_steps, batch_size, output_size] if time_major equals true.

  • Constraints:

    • Shape: Same Rank as in[0]

out[1]

cell state (out) (\(C_{t}\)).

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, num_units]

out[2]

output (\(o_{t}\)) : This is the current “output state (out)” value. If out[0] is 2D, it is identical to out[0]. If out[0] is 3D, it contains the values of the final time-step.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, output_size].

out[3]

input_gate

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, num_units].

out[4]

forget_gate

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, num_units].

out[5]

cell_gate

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, num_units].

out[6]

output_gate

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, output_size].

out[7]

hidden_state

  • Mandatory: false

  • Data type: backend specific

  • Shape: 2D of shape [batch_size, output_size].

MaskedSoftmax

Applies a Softmax operation on masked portions of the input tensor. For each batch the mask tensor is broadcast on the input before softmax computation. A mask tensor must be provided in either an UNCOMPRESSED or COMPRESSED format depending on the mode selected. See in[1] for details on how a boolean mask can be converted to an UNCOMPRESSED or COMPRESSED mask tensor.

Inputs

in[0]

Input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D of shape [batch, height, width, channel]

  • Constraints:

    • Shape: When mode is set to COMPRESSED width == channel.

in[1]

Mask tensor: The representation of this 2D tensor is determined by the mode selected. When mode is set to UNCOMPRESSED, M = channel; when mode is set to COMPRESSED, M = number of sequences.

Consider a boolean mask where a mask value of 1 indicates a position on which Softmax should be performed and a mask value of 0 indicates a position on which Softmax will not be performed.

An uncompressed mask can be made from a boolean mask tensor by subtracting 1 element-wise and multiplying the intermediate result element-wise by a large value (e.g. 10000).

mask = [[1,1,1,0,1]]
uncompressed_mask = (mask .+ -1) .* 10000
// uncompressed_mask = [[0,0,0,-10000,0]]

A compressed mask can be made from multiple boolean mask tensors of vector lengths that are concatenated into a single batch and summed across the 2nd axis.

For Example:

Let there be 3 mask tensors that correspond to sequences of inputs that were used to make in[0] where 0’s represent where padding was added to make them the max sequence length.

mask1 = [1,0,0,0]
mask2 = [1,1,1,0]
mask3 = [1,1,1,1]

The concatenated mask would then be the following:

concatenated_mask = [
[1,0,0,0],
[1,1,1,0],
[1,1,1,1]]

The compressed mask representation would be made from summing across the 2nd axis:

compressed_mask = [[1,3,4]]
  • Mandatory: true

  • Data type: backend specific

  • Shape: 2D of shape [batch, M]

  • Constraints:

    • Value: When mode is set to COMPRESSED the sum of values in each batch must be <= channel.

Parameters

mode

Determines the MaskedSoftmax Operation that is performed. See in[1] for details on mask tensor format for each of the modes.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • UNCOMPRESSED = 0,

    • COMPRESSED = 1

Outputs

out[0]

Output Activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D of shape [batch, height, width, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]
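
The UNCOMPRESSED mode can be sketched as below: the additive [batch, channel] mask (large negative values at masked positions, as constructed for in[1] above) is broadcast onto the input before a standard softmax over the channel axis. The function name is illustrative.

```python
import numpy as np

def masked_softmax_uncompressed(x, mask):
    # x: [batch, height, width, channel]; mask: [batch, channel] with 0 at
    # kept positions and a large negative value (e.g. -10000) at masked ones.
    z = x + mask[:, None, None, :]
    z = z - z.max(axis=-1, keepdims=True)  # numerical stabilizer
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```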

MatMul

Performs a batched matrix multiplication on the innermost two dimensions of the operands in[0] and in[1]. Batch dimensions of in[0] and in[1] will be broadcast when possible. To be broadcast, the smaller dimension must be equal to 1, or omitted if the ranks of the input tensors differ. Batch dimensions of out[0] must match the batch dimensions of in[0] and in[1] after being broadcast. For example, if in[0] is of shape [10, 256, 128] and in[1] is of shape [5, 1, 128, 64], the input batch dimensions are [10] and [5, 1] respectively, and will be broadcast to [5, 10]. out[0] will be of shape [5, 10, 256, 64].

Let

\[A = \mbox{in}[0], B = \mbox{in}[1], C = \mbox{out}[0], b = \mbox{in}[2].\]

When \(\mbox{transpose_in0} = 0\) and \(\mbox{transpose_in1} = 0\), the operation satisfies

\[C[..., i, j] = \sum^{m1}_{k=0}(A[..., i, k] * B[..., k, j]) + b[j], \forall i \in [0,m0], \forall j \in [0,n1].\]

When \(\mbox{transpose_in0} = 0\) and \(\mbox{transpose_in1} = 1\), the operation satisfies

\[C[..., i, j] = \sum^{n1}_{k=0}(A[..., i, k] * B[..., j, k]) + b[j], \forall i \in [0,m0], \forall j \in [0,m1].\]

When \(\mbox{transpose_in0} = 1\) and \(\mbox{transpose_in1} = 0\), the operation satisfies

\[C[..., i, j] = \sum^{m1}_{k=0}(A[..., k, i] * B[..., k, j]) + b[j], \forall i \in [0,n0], \forall j \in [0,n1].\]

When \(\mbox{transpose_in0} = 1\) and \(\mbox{transpose_in1} = 1\), the operation satisfies

\[C[..., i, j] = \sum^{n1}_{k=0}(A[..., k, i] * B[..., j, k]) + b[j], \forall i \in [0,n0], \forall j \in [0,m1].\]

Refer to each backend's MatMul definition for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Matrix operand: A

  • Mandatory: true

  • Data type: backend specific

  • Shape: […, m0, n0] of rank N >= 2

  • Dynamic Shape: All dimensions can be dynamic.

in[1]

Matrix operand: B

When transpose_in0 = 0 and transpose_in1 = 0: m1 = n0
When transpose_in0 = 0 and transpose_in1 = 1: n1 = n0
When transpose_in0 = 1 and transpose_in1 = 0: m1 = m0
When transpose_in0 = 1 and transpose_in1 = 1: n1 = m0
  • Mandatory: true

  • Data type: backend specific

  • Shape: […, m1, n1] of rank N >= 2

  • Dynamic Shape: All dimensions can be dynamic.

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [n2]

  • Dynamic Shape: n2 can be dynamic.

  • Default: {0,..,0}

Parameters

transpose_in0

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

transpose_in1

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Matrix product: C

When transpose_in0 = 0 and transpose_in1 = 0: m2 = m0 and n2 = n1
When transpose_in0 = 0 and transpose_in1 = 1: m2 = m0 and n2 = m1
When transpose_in0 = 1 and transpose_in1 = 0: m2 = n0 and n2 = n1
When transpose_in0 = 1 and transpose_in1 = 1: m2 = n0 and n2 = m1
  • Mandatory: true

  • Data type: backend specific

  • Shape: […, m2, n2] of rank N >= 2

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Dynamic Shape: For each batch dimension, if the corresponding dimension for either in[0] or in[1] is dynamic, it too must be dynamic. If transpose_in0 = 0 and m0 is dynamic or if transpose_in0 = 1 and n0 is dynamic, m2 must be dynamic. If transpose_in1 = 0 and n1 is dynamic or if transpose_in1 = 1 and m1 is dynamic, n2 must be dynamic.
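
The batch-broadcast and transpose behavior described above can be sketched with NumPy, whose matmul broadcasting matches the semantics in the worked example (in[0] of shape [10, 256, 128] and in[1] of shape [5, 1, 128, 64] yield [5, 10, 256, 64]). The wrapper name is illustrative.

```python
import numpy as np

def qnn_matmul(a, b, bias=None, transpose_in0=False, transpose_in1=False):
    # Transposes apply only to the innermost two dimensions.
    if transpose_in0:
        a = np.swapaxes(a, -1, -2)
    if transpose_in1:
        b = np.swapaxes(b, -1, -2)
    c = np.matmul(a, b)  # batch dimensions broadcast
    if bias is not None:
        c = c + bias     # 1D bias broadcasts along the last axis
    return c
```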

Moments

Calculates the mean and variance of an input tensor.

See Moments backend definition for supported datatypes and constraints per backend

References:

Inputs

in[0]

Input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

axes

Axes along which to compute mean and variance

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: 1D. For batch normalization: Tensor of 1 element (batch only), or 3 elements for filters with shape [batch, height, width, depth] (e.g. pass axes=[0, 1, 2]).

keep_dims

If set to true, produce moments with the same dimensionality as the input.

  • Mandatory: true

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

Outputs

out[0]

Mean

  • Mandatory: true

  • Data type: backend specific

  • Shape: If keep_dims is true, the specified axes dimensions of in[0] become 1 and the other dimensions are unchanged. If keep_dims is false, the specified axes of in[0] are omitted and the other dimensions are unchanged.

  • Constraints:

    • Shape: Rank > 0

out[1]

Variance

  • Mandatory: true

  • Data type: backend specific

  • Shape: If keep_dims is true, the specified axes dimensions of in[0] become 1 and the other dimensions are unchanged. If keep_dims is false, the specified axes of in[0] are omitted and the other dimensions are unchanged.

  • Constraints:

    • Shape: Rank > 0
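
The reduction above maps directly onto NumPy's mean/variance reductions; this sketch mirrors the axes and keep_dims semantics (the function name is illustrative).

```python
import numpy as np

def moments(x, axes, keep_dims=False):
    # Mean and variance reduced over `axes`; keep_dims controls whether the
    # reduced dimensions are kept as size 1 or omitted from the outputs.
    mean = x.mean(axis=tuple(axes), keepdims=keep_dims)
    var = x.var(axis=tuple(axes), keepdims=keep_dims)
    return mean, var
```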

MultiClassNms

Filters bounding boxes across multiple classes in descending order of score using non-maximum suppression (NMS). Boxes that have a high intersection-over-union (IoU) overlap with previously selected boxes are pruned. Additionally, bounding boxes with scores less than a specified threshold are pruned. The op also performs a gather operation on any additional features corresponding to the filtered boxes.

References:

Inputs

in[0]

Bounding boxes. Elements can be understood as 4-tuples of bounding box coordinates given in the form (y1,x1,y2,x2).

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_boxes, 4]

in[1]

Bounding box scores. The element at position [batch, box, class] is the score corresponding to class for the bounding box at the position [batch, box] in in[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_boxes, num_classes]

in[2..m]

Additional feature vectors where m >=2. These features are also ‘filtered’ by performing ‘gather’ operations along the same indices as the filtered boxes.

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: false

  • Data type: backend specific

  • Shape: N-dimensional. dim[0] should match batch; dim[1] should match num_boxes.

  • Constraints:

    • Number: m >= 2

    • Shape: Rank > 0

Parameters

iou_threshold

IoU threshold for the NMS algorithm.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

score_threshold

Boxes with scores lower than the threshold are filtered by the NMS algorithm.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

soft_nms_sigma

Sigma parameter for Soft NMS. When this parameter is set to a non-zero value, boxes reduce the score of other overlapping boxes instead of directly causing them to be pruned. When the value is 0, the NMS algorithm defaults to the hard version.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

Outputs

out[0]

Selected Output boxes. Each element can be understood as a 4-tuple with the same meaning as in[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, max_num_boxes, 4]. max_num_boxes is expressed by maxDimensions in Qnn_Tensor_t. The number of valid boxes is specified by out[3].

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

Selected Output box scores. Gives the score for the box in the corresponding position in out[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, max_num_boxes]. max_num_boxes is expressed by maxDimensions in Qnn_Tensor_t. The number of valid scores is specified by out[3].

  • Constraints:

    • Datatype: Same datatype as in[1]

out[2]

Selected Output classes. Gives the class label for the box in the corresponding position in out[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [batch, max_num_boxes]. max_num_boxes is expressed by maxDimensions in Qnn_Tensor_t. The number of valid classes is specified by out[3].

out[3]

Number of valid boxes per batch element that remain after NMS.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [batch]

  • Constraints:

    • Value: <= max_num_boxes

out[4..M]

Selected features post NMS. They are gathered along the box dimension using the same indices as the filtered boxes. max_num_boxes is expressed by maxDimensions in Qnn_Tensor_t. The number of valid features is specified by out[3]. The number of these output features must match the number of input features in in[2..m].

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: false

  • Data type: backend specific

  • Shape: N-dimensional. dim[0] should match batch; dim[1] should match max_num_boxes.

  • Constraints:

    • Datatype: Same datatypes as in[2..m]

    • Number: M >= 4

    • Shape: Rank > 0

    • Value: (M - 3) must be equal to (m - 1)

NonMaxSuppression

Filters out boxes that have high intersection-over-union (IoU) overlap with previously selected boxes. Bounding boxes with a score less than score_threshold are removed. Note that when the number of valid boxes detected is less than max_selected_indices, out[0] is padded with the indices of the detected box with the lowest score. If no valid boxes are detected, the output is padded with indices to the first box.
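
The greedy hard-NMS procedure can be sketched as below for a single batch and class, using the (y1,x1,y2,x2) box convention defined for in[0]. This is an illustrative sketch, not the QNN kernel; it omits the output padding behavior described above.

```python
import numpy as np

def iou(box_a, box_b):
    # Boxes are (y1, x1, y2, x2), a diagonal pair of corners.
    y1 = max(box_a[0], box_b[0]); x1 = max(box_a[1], box_b[1])
    y2 = min(box_a[2], box_b[2]); x2 = min(box_a[3], box_b[3])
    inter = max(0.0, y2 - y1) * max(0.0, x2 - x1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold, score_threshold=0.0):
    # Visit boxes in descending score order; keep a box only if its IoU with
    # every previously kept box does not exceed the threshold.
    keep = []
    for idx in np.argsort(scores)[::-1]:
        if scores[idx] < score_threshold:
            continue
        if all(iou(boxes[idx], boxes[k]) <= iou_threshold for k in keep):
            keep.append(int(idx))
    return keep
```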

References:

Inputs

in[0]

Bounding boxes: Elements can be understood as 4-tuples of bounding box coordinates given in the form (y1,x1,y2,x2), where (y1, x1) and (y2, x2) represent a diagonal pair of corners.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_boxes, 4]

in[1]

Bounding box scores.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, num_classes, num_boxes]

Parameters

iou_threshold

Represents the threshold used by the NMS algorithm to determine whether boxes overlap too much.

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be in range [0, 1]

score_threshold

Boxes with scores lower than the threshold are filtered out by the NMS algorithm.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

max_boxes_selected

Maximum number of boxes that can be selected per batch per class. Note that the default of 0 means there are no valid output boxes. If the value provided is greater than num_boxes, it is clamped to num_boxes.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

Outputs

out[0]

selected_indices: Indices of the elements that have been kept by the NMS algorithm. Each element can be understood as a 3-tuple with index format (batch_index, class_index, box_index).

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [max_selected_indices, 3]. max_selected_indices = batch * num_classes * max_boxes_selected, where the valid number of selected indices is specified by out[1].

out[1]

Valid number of selected indices per batch after NMS.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [batch]

  • Constraints:

    • Value: Must be <= max_selected_indices
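
As a reference for the semantics above, a greedy per-class NMS pass can be sketched in NumPy. The helper names (iou, nms_single_class) are illustrative only and not part of the QNN API; index padding and batching are omitted.

```python
import numpy as np

def iou(box_a, box_b):
    # Boxes are (y1, x1, y2, x2) corner pairs, as in in[0].
    y1, x1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    y2, x2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, y2 - y1) * max(0.0, x2 - x1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def nms_single_class(boxes, scores, iou_threshold, score_threshold=0.0):
    # Greedy NMS for one (batch, class) slice: visit boxes in descending
    # score order, drop any box overlapping a kept box beyond iou_threshold.
    keep = []
    for i in np.argsort(-scores):
        if scores[i] < score_threshold:
            continue
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(int(i))
    return keep

boxes = np.array([[0, 0, 1, 1], [0, 0.1, 1, 1.1], [0, 2, 1, 3]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(nms_single_class(boxes, scores, iou_threshold=0.5))  # box 1 is suppressed by box 0
```

When run per batch element and class, each kept index corresponds to one (batch_index, class_index, box_index) tuple in out[0].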

NonZero

Generates a 2D output tensor containing the indices to nonzero elements of in[0].

Note this operator provides indices of all detected nonzero elements in the initial portion of the output tensor. The remaining elements of the output tensor are populated with a fill value of -1.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

Indices to nonzero elements of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32

  • Shape: [M, N], where M is the total number of elements of in[0] and N is the rank of in[0].
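
The fill behavior described above can be sketched in NumPy; nonzero_padded is a hypothetical helper, not a QNN API.

```python
import numpy as np

def nonzero_padded(x):
    # One row of indices per nonzero element, followed by rows of the
    # fill value -1 so the output shape is always [total_elements, rank].
    idx = np.argwhere(x != 0).astype(np.int32)           # valid indices first
    out = np.full((x.size, x.ndim), -1, dtype=np.int32)  # remainder filled with -1
    out[: idx.shape[0]] = idx
    return out

x = np.array([[0, 3], [4, 0]])
print(nonzero_padded(x))
```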

Nv12ToRgb

Transform Nv12 to RGB (or BGR).

Inputs

in[0]

Input tensor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [b, w*h*3/2] where w and h are the width and height of the image.

Parameters

output_order

Controls the order of the output tensor. If set to QNN_OP_NV12_TO_RGB_OUTPUT_ORDER_RGB, the output is in RGB order; if set to QNN_OP_NV12_TO_RGB_OUTPUT_ORDER_BGR, the output is in BGR order.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Values:

    • RGB = 0,

    • BGR = 1

Outputs

out[0]

Output tensor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [b,h,w,3] where w and h are the width and height of the image.
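
For reference, the NV12 layout (a w*h luma plane followed by an interleaved, 2x2-subsampled UV plane) and the conversion can be sketched in NumPy. The BT.601 full-range coefficients below are an assumption for illustration; the exact conversion coefficients are backend specific.

```python
import numpy as np

def nv12_to_rgb(frame, h, w):
    # NV12 layout: w*h Y bytes, then w*h/2 interleaved U,V bytes (2x2 subsampled).
    y = frame[: w * h].reshape(h, w).astype(np.float32)
    uv = frame[w * h :].reshape(h // 2, w // 2, 2).astype(np.float32) - 128.0
    u = np.repeat(np.repeat(uv[..., 0], 2, axis=0), 2, axis=1)  # upsample chroma
    v = np.repeat(np.repeat(uv[..., 1], 2, axis=0), 2, axis=1)
    r = y + 1.402 * v                      # BT.601 coefficients (assumed)
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)

frame = np.full(2 * 2 * 3 // 2, 128, dtype=np.uint8)  # mid-gray 2x2 image
print(nv12_to_rgb(frame, 2, 2).shape)  # (2, 2, 3)
```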

Nv21ToRgb

Transform Nv21 to RGB (or BGR).

Inputs

in[0]

Input tensor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [b, w*h*3/2] where w and h are the width and height of the image.

Parameters

output_order

Controls the order of the output tensor. If set to QNN_OP_NV21_TO_RGB_OUTPUT_ORDER_RGB, the output is in RGB order; if set to QNN_OP_NV21_TO_RGB_OUTPUT_ORDER_BGR, the output is in BGR order.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Values:

    • RGB = 0,

    • BGR = 1

Outputs

out[0]

Output tensor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [b,h,w,3] where w and h are the width and height of the image.

OneHot

Creates a one-hot encoded tensor. Locations in indices will take on QNN_OP_ONE_HOT_PARAM_ON_VALUE while all other locations take on QNN_OP_ONE_HOT_PARAM_OFF_VALUE. Depth of one-hot locations can be specified with QNN_OP_ONE_HOT_PARAM_DEPTH.

References:

Inputs

in[0]

Indices of one hot encoding

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32, QNN_DATATYPE_INT_32

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

    • Value: Index values must be in range [0, QNN_OP_ONE_HOT_PARAM_DEPTH-1]

Parameters

depth

Depth of the one-hot dimension.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

axis

The axis to fill the one-hot dimension.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: N

  • Constraints:

    • Value: in range [0, N]

on_value

The value to fill at the indices provided by in[0].

  • Mandatory: false

  • Data type: backend specific

  • Shape: scalar

  • Default: 1

off_value

The value to fill at all other indices not provided in in[0].

  • Mandatory: false

  • Data type: backend specific

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Datatype: Same datatype as QNN_OP_ONE_HOT_PARAM_ON_VALUE

Outputs

out[0]

Output one-hot encoded tensor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: Same as in[0], but with rank(out[0]) = N + 1. A dimension of size QNN_OP_ONE_HOT_PARAM_DEPTH is inserted at the axis specified by QNN_OP_ONE_HOT_PARAM_AXIS.

  • Constraints:

    • Datatype: Same datatype as QNN_OP_ONE_HOT_PARAM_ON_VALUE
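
The encoding can be sketched in NumPy; one_hot is an illustrative helper, with the one-hot dimension placed at the end by default to match the spec's default axis of N.

```python
import numpy as np

def one_hot(indices, depth, axis=None, on_value=1, off_value=0):
    # Insert a one-hot dimension of size `depth` at `axis`.
    if axis is None:
        axis = indices.ndim  # default: append after the last input dimension
    out = np.full(indices.shape + (depth,), off_value)
    np.put_along_axis(out, indices[..., None], on_value, axis=-1)
    return np.moveaxis(out, -1, axis)

print(one_hot(np.array([0, 2]), depth=3))
```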

Pack

Packs a list of tensors of the same rank into a tensor with rank one higher than each input by stacking them along the axis dimension.

References:

Inputs

in[0..m]

Input tensors, where m >= 1.

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional. Must be the same for all inputs.

  • Constraints:

    • Number: m >= 1

    • Shape: Rank > 0

Parameters

axis

Dimension along which packing will happen

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be <= rank(in[0])

Outputs

out[0]

The packed output tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: (N + 1)-dimensional, where N = rank(in[0])

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: rank(out[0]) = rank(in[0]) + 1
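
Pack behaves like NumPy's stack, which can serve as a behavioral reference:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Packing two rank-2 tensors along axis 0 yields a rank-3 tensor;
# axis may be any value in [0, rank(in[0])], here rank(in[0]) = 2.
out = np.stack([a, b], axis=0)
print(out.shape)  # (2, 2, 2)
```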

Pad

Pads input tensor with the appropriate value based on the scheme picked. Pad amount is a tensor with shape [N, 2] where N is the rank of the input. Pad amount [i, 0] identifies pad size to add before on dimension i and [i, 1] identifies pad size to add after on dimension i. The output tensor on that dimension will have a size equal to the input tensor plus the padding before and after.

shape(out[0])[i] = pad_amount[i, 0] + shape(in[0])[i] + pad_amount[i, 1]

The pad value is applicable only when the CONSTANT scheme is used. The client is responsible for providing a value that is appropriate to the operation consuming the padded input. The pad value is ignored for other schemes.

The mirror padding schemes use tensor data to fill the padded region. MIRROR_REFLECT excludes the border value from the padded region, whereas MIRROR_SYMMETRIC includes it.

For MIRROR_REFLECT, the before and after pad amounts must not be greater than shape(in[0])[i] - 1. For MIRROR_SYMMETRIC, the before and after pad amounts must not be greater than shape(in[0])[i].

Edge padding scheme applies padding with the edge values of the tensor in each dimension.

Refer to the Pad backend definition for supported data types and layouts for each backend.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

scheme

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Values:

    • CONSTANT = 0,

    • MIRROR_SYMMETRIC = 1,

    • MIRROR_REFLECT = 2,

    • EDGE = 3

pad_amount

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [rank(in[0]), 2]

pad_constant_value

  • Mandatory: false

  • Data type: backend specific

  • Shape: scalar

  • Default: 0

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same rank as in[0]
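
The four schemes map naturally onto NumPy pad modes, which can serve as a behavioral reference (an analogy, not the QNN implementation): CONSTANT is 'constant', MIRROR_SYMMETRIC is 'symmetric', MIRROR_REFLECT is 'reflect', and EDGE is 'edge'.

```python
import numpy as np

x = np.array([1, 2, 3])
pad_amount = [(2, 2)]  # [pad_before, pad_after] for each dimension

print(np.pad(x, pad_amount, mode="constant", constant_values=0))  # CONSTANT
print(np.pad(x, pad_amount, mode="symmetric"))  # border value included in pad
print(np.pad(x, pad_amount, mode="reflect"))    # border value excluded from pad
print(np.pad(x, pad_amount, mode="edge"))       # edge value extended
```

Note the reflect mode also enforces the MIRROR_REFLECT constraint above: the pad amount per side cannot exceed shape(in[0])[i] - 1.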

PoolAvg2d

Performs 2D average pooling on the input activation tensor by averaging a subset of the input tensor values according to the filter_size and stride, effectively downsampling the input data into the output activation tensor.

Average pooling is performed over the 2D spatial shape of the input activation tensor, i.e. over its [height, width] sub-shape.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel]

Parameters

filter_size

Defines filter size for 2D spatial axes of in[0]. Number of elements to average = filter_size[0]*filter_size[1]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [filter_height, filter_width]

stride

Defines stride for 2D spatial axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 2D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

count_pad_for_edges

Include pad elements when calculating the average at the edges. 0 = do not include padding; any other value includes padding in the average calculation.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

rounding_mode

Indicates the rounding mode used when truncating output dimensions to integer values. Available options: 0: FLOOR, 1: CEIL.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • FLOOR = 0,

    • CEIL = 1

Outputs

out[0]

output activation

The output 2D spatial dimensions are functions of the filter_size, stride, pad_amount and rounding_mode.

shape(out[0])[height_out] = ROUND((pad_amount[0,0] + shape(in[0])[height] + pad_amount[0,1] - filter_size[0]) / stride[0] + 1)
shape(out[0])[width_out] = ROUND((pad_amount[1,0] + shape(in[0])[width] + pad_amount[1,1] - filter_size[1]) / stride[1] + 1)
where ROUND = floor() or ceil(), based on rounding_mode.
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height_out, width_out, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]
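
The output-shape formulas and the count_pad_for_edges behavior above can be sketched in NumPy; avg_pool2d is a single-channel illustration, not the QNN implementation.

```python
import math
import numpy as np

def avg_pool2d(x, filter_size, stride, pad_amount, count_pad_for_edges=False,
               rounding_mode="FLOOR"):
    # x: [height, width] slice of one batch/channel.
    rnd = math.floor if rounding_mode == "FLOOR" else math.ceil
    (pt, pb), (pl, pr) = pad_amount
    h_out = rnd((pt + x.shape[0] + pb - filter_size[0]) / stride[0] + 1)
    w_out = rnd((pl + x.shape[1] + pr - filter_size[1]) / stride[1] + 1)
    xp = np.pad(x.astype(np.float32), pad_amount)  # pad value is 0
    out = np.zeros((h_out, w_out), dtype=np.float32)
    for i in range(h_out):
        for j in range(w_out):
            win = xp[i*stride[0]:i*stride[0]+filter_size[0],
                     j*stride[1]:j*stride[1]+filter_size[1]]
            # count_pad_for_edges=False divides by valid (non-pad) elements only
            denom = win.size if count_pad_for_edges else max(
                (min(i*stride[0]+filter_size[0], pt+x.shape[0]) - max(i*stride[0], pt)) *
                (min(j*stride[1]+filter_size[1], pl+x.shape[1]) - max(j*stride[1], pl)), 1)
            out[i, j] = win.sum() / denom
    return out

x = np.ones((4, 4))
print(avg_pool2d(x, (2, 2), (2, 2), [(0, 0), (0, 0)]))  # every window averages to 1.0
```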

PoolAvg3d

Performs 3D average pooling on the input activation tensor by averaging a subset of the input tensor values according to the filter_size and stride, effectively downsampling the input data into the output activation tensor.

Average pooling is performed over the 3D spatial shape of the input activation tensor, i.e. over its [depth, height, width] sub-shape.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth, height, width, channel]

Parameters

filter_size

Defines filter size for 3D spatial axes of in[0]. Number of elements to average = filter_size[0] * filter_size[1] * filter_size[2]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [filter_depth, filter_height, filter_width]

stride

Defines stride for 3D spatial axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [depth_stride, height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 3D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3,2] with format [[depth_pad_before, depth_pad_after], [height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

count_pad_for_edges

Include pad elements when calculating the average at the edges. 0 = do not include padding; any other value includes padding in the average calculation.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

rounding_mode

Indicates the rounding mode used when truncating output dimensions to integer values. Available options: 0: FLOOR, 1: CEIL.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • FLOOR = 0,

    • CEIL = 1

Outputs

out[0]

output activation

The output 3D spatial dimensions are functions of the filter_size, stride, pad_amount and rounding_mode.

depth_out = ROUND((pad_amount[0,0] + Shape(in[0])[depth] + pad_amount[0,1] - filter_size[0]) / stride[0] + 1)
height_out = ROUND((pad_amount[1,0] + Shape(in[0])[height] + pad_amount[1,1] - filter_size[1]) / stride[1] + 1)
width_out = ROUND((pad_amount[2,0] + Shape(in[0])[width] + pad_amount[2,1] - filter_size[2]) / stride[2] + 1)
where ROUND = floor() or ceil(), based on rounding_mode.
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth_out, height_out, width_out, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]

PoolMax2d

Performs 2D maximum pooling on the input activation tensor by computing the maximum value in a subset of the input tensor values according to the filter_size and stride, effectively downsampling the input data into the output activation tensor. Maximum pooling is performed over the 2D spatial shape of the input activation tensor, i.e. over its [height, width] sub-shape.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel]

Parameters

filter_size

Defines max pool filter size for 2D spatial axes of in[0]. Number of elements to pool from = filter_size[0] * filter_size[1]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [filter_height, filter_width]

stride

Defines max pool stride size for 2D spatial axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 2D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

rounding_mode

Indicates the rounding mode used when truncating output dimensions to integer values. Available options: 0: FLOOR, 1: CEIL.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • FLOOR = 0,

    • CEIL = 1

Outputs

out[0]

output activation

The output 2D spatial dimensions are functions of the filter_size, stride, pad_amount and rounding_mode.

shape(out[0])[height_out] = ROUND((pad_amount[0,0] + shape(in[0])[height] + pad_amount[0,1] - filter_size[0]) / stride[0] + 1)
shape(out[0])[width_out] = ROUND((pad_amount[1,0] + shape(in[0])[width] + pad_amount[1,1] - filter_size[1]) / stride[1] + 1)
where ROUND = floor() or ceil(), based on rounding_mode.
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height_out, width_out, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]
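
A minimal max-pooling sketch using NumPy's sliding_window_view, with pad_amount and rounding_mode handling omitted for brevity:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def max_pool2d(x, filter_size, stride):
    # Max over each filter window of a [height, width] slice, no padding.
    windows = sliding_window_view(x, filter_size)  # [h', w', fh, fw]
    windows = windows[:: stride[0], :: stride[1]]  # apply strides
    return windows.max(axis=(-2, -1))

x = np.arange(16).reshape(4, 4)
print(max_pool2d(x, (2, 2), (2, 2)))  # [[5, 7], [13, 15]]
```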

PoolMax3d

Performs 3D maximum pooling on the input activation tensor by computing the maximum value in a subset of the input tensor values according to the filter_size and stride, effectively downsampling the input data into the output activation tensor. Maximum pooling is performed over the 3D spatial shape of the input activation tensor, i.e. over its [depth, height, width] sub-shape.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth, height, width, channel]

Parameters

filter_size

Defines max pool filter size for 3D spatial axes of in[0]. Number of elements to pool from = filter_size[0] * filter_size[1] * filter_size[2]

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [filter_depth, filter_height, filter_width]

stride

Defines max pool stride size for 3D spatial axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [depth_stride, height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 3D spatial axes of in[0]. Pad value = 0.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3,2] with format [[depth_pad_before, depth_pad_after], [height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

rounding_mode

Indicates the rounding mode used when truncating output dimensions to integer values. Available options: 0: FLOOR, 1: CEIL.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • FLOOR = 0,

    • CEIL = 1

Outputs

out[0]

output activation

The output 3D spatial dimensions are functions of the filter_size, stride, pad_amount, and rounding_mode.

shape(out[0])[depth_out] = ROUND((pad_amount[0,0] + shape(in[0])[depth] + pad_amount[0,1] - filter_size[0]) / stride[0] + 1)
shape(out[0])[height_out] = ROUND((pad_amount[1,0] + shape(in[0])[height] + pad_amount[1,1] - filter_size[1]) / stride[1] + 1)
shape(out[0])[width_out] = ROUND((pad_amount[2,0] + shape(in[0])[width] + pad_amount[2,1] - filter_size[2]) / stride[2] + 1)
where ROUND = floor() or ceil(), based on rounding_mode.
  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth_out, height_out, width_out, channel]

  • Constraints:

    • Datatype: Same datatype as in[0]

Prelu

The Parametric rectified linear unit operation computes:

out[0] = in[1] * in[0] for in[0] < 0
out[0] = in[0] for in[0] >= 0

The coefficient tensor in[1] is applied element-wise to input activation tensor in[0]. Only unidirectional broadcasting is possible, from in[1] to in[0].

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

in[1]

coefficients

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M, where 0 < M <= N

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]
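
The piecewise definition above reduces to a one-line NumPy sketch, with the coefficient tensor broadcasting onto the input:

```python
import numpy as np

def prelu(x, alpha):
    # out = alpha * x where x < 0, else x; alpha broadcasts onto x.
    return np.where(x < 0, alpha * x, x)

x = np.array([-2.0, -1.0, 0.0, 3.0])
alpha = np.array(0.25)  # a scalar coefficient broadcast over all elements
print(prelu(x, alpha))  # [-0.5, -0.25, 0.0, 3.0]
```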

Quantize

Implements:

output = round(input / scale) + offset

where output is limited to out[0]’s data type maximum and minimum.

Note that scale and offset are determined from out[0].

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

  • Mandatory: true

  • Data type: QNN_DATATYPE_SFIXED_POINT_4, QNN_DATATYPE_UFIXED_POINT_4, QNN_DATATYPE_SFIXED_POINT_8, QNN_DATATYPE_UFIXED_POINT_8, QNN_DATATYPE_SFIXED_POINT_16, QNN_DATATYPE_UFIXED_POINT_16, QNN_DATATYPE_SFIXED_POINT_32, QNN_DATATYPE_UFIXED_POINT_32, backend specific

  • Shape: Any

  • Constraints:

    • Shape: Same shape as in[0]
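
A sketch of the formula with saturation to the output datatype's range. The rounding here is NumPy's round-half-to-even; the exact rounding behavior may be backend specific.

```python
import numpy as np

def quantize(x, scale, offset, dtype=np.uint8):
    # output = round(input / scale) + offset, clamped to the output
    # datatype's range; scale and offset come from out[0]'s quantization params.
    info = np.iinfo(dtype)
    q = np.round(x / scale) + offset
    return np.clip(q, info.min, info.max).astype(dtype)

x = np.array([-1.0, 0.0, 100.0, 300.0], dtype=np.float32)
print(quantize(x, scale=1.0, offset=0))  # [0, 0, 100, 255] after clamping
```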

RandomUniformLike

Generate a tensor with random values drawn from a uniform distribution. If the output tensor has dynamic dimensions, the shape of the output tensor is determined by the input shape tensor. The parameters of the uniform distribution are specified by low and high.

Note that random values drawn from the uniform distribution may vary across backends. Refer to the RandomUniformLike backend definition for per-backend support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Input tensor : a 1D tensor specifying the shape of the expected output tensor. The value of in[0] is not used unless out[0] is of dynamic shape.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [N]

  • Constraints:

    • If out[0] has dynamic dimensions, in[0] must be present. Otherwise, in[0] is ignored.

    • Shape: Rank > 0

in[1]

Seed for the random generator; if not specified, one is auto-generated.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: 1D of shape [1]

  • Default: None

Parameters

low

Lower boundary of the output values.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 0.0

  • Constraints:

    • Value: Must be less than high

high

Upper boundary of the output values.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1.0

  • Constraints:

    • Value: Must be greater than low

Outputs

out[0]

Output tensor with shape specified by in[0] and random values drawn from a uniform distribution.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N
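
The shape/low/high semantics can be illustrated with NumPy's random generator; the actual draw sequence will not match any particular backend.

```python
import numpy as np

shape = np.array([2, 3], dtype=np.uint32)  # plays the role of in[0]
rng = np.random.default_rng(seed=42)       # seeding mirrors in[1]
out = rng.uniform(low=0.0, high=1.0, size=tuple(shape))  # low/high defaults
print(out.shape)  # (2, 3)
```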

ReduceMax

Reduces a tensor by computing the maximum of elements along given dimensions as specified in axes. If keep_dims is true, the reduced dimensions are retained with length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axes.

References:

Inputs

in[0]

Input activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional

  • Constraints:

    • Shape: Rank > 0

Parameters

axes

A list of dimensions along which to reduce. Each value must be in range [0,N-1] and must be listed only once.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N

keep_dims

If true, the resulting tensor has the same number of dimensions as the input tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: K-dimensional, where K = N if keep_dims is true and K = max(1, N - M) otherwise

  • Constraints:

    • Datatype: Same datatype as in[0]
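
The axes/keep_dims semantics, shared by the other Reduce* operations, map directly onto NumPy reductions:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)  # rank N = 3

# Reduce over axes (0, 2); keep_dims controls whether they remain as length 1.
print(np.max(x, axis=(0, 2)).shape)                 # (3,)
print(np.max(x, axis=(0, 2), keepdims=True).shape)  # (1, 3, 1)
```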

ReduceMean

Reduces a tensor by computing the mean of elements along given dimensions as specified in axes. If keep_dims is true, the reduced dimensions are retained with length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axes.

Refer to ReduceMean backend definition for support of dynamic dimensions for each backend. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Input activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

axes

A list of dimensions along which to reduce. Each value must be in range [0,N-1] and must be listed only once.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N

keep_dims

If true, the resulting tensor has the same number of dimensions as the input tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: K-dimensional, where K = N if keep_dims is true and K = max(1, N - M) otherwise.

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each non reduced dimension, if shape(in[0])[i] is dynamic then the corresponding dimension in out[0] must be dynamic. If keep_dims is set to true, the reduced dimensions are retained as static dimensions of length 1.

ReduceMin

Reduces a tensor by computing the minimum of elements along given dimensions as specified in axes. If keep_dims is true, the reduced dimensions are retained with length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axes.

References:

Inputs

in[0]

Input activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional

  • Constraints:

    • Shape: Rank > 0

Parameters

axes

A list of dimensions along which to reduce. Each value must be in range [0,N-1] and must be listed only once.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N

keep_dims

If true, the resulting tensor has the same number of dimensions as the input tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: K-dimensional, where K = N if keep_dims is true and K = max(1, N - M) otherwise

  • Constraints:

    • Datatype: Same datatype as in[0]

ReduceProd

Reduces a tensor by computing the product of elements along given dimensions as specified in axes. If keep_dims is true, the reduced dimensions are retained with length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axes.

References:

Inputs

in[0]

Input activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional

  • Constraints:

    • Shape: Rank > 0

Parameters

axes

A list of dimensions along which to reduce. Each value must be in range [0,N-1] and must be listed only once.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N

keep_dims

If true, the resulting tensor has the same number of dimensions as the input tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: K-dimensional, where K = N if keep_dims is true and K = max(1, N - M) otherwise

  • Constraints:

    • Datatype: Same datatype as in[0]

ReduceSum

Reduces a tensor by computing the sum of elements along given dimensions as specified in axes. If keep_dims is true, the reduced dimensions are retained with length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axes.

Refer to ReduceSum backend definition for support of dynamic dimensions for each backend. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Input activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

axes

A list of dimensions along which to reduce. Each value must be in range [0,N-1] and must be listed only once.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N

keep_dims

If true, the resulting tensor has the same number of dimensions as the input tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank K, where K = N if keep_dims is true and K = max(1, N - M) otherwise

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each non-reduced dimension, if shape(in[0])[i] is dynamic then the corresponding dimension in out[0] must be dynamic. If keep_dims is set to true, the reduced dimensions are retained as static dimensions of length 1.

ReduceSumSquare

Reduces a tensor by computing the sum square of elements along given dimensions as specified in axes. If keep_dims is true, the reduced dimensions are retained with length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axes. If rank(in[0]) = 0, no reduction occurs and the output is simply in[0] squared.

References:

Inputs

in[0]

Input activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: N-dimensional

Parameters

axes

A list of dimensions along which to reduce. Each value must be in range [0,N-1] and must be listed only once. If rank(in[0]) = 0, this param is ignored.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N unless rank(in[0]) = 0

keep_dims

If true, the resulting tensor has the same number of dimensions as the input tensor. If rank(in[0]) = 0, this param is ignored.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output activations

  • Mandatory: true

  • Data type: backend specific

  • Shape: K-dimensional, where K = N if keep_dims is true and K = max(0, N - M) otherwise, where N = rank(in[0]) and M = size(axes)

  • Constraints:

    • Datatype: Same datatype as in[0]

Relu

The Rectified linear unit operation computes:

output = max(0, input).

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

    • Must have same data format as in[0] (e.g. both sparse or both dense)

Relu1

The Rectified linear 1 unit operation computes:

output = min(1.f, max(-1.f, input)).

DEPRECATED: Use ReluMinMax instead.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

Relu6

The Rectified linear 6 unit operation computes:

output = min(6, max(0, input)).

DEPRECATED: Use ReluMinMax instead.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

ReluMinMax

The Rectified Linear Unit Min Max operation computes:

output = min(max_value, max(min_value, input)), where min_value <= max_value

The ReluMinMax rectifies values within min_value and max_value. It can support different types of Relu operations. For example, for Relu6, use min_value=0 and max_value=6.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

min_value

The minimum value in Relu operation

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

max_value

The maximum value in Relu operation

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]
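
The operation is a clamp; a minimal NumPy sketch showing how Relu6 falls out of (min_value, max_value) = (0, 6):

```python
import numpy as np

def relu_min_max(x, min_value, max_value):
    # output = min(max_value, max(min_value, input));
    # (0, 6) reproduces Relu6 and (-1, 1) reproduces Relu1.
    return np.clip(x, min_value, max_value)

x = np.array([-3.0, 0.5, 7.0])
print(relu_min_max(x, 0.0, 6.0))  # [0.0, 0.5, 6.0]
```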

Reshape

Reshapes a tensor of N dimensions to an output shape of M dimensions while retaining the values of the input tensor. The number of elements implied by the output shape must be the same as the number of elements in the input tensor. Note that N can equal M.

Refer to the Reshape backend definition for per-backend support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension

  • Dynamic Shape: All dimensions can be dynamic.

in[1]

Determines the shape of out[0]. Note that 0 and -1 are valid values: a 0 is interpreted as the corresponding dimension of in[0], and a -1, which may occur at most once if in[0] is dynamic, is a wildcard dimension that is inferred from the remaining dimensions and the number of elements in in[0].

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • If out[0] has dynamic dimensions, in[1] must be present. Otherwise, in[1] is ignored.

    • Value: All values of 0 have to have an index < rank(in[0]).

    • Value: May contain a single -1 value if in[0] has dynamic dimensions.

    • Value: Values must be >= -1.

    • Value: The specified shape must be compatible with shape(in[0]) in keeping the number of elements the same.
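
The handling of 0 and -1 values can be sketched in pure Python (illustrative only; `resolve_reshape_shape` is a hypothetical helper, not a QNN API):

```python
def resolve_reshape_shape(in_shape, target):
    # A 0 copies the corresponding input dimension (its index must be < rank(in[0])).
    out = [in_shape[i] if d == 0 else d for i, d in enumerate(target)]
    total = 1
    for d in in_shape:
        total *= d
    # A single -1 wildcard is inferred from the remaining element count.
    if out.count(-1) == 1:
        known = 1
        for d in out:
            if d != -1:
                known *= d
        out[out.index(-1)] = total // known
    # The resolved shape must keep the element count the same.
    prod = 1
    for d in out:
        prod *= d
    assert prod == total
    return out

print(resolve_reshape_shape([2, 3, 4], [0, -1]))  # → [2, 12]
```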

Parameters

None

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of M dimension

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Shape(out[0])[i] == Shape(in[0])[i] if in[1][i] == 0

    • Dynamic Shape: Must have dynamic dimensions if in[0] has dynamic dimensions.

Resize

Resizes the spatial dimensions of an input tensor with shape [batch, D1, D2, …, Dn, channel], where D1, …, Dn are the spatial dimensions. Every value of the output tensor is calculated as a weighted average of sampling locations in the input tensor.

References:

Inputs

in[0]

Input image

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension with shape [batch, D1, D2, … Dn, channel], where D1, …, Dn are the spatial dimensions.

  • Constraints:

    • Shape: Rank > 0

Parameters

exclude_outside

If true, the weight of sampling locations outside the tensor will be set to 0 and the remaining weights will be renormalized so that they sum to 1.0.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

transformation_mode

Determines how to transform the coordinates in the original input tensor to the coordinates in the resized tensor. Note that the coordinates of each dimension are transformed individually. Supported values are 0: HALF_PIXEL, 1: PYTORCH_HALF_PIXEL, 2: ALIGN_CORNERS, 3: ASYMMETRIC.

When transformation_mode = HALF_PIXEL:

\[x_{out} = (x_{in} + 0.5) * \mbox{scale} - 0.5\]

When transformation_mode = PYTORCH_HALF_PIXEL:

\[\begin{split}x_{out} &= (x_{in} + 0.5) * \mbox{scale} - 0.5\ (\text{if}\ \text{shape} (\mbox{out[0]})[axis_{x}] > 1) \\ x_{out} &= 0\ (\text{otherwise})\end{split}\]

When transformation_mode = ALIGN_CORNERS:

\[x_{out} = \frac{x_{in} * (\text{shape}(\mbox{out[0]})[axis_{x}] - 1)}{\text{shape} (\mbox{in[0]})[axis_{x}] - 1}\]

When transformation_mode = ASYMMETRIC:

\[x_{out} = x_{in} * \mbox{scale}\]

where

\(x_{in}\) is a coordinate of \(\text{shape} (\mbox{in[0]})[axis_{x}]\),

\(x_{out}\) is a coordinate of \(\text{shape} (\mbox{out[0]})[axis_{x}]\),

\(\mbox{scale} = \frac{\text{shape} (\mbox{out[0]})[axis_{x}]}{\text{shape} (\mbox{in[0]})[axis_{x}]}\).

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • HALF_PIXEL = 0,

    • PYTORCH_HALF_PIXEL = 1,

    • ALIGN_CORNERS = 2,

    • ASYMMETRIC = 3
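
The four coordinate transformations above can be sketched in pure Python (illustrative only; in_len and out_len are the input and output sizes along one spatial axis):

```python
def transform_coord(x_in, in_len, out_len, mode):
    # Maps a coordinate along one spatial axis; scale = out_len / in_len.
    scale = out_len / in_len
    if mode == "HALF_PIXEL":
        return (x_in + 0.5) * scale - 0.5
    if mode == "PYTORCH_HALF_PIXEL":
        return (x_in + 0.5) * scale - 0.5 if out_len > 1 else 0.0
    if mode == "ALIGN_CORNERS":
        return x_in * (out_len - 1) / (in_len - 1)
    if mode == "ASYMMETRIC":
        return x_in * scale
    raise ValueError(mode)

print(transform_coord(1, 4, 8, "HALF_PIXEL"))  # → 2.5
```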

interpolation_mode

Determines the interpolation method. Supported values are 0: NEAREST, 1: LINEAR, 2: CUBIC.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • NEAREST = 0,

    • LINEAR = 1,

    • CUBIC = 2

nearest_mode

Determines the rounding method used when interpolation_mode is set to NEAREST. Supported values are 0: ROUND_PREFER_FLOOR, 1: ROUND_PREFER_CEIL, 2: FLOOR, 3: CEIL.

For the following examples, let x represent a value in the range (0, 1].

When QNN_OP_RESIZE_PARAM_NEAREST_MODE is set to QNN_OP_RESIZE_NEAREST_MODE_ROUND_PREFER_FLOOR:
floor(x) (if x <= 0.5)
ceil(x) otherwise.

When QNN_OP_RESIZE_PARAM_NEAREST_MODE is set to QNN_OP_RESIZE_NEAREST_MODE_ROUND_PREFER_CEIL:
ceil(x) (if x >= 0.5)
floor(x) otherwise.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • ROUND_PREFER_FLOOR = 0,

    • ROUND_PREFER_CEIL = 1,

    • FLOOR = 2,

    • CEIL = 3

  • Constraints:

    • interpolation_mode must be set to NEAREST for this parameter to be valid.
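
The four rounding modes can be sketched in pure Python (illustrative only; the preference rules are applied to the fractional part of the sampling coordinate):

```python
import math

def nearest_round(x, mode):
    # frac is the fractional part of the sampling coordinate.
    frac = x - math.floor(x)
    if mode == "ROUND_PREFER_FLOOR":
        return math.floor(x) if frac <= 0.5 else math.ceil(x)
    if mode == "ROUND_PREFER_CEIL":
        return math.ceil(x) if frac >= 0.5 else math.floor(x)
    if mode == "FLOOR":
        return math.floor(x)
    return math.ceil(x)  # CEIL

print(nearest_round(2.5, "ROUND_PREFER_FLOOR"))  # → 2
```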

cubic_coeff

Coefficient used in cubic interpolation.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: -0.75

  • Constraints:

    • interpolation_mode must be set to CUBIC for this parameter to be valid.

Outputs

out[0]

Output image

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension after resizing.

  • Constraints:

    • Datatype: Same datatype as in[0]

ResizeBilinear

Resize a 4D image in the height and width dimensions, computing new pixel values by bilinear interpolation. Image will be distorted if the aspect ratio of the output does not match the aspect ratio of the input. The output height and width are defined by the shape of the output tensor, and not computed.

References:

Inputs

in[0]

Input image

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,height,width,depth]

Parameters

align_corners

If true, the centers of the 4 corner pixels of the input and output tensors are aligned, preserving the values at the corner pixels.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: If true, the half_pixel_centers parameter MUST be false

half_pixel_centers

True or False, where value 0 is False and any other value is True. If true, pixel centers are assumed to be at half-pixel offsets (0.5, 0.5).

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: If true, the align_corners parameter MUST be false

antialias

Determines if an antialiasing filter is used when downsampling. If true, a resampling filter whose support is scaled by a factor of max(1, 1/scale) is used during downsampling, so that more input pixels contribute to each output pixel.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output image

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,new_height,new_width,depth]

  • Constraints:

    • Datatype: Same datatype as in[0]

ResizeNearestNeighbor

Resize a 4D image in the height and width dimensions, computing new pixel values by nearest-neighbor sampling. Image will be distorted if the aspect ratio of the output does not match the aspect ratio of the input. The output height and width are defined by the shape of the output tensor, and not computed.

References:

Inputs

in[0]

Input image

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,height,width,depth]

Parameters

align_corners

If true, the centers of the 4 corner pixels of the input and output tensors are aligned, preserving the values at the corner pixels.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: If true, the half_pixel_centers parameter MUST be false

half_pixel_centers

True or False, where value 0 is False and any other value is True. If true, pixel centers are assumed to be at half-pixel offsets (0.5, 0.5).

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: If true, the align_corners parameter MUST be false

Outputs

out[0]

Output image

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,new_height,new_width,depth]

  • Constraints:

    • Datatype: Same datatype as in[0]

RmsNorm

Applies Root Mean Square normalization to the input tensor across the dimensions determined by axes. RMS normalization provides re-scaling invariance by normalizing the summed inputs without re-centering them.

Values in the output tensor are computed as

\[\mbox{out[0]}[b,h,w,c] = \frac{\mbox{in[0]}[b,h,w,c] * \gamma}{\sqrt{\frac{1}{n}\sum_{axes}^n(\mbox{in[0]}[b,h,w,c])^2 + \epsilon}} + \beta\]

where n is the number of elements summed across the dimensions provided by axes, gamma (\(\gamma\)) and beta (\(\beta\)) are inputs, and epsilon (\(\epsilon\)) is a parameter.
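
For the single-axis case the computation can be sketched in pure Python (illustrative only; gamma and beta are scalars here rather than broadcast tensors):

```python
import math

def rms_norm(x, gamma=1.0, beta=0.0, epsilon=1e-6):
    # Normalize one slice by its root mean square, then scale and shift.
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + epsilon)
    return [v * inv_rms * gamma + beta for v in x]
```

With gamma = 1 and beta = 0 the output slice has a mean square of (approximately) one.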

Refer to RmsNorm backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N, note that the last dimension in the input is the channel, [.., channel].

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

in[1]

gamma (\(\gamma\)) used for scaling normalized values. It is applied element-wise across the channel dimension of the input activation in[0]. This op supports Unidirectional broadcasting from in[1] to in[0]. The i-th dimension of gamma (\(\gamma\)) is given by

\[\mbox{shape}(\gamma)[\mbox{i}] = \mbox{shape}(\mbox{in[0]})[\mbox{axes[i]}]\]

  • Mandatory: false

  • Data type: backend specific

  • Shape: a tensor of rank M, where M = size(axes)

  • Dynamic Shape: All dimensions can be dynamic.

  • Default: {1,..,1}

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[0])[axes[i]] is dynamic then shape(in[1])[i] must be dynamic or must be compatible for broadcasting.

in[2]

beta (\(\beta\)) used for re-centering, which is disabled by default. It is applied element-wise across the channel dimension of the normalized input activation in[0]. This op supports Unidirectional broadcasting from in[2] to in[0]. The i-th dimension of beta (\(\beta\)) is given by

\[\mbox{shape}(\beta)[\mbox{i}] = \mbox{shape}(\mbox{in[0]})[\mbox{axes[i]}]\]

  • Mandatory: false

  • Data type: backend specific

  • Shape: a tensor of rank M, where M = size(axes)

  • Dynamic Shape: All dimensions can be dynamic.

  • Default: {0,..,0}

  • Constraints:

    • Dynamic Shape: For each dimension, if shape(in[0])[axes[i]] is dynamic then shape(in[2])[i] must be dynamic or must be compatible for broadcasting.

Parameters

epsilon (\(\epsilon\))

epsilon (\(\epsilon\)) is used for numerical stability, preventing division by zero.

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1e-06

axes

Determines the dimensions that Root Mean Square normalization is applied to.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M]

  • Constraints:

    • Shape: 0 < M <= N

    • Value: Must be in range [0,N-1]

    • Value: Must be unique

Outputs

out[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic, then shape(out[0])[i] must be dynamic.

RoiAlign

Extract rectangular Regions of Interest from a feature map, and scale them to a uniform size by average pooling of bilinearly interpolated sample points within the region.

References:

Inputs

in[0]

Input feature map

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,height,width,num_features]

in[1]

RoIs : Elements may be interpreted as 4-tuples of (x1,y1,x2,y2) giving the upper-left and bottom-right corners of the RoI.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois,4]

in[2]

Batch index in the feature map to which each RoI corresponds. Positions in this input correspond to the same box as the same position in in[1].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32, backend specific

  • Shape: [num_rois]

Parameters

img_size_ratio

The ratio between the original image and the input feature map, in the form [height_ratio, width_ratio].

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: [2] with elements [height_ratio, width_ratio]

num_samples_y

The number of interpolated sample points to use in the height dimension for each RoI.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Default: -1

num_samples_x

The number of interpolated sample points to use in the width dimension for each RoI.

  • Mandatory: false

  • Data type: QNN_DATATYPE_INT_32

  • Shape: scalar

  • Default: -1

aligned

If true, shift the box coordinates by -0.5 for a better alignment with the neighboring pixels.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

allow_invalid_roi

When set to true invalid RoIs of in[1] are allowed. An invalid RoI is defined as having coordinate values where (x2 - x1 = 0) and (y2 - y1 = 0). Note that the corresponding feature map in out[0] is set to all zeros for the invalid RoI.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output feature map

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois,out_height,out_width,num_features]. The maximum and current values of num_rois may differ; the current value will be updated by the backend.

  • Constraints:

    • Datatype: Same datatype as in[0]

RoiPooling

Extract rectangular Regions of Interest from a feature map, and scale them to a uniform size by max pooling of bilinearly interpolated sample points within the region.

References:

Inputs

in[0]

Input feature map

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch,height,width,num_features]

in[1]

RoIs. Elements may be interpreted as 4-tuples of (x1,y1,x2,y2) giving the upper-left and bottom-right corners of the RoI.

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois,4]

in[2]

Batch index in the feature map to which each RoI corresponds. Positions in this input correspond to the same box as the same position in in[1].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [num_rois]

Parameters

img_size_ratio

The ratio between the original image and the input feature map, in the form [height_ratio, width_ratio].

  • Mandatory: true

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: [2] with elements [height_ratio, width_ratio]

Outputs

out[0]

Output feature map

  • Mandatory: true

  • Data type: backend specific

  • Shape: [num_rois,out_height,out_width,num_features]. The maximum and current values of num_rois may differ; the current value will be updated by the backend.

  • Constraints:

    • Datatype: Same datatype as in[0]

ScatterElements

Takes three input tensors (data, indices, and updates) of the same rank N and an optional axis parameter to create an output tensor. out[0] is produced by creating a copy of in[0] and then updating it with the values specified by in[2] at the index positions specified by in[1]. Note that for each entry in updates, the target index in out[0] is the same as the index of the entry itself, except for the index value in the axis dimension, which is taken from indices.

An example for the 3-D tensor case with reduction set to QNN_OP_SCATTER_ELEMENTS_REDUCTION_NONE:

output[indices[i][j][k]][j][k] = updates[i][j][k] if axis = 0,
output[i][indices[i][j][k]][k] = updates[i][j][k] if axis = 1,
output[i][j][indices[i][j][k]] = updates[i][j][k] if axis = 2.

An example for the 3-D tensor case with reduction set to QNN_OP_SCATTER_ELEMENTS_REDUCTION_ADD:

output[indices[i][j][k]][j][k] += updates[i][j][k] if axis = 0,
output[i][indices[i][j][k]][k] += updates[i][j][k] if axis = 1,
output[i][j][indices[i][j][k]] += updates[i][j][k] if axis = 2.

An example for the 3-D tensor case with reduction set to QNN_OP_SCATTER_ELEMENTS_REDUCTION_MUL:

output[indices[i][j][k]][j][k] *= updates[i][j][k] if axis = 0,
output[i][indices[i][j][k]][k] *= updates[i][j][k] if axis = 1,
output[i][j][indices[i][j][k]] *= updates[i][j][k] if axis = 2.

An example for the 3-D tensor case with reduction set to QNN_OP_SCATTER_ELEMENTS_REDUCTION_MAX:

output[indices[i][j][k]][j][k] = max(output[indices[i][j][k]][j][k], updates[i][j][k]) if axis = 0,
output[i][indices[i][j][k]][k] = max(output[i][indices[i][j][k]][k], updates[i][j][k]) if axis = 1,
output[i][j][indices[i][j][k]] = max(output[i][j][indices[i][j][k]], updates[i][j][k]) if axis = 2.

Note that for indices generated from other Operations (e.g. NonZero) we permit -1 to be provided as a value to indicate an index for ScatterElements Operation to skip/ignore.
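
The 1-D (axis = 0) case of these reduction rules can be sketched in pure Python (illustrative only, not the QNN API; -1 entries in indices are skipped as described above):

```python
def scatter_elements_1d(data, indices, updates, reduction="NONE"):
    # Copy the input, then apply each update at the index it names.
    out = list(data)
    for i, idx in enumerate(indices):
        if idx == -1:  # -1 marks an entry to skip/ignore
            continue
        if reduction == "NONE":
            out[idx] = updates[i]
        elif reduction == "ADD":
            out[idx] += updates[i]
        elif reduction == "MUL":
            out[idx] *= updates[i]
        elif reduction == "MAX":
            out[idx] = max(out[idx], updates[i])
    return out

print(scatter_elements_1d([1, 2, 3, 4], [0, 2], [10, 30], "ADD"))  # → [11, 2, 33, 4]
```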

References:

Inputs

in[0]

input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

in[1]

indices : contains the index values used in the axis dimension to scatter updates.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32

  • Shape: a tensor of rank N

  • Constraints:

    • Value: Indices must be in range [0, Shape(in[0])[axis] - 1]

in[2]

updates : values to scatter into out[0].

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Same shape as in[1]

    • Datatype: Same datatype as in[0]

Parameters

axis

The axis of out[0] to scatter on.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: Must be in range [0, N - 1]

reduction

Operation that is applied to all update values of in[2] to out[0] at the specified indices from in[1].

When reduction is set to “NONE”, values in in[2] are assigned to out[0] at the indices specified by in[1]; duplicate indices will override previously updated values.

When reduction is set to “ADD”, values in in[2] are added to the values of out[0] at the indices specified by in[1]; duplicate indices are allowed.

When reduction is set to “MUL”, values in in[2] are multiplied into the values of out[0] at the indices specified by in[1]; duplicate indices are allowed.

When reduction is set to “MAX”, the maximum of the values in in[2] and the values of out[0] is assigned to out[0] at the indices specified by in[1]; duplicate indices are allowed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • NONE = 0,

    • ADD = 1,

    • MUL = 2,

    • MAX = 3

Outputs

out[0]

Output data

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Same shape as in[0]

    • Datatype: Same datatype as in[0]

ScatterNd

Takes three input tensors: data, indices, and updates to create an output tensor. out[0] is produced by creating a copy of in[0] and then updating it with values that are specified by in[2] at the index positions specified by in[1].

Note that for indices generated from other Operations (e.g. NonZero) we permit -1 to be provided as a value to indicate an index for ScatterNd Operation to skip/ignore.

References:

Inputs

in[0]

input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n

  • Constraints:

    • Shape: Rank > 0

in[1]

Tensor of indices used to scatter updates into the output tensor. Note that when Shape(in[1])[-1] is equal to n, each value of in[2] applies to a single element of out[0]. Otherwise, when Shape(in[1])[-1] is less than n, each value of in[2] applies to a slice of the output tensor.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32, QNN_DATATYPE_UINT_32

  • Shape: a tensor of rank k

  • Constraints:

    • Shape: Rank > 0

    • Shape: Shape(in[1])[-1] must be >= 0 and <= n

    • Value: Indices must be non-negative and within the range of the corresponding dimension of in[0]

in[2]

Updates to scatter into the output tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n + k - Shape(in[1])[-1] - 1, where n is the rank of in[0] and k is the rank of in[1].

  • Constraints:

    • Shape: Rank = n + k - Shape(in[1])[-1] - 1

    • Shape: The number of slices/elements to update should be equal to the number of indices provided.

Parameters

reduction

Operation that is applied to all update values of in[2] to out[0] at the specified indices from in[1].

When reduction is set to “NONE”, values in in[2] are assigned to out[0] at the indices specified by in[1]; duplicate indices will override previously updated values.

When reduction is set to “ADD”, values in in[2] are added to the values of out[0] at the indices specified by in[1]; duplicate indices are allowed.

When reduction is set to “MUL”, values in in[2] are multiplied into the values of out[0] at the indices specified by in[1]; duplicate indices are allowed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • NONE = 0,

    • ADD = 1,

    • MUL = 2

Outputs

out[0]

Output data

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n

  • Constraints:

    • Shape: Same shape as in[0]

    • Datatype: Same datatype as in[2]

Shape

Generates a 1D output tensor containing the shape of the input tensor. Parameters start and end can be used to compute a slice of the input tensor’s shape.

References:

Inputs

in[0]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension

  • Constraints:

    • Shape: Rank > 0

Parameters

start

Starting axis for slicing the shape of the input tensor. Note that the start axis is inclusive and will include the size of the start axis in the output tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: start must be in range [0, N-1]

end

Ending axis for slicing the shape of the input tensor. Note that the end axis specified is exclusive and will not include the size of the end axis in the output tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: N

  • Constraints:

    • Value: end must be in range [start + 1, N]

Outputs

out[0]

Shape of the input tensor.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [M] where M = end - start
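
A pure-Python sketch of the Shape semantics (illustrative only; start is inclusive and end is exclusive, matching the parameter definitions above):

```python
def shape_op(in_shape, start=0, end=None):
    # Returns shape(in[0])[start:end] as a 1-D list; end defaults to rank N.
    if end is None:
        end = len(in_shape)
    return in_shape[start:end]

print(shape_op([2, 3, 4, 5], 1, 3))  # → [3, 4]
```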

Sigmoid

Computes the sigmoid activation function elementwise on an input tensor. The sigmoid function is defined as

sigmoid(x) = 1 / (1 + exp(-x))

where exp is exponentiation by the base of the natural logarithm.
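
A direct pure-Python sketch of the element-wise computation (illustrative only):

```python
import math

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + exp(-x)), applied per element
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))  # → 0.5
```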

Refer to Sigmoid backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Input feature map.

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

Output feature map.

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.

Softmax

Computes a normalized exponential (softmax) over the input tensor, given an optional positive scaling factor, beta. The computation is done element-wise per batch along the specified axis.
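
The computation over one slice along the softmax axis can be sketched in pure Python (illustrative only; a list stands in for the slice):

```python
import math

def softmax(x, beta=1.0):
    # Normalized exponential with scaling factor beta.
    exps = [math.exp(beta * v) for v in x]
    total = sum(exps)
    return [e / total for e in exps]
```

For numerical stability, real implementations typically subtract max(x) before exponentiation; the result is mathematically identical.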

Refer to Softmax backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: N-1

  • Constraints:

    • Value: must be in range [0, N-1]

beta

  • Mandatory: false

  • Data type: QNN_DATATYPE_FLOAT_32

  • Shape: scalar

  • Default: 1.0

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of N dimension

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Same shape as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then shape(out[0])[i] must be dynamic.

SpaceToBatch

A type of tensor realignment operation that rearranges blocks of spatial data into the batch dimension.

The op moves blocks of data of size (block_size[0] * block_size[1]) from the height and width dimensions of the input tensor into the batch dimension of the output tensor after optional padding.

References:

Inputs

in[0]

Input Activation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [batch, height, width, depth]

  • Constraints:

    • Shape: height must be divisible by block_size[0] after any paddings have been applied.

    • Shape: width must be divisible by block_size[1] after any paddings have been applied.

Parameters

block_size

Vector that represents block size along the height and width dimensions respectively.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [block_height, block_width]

  • Constraints:

    • Value: Elements must be >=1

pad_amount

Paddings along the height and width dimensions of the input tensor.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[pad_top, pad_bottom], [pad_left, pad_right]]

  • Default: [[0, 0], [0, 0]]

Outputs

out[0]

Output Activation.

Permuted output tensor with new spatial dimensions [output_height, output_width] defined by

output_height = (height + pad_top + pad_bottom) / block_size[0]
output_width = (width + pad_left + pad_right) / block_size[1]

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [(batch * block_size[0] * block_size[1]), output_height, output_width, depth]

  • Constraints:

    • Datatype: Same datatype as in[0]
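
The output shape computation can be sketched in pure Python (illustrative only; shapes are plain lists):

```python
def space_to_batch_shape(shape, block_size, pad_amount):
    # [batch, height, width, depth] -> padded spatial dims divided by the block,
    # with the block moved into the batch dimension.
    batch, height, width, depth = shape
    (pad_top, pad_bottom), (pad_left, pad_right) = pad_amount
    out_h = (height + pad_top + pad_bottom) // block_size[0]
    out_w = (width + pad_left + pad_right) // block_size[1]
    return [batch * block_size[0] * block_size[1], out_h, out_w, depth]

print(space_to_batch_shape([1, 4, 4, 3], [2, 2], [[0, 0], [0, 0]]))  # → [4, 2, 2, 3]
```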

SpaceToDepth

A type of tensor realignment operation that rearranges blocks of spatial data into depth.

The op moves blocks of data of size (block_size[0] * block_size[1]) from the height and width dimensions of the input tensor into the depth dimension of the output tensor.

References:

Inputs

in[0]

Input Activation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [batch, height, width, depth]

  • Constraints:

    • Shape: height must be divisible by block_size[0]

    • Shape: width must be divisible by block_size[1]

Parameters

block_size

Vector that represents block size along the height and width dimensions respectively.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [block_height, block_width]

  • Constraints:

    • Value: Elements must be >=1

mode

Specifies the order in which elements of in[0] are rearranged. If QNN_OP_SPACE_TO_DEPTH_PARAM_MODE is set to QNN_OP_SPACE_TO_DEPTH_MODE_DCR then elements along the depth dimension are rearranged in the order of depth, column, and then row; if set to QNN_OP_SPACE_TO_DEPTH_MODE_CRD elements along the depth dimension are rearranged in the order of column, row, and then depth.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Values:

    • DCR = 0,

    • CRD = 1

Outputs

out[0]

Output Activation.

  • Mandatory: true

  • Data type: backend specific

  • Shape: 4D tensor of shape [batch, (height / block_size[0]), (width / block_size[1]), (depth * block_size[0] * block_size[1])]

  • Constraints:

    • Datatype: Same datatype as in[0]

SparseToDense

Convert a sparse tensor to a dense tensor.

References:

Inputs

in[0]

input

  • Mandatory: true

  • Data type: backend specific

  • Shape: tensor of rank N

  • Constraints:

    • Must be a sparse tensor.

Parameters

None

Outputs

out[0]

output

  • Mandatory: true

  • Data type: backend specific

  • Shape: tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Must not be a sparse tensor.

    • Shape: Same shape as in[0]

Split

Splits input tensor along a given axis into multiple output tensors according to split_index.

References:

Inputs

in[0]

Input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

Specifies axis to split on.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: must be in range [0, N-1]

split_index

1-D tensor specifying the starting index of each split slice along axis. Index values must be in range [1, shape(in[0])[axis]-1] and split_index[i+1] > split_index[i]. axis is split into size(split_index)+1 slices.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [M]

  • Constraints:

    • Shape: M <= shape(in[0])[axis]

    • Value: Must be in range [1, shape(in[0])[axis]-1]
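
The axis = 0, 1-D case can be sketched in pure Python (illustrative only; M start indices produce M+1 slices):

```python
def split_1d(x, split_index):
    # Slice boundaries are 0, the given start indices, and the axis length.
    bounds = [0] + list(split_index) + [len(x)]
    return [x[bounds[i]:bounds[i + 1]] for i in range(len(bounds) - 1)]

print(split_1d([1, 2, 3, 4, 5], [2, 3]))  # → [[1, 2], [3], [4, 5]]
```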

Outputs

out[0..m]

Resulting (M+1) output data tensors.

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: true

  • Data type: backend specific

  • Shape: rank(out[m]) = rank(in[0]), where the sum over m of shape(out[m])[axis] equals shape(in[0])[axis]

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same rank as in[0]

Squeeze

Removes dimensions of size 1 from the shape of the input tensor in[0]. The number of elements implied by the output tensor must be the same as in the input tensor. This functionality can also be achieved using the Reshape operation. Note that the user can prevent removal of certain static dimensions of size 1 by expressing them as such in shape(out[0]) when axes is not provided.
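
The shape computation (ignoring the dynamic-dimension cases) can be sketched in pure Python (illustrative only):

```python
def squeeze_shape(shape, axes=None):
    if axes is None:
        # Remove all dimensions of size 1.
        return [d for d in shape if d != 1]
    # Each listed axis must actually have size 1.
    for a in axes:
        assert shape[a] == 1
    return [d for i, d in enumerate(shape) if i not in axes]

print(squeeze_shape([1, 3, 1, 4]))  # → [3, 4]
```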

Refer to Squeeze backend definition per backend for support of dynamic dimensions. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: Rank = N, N > 1

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 1

Parameters

axes

Indices specifying the dimensions to squeeze.

If axes is not provided all static dimensions of size 1 not specified in shape(out[0]) are removed.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: 1D of shape [K]

  • Default: None

  • Constraints:

    • Value: Must be in range [0, N)

    • Value: Must be unique

    • Shape: shape(in[0])[axes[i]] must equal 1

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank M

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: M <= rank(in[0])

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic then the corresponding dimension of out[0] after being squeezed must be dynamic.

Stft

Computes the Short-time Fourier Transform of the signal.

References:

Inputs

in[0]

input signal

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, signal_length, 1]

in[1]

window : The window being slid over the signal. Note that the window is the size of each frame the STFT will process.

  • Mandatory: false

  • Data type: backend specific

  • Shape: [window_size], where window_size = frame_length when not provided.

  • Default: {1,..,1}

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: window_size must be equal to frame_length.

Parameters

frame_step

The number of samples to step between successive DFTs (Discrete Fourier Transforms).

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be > 0

frame_length

The size of the DFT (Discrete Fourier Transform).

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: window_size if window input is provided. Otherwise, signal_length.

  • Constraints:

    • Value: Must be <= signal_length

onesided

When onesided is set to 1: only values for frequencies in the range [0, floor(n_fft/2)] (i.e., floor(n_fft/2) + 1 values) are returned. Note this is based on the conjugate symmetry property of the Fourier transform for real-valued signals, where X[m,w] = conj(X[m, n_fft - w]).

When onesided is set to 0: the complete representation of the frequency content of the signal, including both positive and negative frequencies, is provided.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

Outputs

out[0]

output activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, frames, frame_length//2 + 1, 2] if onesided is set to 1. Otherwise, [batch, frames, frame_length, 2]. Note frames = ((signal_length - frame_length)/frame_step) + 1

  • Constraints:

    • Datatype: Same datatype as in[0]
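The framing and output-shape rules above can be sketched with NumPy. This is an illustrative sketch under the default parameters (batch dimension omitted for brevity; the `stft` helper name is hypothetical, not the QNN API):

```python
import numpy as np

def stft(signal, frame_length, frame_step, window=None, onesided=True):
    """Minimal STFT sketch matching the shape rules above (illustrative only)."""
    if window is None:
        window = np.ones(frame_length)  # default window of all ones
    # frames = ((signal_length - frame_length) / frame_step) + 1
    frames = (len(signal) - frame_length) // frame_step + 1
    out = []
    for m in range(frames):
        frame = signal[m * frame_step : m * frame_step + frame_length] * window
        spec = np.fft.rfft(frame) if onesided else np.fft.fft(frame)
        # Last axis holds the (real, imaginary) pair.
        out.append(np.stack([spec.real, spec.imag], axis=-1))
    return np.stack(out)  # [frames, frame_length//2 + 1, 2] when onesided

sig = np.zeros(256)
print(stft(sig, frame_length=64, frame_step=32).shape)  # (7, 33, 2)
```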

StridedSlice

Extract “slices” from a tensor by sampling in each dimension. Parameters are applied in the following order: new_axes_mask, ranges/end_mask/begin_mask, and then shrink_axes. new_axes_mask is applied to the input tensor to create an intermediate tensor, and all other parameters are applied with respect to this intermediate tensor. If new_axes_mask is not provided then these parameters are applied directly to the input tensor. Equivalent to python:

out = in[begin[0]:end[0]:stride[0],...,begin[n-1]:end[n-1]:stride[n-1]]

Refer to StridedSlice backend definition for support of dynamic dimensions for each backend. Backends do not support dynamic dimensions unless stated otherwise.

References:

Inputs

in[0]

Input data

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank n

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

ranges

Specifies the slice range for each axis, in the form (begin,end,stride). All negative values for begin[i] and end[i] have shape(in[0])[i] added to them and then clipped to the following ranges depending on the stride value:

When stride > 0: Begin values are clipped to the range [0, shape(in[0])[i]] and End values are clipped to the range [0, shape(in[0])[i]].

When stride < 0: Begin values are clipped to the range [0, shape(in[0])[i] - 1] and End values are clipped to the range [-1, shape(in[0])[i] - 1]. For a negative stride to include index 0, we permit the end of the range to be -1. Note that ranges extend “one past the end” similar to a C++ iterator, but the slice will include the begin index.

Note when slicing a dynamic dimension to the end it is recommended to use INT_MAX when slicing forward and INT_MIN when slicing backwards. Negative strides are not supported for dynamic dimensions.

  • Mandatory: true

  • Data type: QNN_DATATYPE_INT_32

  • Shape: [rank(in[0]) + new, 3] : [[begin_0, end_0, stride_0], …, [begin_{n+new-1}, end_{n+new-1}, stride_{n+new-1}]], where new is the number of new axes that will be inserted.

  • Constraints:

    • Value: Stride must be nonzero

    • Value: Stride can only be negative if begin > end

    • Value: Negative strides are not supported for dynamic dimensions

begin_mask

A bit mask corresponding to the begin axes of ranges that indicates whether certain axes of the intermediate tensor are to be retained or ignored during the slicing. If the ith bit of begin_mask is set, ranges[i][0] will be ignored and the fullest possible range in that dimension is used instead.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

end_mask

A bit mask corresponding to the end axes of ranges that indicates whether certain axes of the intermediate tensor are to be retained or ignored during the slicing. If the ith bit of end_mask is set, ranges[i][1] will be ignored and the fullest possible range in that dimension is used instead.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

shrink_axes

A bit mask corresponding to the axes of the intermediate tensor to omit in the output shape. If the ith bit of new_axes_mask is also set it overrides any shrink_axes setting at the same bit. All bits set beyond the shape of the intermediate tensor will be ignored. Note that the begin range provided for any axis that is omitted is still used with respect to the other axes. E.g., given a 3-D shape [10, 20, 30], with only the 0th axis omitted by shrink_axes and the entire range used for the other 2 axes: if ranges[0][0] = 5, which is the begin value for the 0th axis, then the output would have the 2-D shape [20, 30], corresponding to the 6th slice of the 0th dimension.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: When the ith bit of shrink_axes is set and the ith bit of new_axes_mask is not set then the begin value at ranges[i][0] must be in the range [-shape(in[0])[i], shape(in[0])[i] - 1]

new_axes_mask

A bit mask to insert new axes at positions specified by the set bits. If the ith bit in new_axes_mask is set, a new length-1 dimension is inserted into the input tensor at this position and an intermediate tensor is created. ranges, end_mask, and begin_mask applied to positions where a new axis has been inserted are ignored, since any new axis will always be of size 1.

Notes:

  • The most significant bit of new_axes_mask is defined by the number of ranges provided by the user, e.g., given 8 sets of ranges the most significant bit would be the 7th bit.

  • From the 0th bit to the most significant bit in new_axes_mask there must be exactly rank(in[0]) unset bits.

  • All bits past the most significant bit will be ignored.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

Outputs

out[0]

Output data

Dimension at axis i is max(1, ceil( (ranges[i,1]-ranges[i,0])/ranges[i,2] ) ), except where omitted by shrink_axes, or ‘1’ where specified by new_axes_mask.

  • Mandatory: true

  • Data type: backend specific

  • Shape: rank is rank(in[0]) + new - shrink, where new is the number of new axes that will be inserted and shrink is the number of axes that will be omitted.

  • Dynamic Shape: All dimensions can be dynamic except for newly inserted dimensions determined by new_axes_mask.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic and either the corresponding bit of end_mask is set or there is no slicing being done at dimension i then the corresponding dimension of out[0] must be dynamic.

    • Dynamic Shape: For each dimension, if shape(in[0])[i] is dynamic and stride value at ranges[i][2] > 0 and begin value at ranges[i][0] >= 0 and either the end value provided at ranges[i][1] >= shape(in[0])[i] or is negative then the corresponding dimension of out[0] must be dynamic.
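The basic ranges semantics (without masks or new/shrink axes) map directly onto Python slicing, as in the equivalence at the top of this section. A minimal sketch, assuming a NumPy input and the hypothetical helper name `strided_slice`:

```python
import numpy as np

def strided_slice(x, ranges):
    """ranges is [[begin, end, stride], ...], one triple per axis.
    Masks, new_axes_mask, and shrink_axes are omitted for brevity."""
    slices = []
    for b, e, s in ranges:
        # With a negative stride, end == -1 means "one past index 0",
        # which Python slicing expresses as stop=None.
        stop = None if (s < 0 and e == -1) else e
        slices.append(slice(b, stop, s))
    return x[tuple(slices)]

x = np.arange(24).reshape(4, 6)
out = strided_slice(x, [[1, 3, 1], [5, -1, -2]])
print(out.shape)       # (2, 3): rows 1..2, columns 5, 3, 1
print(out[0].tolist()) # [11, 9, 7]
```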

Tanh

Computes the hyperbolic tangent function elementwise over an input feature map.

References:

Inputs

in[0]

Input feature map.

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Shape: Rank > 0

Parameters

None

Outputs

out[0]

Output feature map.

  • Mandatory: true

  • Data type: backend specific

  • Shape: Any

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Shape: Same shape as in[0]
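The elementwise behavior corresponds to NumPy's tanh; output shape and datatype match the input:

```python
import numpy as np

# Elementwise hyperbolic tangent over the feature map;
# the output has the same shape as the input.
x = np.array([[-1.0, 0.0, 1.0]])
print(np.tanh(x).shape)  # (1, 3)
```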

Tile

Creates a new tensor by tiling an input tensor. The input tensor is replicated along each dimension for as many times as specified by the multiples tensor.

shape(out[0])[i] = shape(in[0])[i] * multiples[i]

References:

Inputs

in[0]

Input tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

multiples

A 1D tensor of length N which contains the replication factor for each dimension.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [N], N = rank(in[0])

Outputs

out[0]

The tiled output tensor

shape(out[0])[i] = shape(in[0])[i] * multiples[i]

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Datatype: Same datatype as in[0]
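The tiling rule above matches NumPy's np.tile, shown here as a quick illustration:

```python
import numpy as np

# shape(out[0])[i] = shape(in[0])[i] * multiples[i]
x = np.array([[1, 2], [3, 4]])   # shape (2, 2)
out = np.tile(x, (2, 3))         # multiples = [2, 3]
print(out.shape)                 # (4, 6)
```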

TopK

Find the K largest or smallest values along the last dimension at each position in a tensor, and return those values sorted, along with their indices in the input tensor.

References:

Inputs

in[0]

Input tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

k

The number of elements to extract from the input tensor at each position. Must be <= shape(in[0])[rank(in[0])-1].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

largest

Whether to return the top-K largest or smallest elements. If true, the largest elements are returned. Otherwise, the smallest elements are returned.

  • Mandatory: false

  • Data type: QNN_DATATYPE_BOOL_8

  • Shape: scalar

  • Default: 1

Outputs

out[0]

Sorted largest or smallest values of input tensor at each position.

  • Mandatory: true

  • Data type: backend specific

  • Shape: same as in[0], except the last dimension which is k

  • Constraints:

    • Datatype: Same datatype as in[0]

out[1]

Index values of elements of out[0] in in[0]. If two elements of out[0] in the same position have the same value, the one with the larger index will appear first.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32, QNN_DATATYPE_INT_32

  • Shape: same as in[0], except the last dimension which is k

  • Constraints:

    • Shape: Same shape as out[0]
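The values/indices outputs can be sketched with NumPy. This is an illustrative sketch only (the `topk` helper name is hypothetical, and tie ordering here may differ from a given backend):

```python
import numpy as np

def topk(x, k, largest=True):
    """Top-k over the last dimension: sorted values plus their input indices."""
    order = np.argsort(-x if largest else x, axis=-1, kind="stable")
    idx = order[..., :k]
    vals = np.take_along_axis(x, idx, axis=-1)
    return vals, idx

vals, idx = topk(np.array([[3, 1, 4, 1, 5]]), k=2)
print(vals.tolist(), idx.tolist())  # [[5, 4]] [[4, 2]]
```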

Transpose

Transposes the input tensor to produce an output with the same data but axes permuted according to the perm tensor.

Refer to Transpose backend definition for support of dynamic dimensions for each backend. Backends do not support dynamic dimensions unless stated otherwise.

shape(out[0])[i] = shape(in[0])[perm[i]]

References:

Inputs

in[0]

Input tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Shape: Rank > 0

Parameters

perm

The permutations of the dimensions of the input tensor. The tensor values should be in range [0,N-1] and each dimension must be listed only once.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [N], N = rank(in[0])

Outputs

out[0]

The permuted output tensor

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Dynamic Shape: All dimensions can be dynamic.

  • Constraints:

    • Datatype: Same datatype as in[0]

    • Dynamic Shape: For each dimension, if shape(in[0])[perm[i]] is dynamic, then shape(out[0])[i] must be dynamic.
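The permutation rule above matches NumPy's np.transpose, shown here as a quick illustration:

```python
import numpy as np

# shape(out[0])[i] = shape(in[0])[perm[i]]
x = np.zeros((2, 3, 4))
out = np.transpose(x, (2, 0, 1))  # perm = [2, 0, 1]
print(out.shape)                  # (4, 2, 3)
```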

TransposeConv1d

Performs the transpose 1D convolution operation. This operation is also known as “Deconvolution”. Application of the filter moves according to the specified stride. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported.

For regular transpose convolution, group is 1. Group field greater than 1 implies a grouped transpose convolution where a group of different filters is applied to each output channel group and the result is concatenated together.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, width, channel_in]

  • Constraints:

    • Shape: channel_in must be evenly divisible by group

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_width, channel_in, channel_out/group]

  • Constraints:

    • Shape: channel_out must be evenly divisible by group

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Default: {0,..,0}

Parameters

stride

Defines stride for 1D spatial (i.e. width) axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: Must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 1D spatial (i.e. width) axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [width_pad_before, width_pad_after]

group

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

output_padding

Controls the additional size added to the 1D spatial axis (i.e. width) of the output shape. Note that output_padding is only used to compute the output shape; it does not add zero-padding to the output.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 0

  • Constraints:

    • Value: Must be < stride

Outputs

out[0]

The output 1D spatial dimension is a function of the filters, stride, and pad_amount.

width_out = floor(stride * (shape(in[0])[width] - 1) + shape(in[1])[width] - pad_amount[0] - pad_amount[1] + output_padding)

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, width_out, channel_out]

  • Constraints:

    • Datatype: Same datatype as in[0]
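The width_out formula above can be evaluated with a small helper (the helper name is hypothetical, for illustration only):

```python
def transpose_conv1d_out_width(width_in, filter_width, stride,
                               pad_amount, output_padding=0):
    # width_out = stride*(width_in - 1) + filter_width
    #             - pad_before - pad_after + output_padding
    return (stride * (width_in - 1) + filter_width
            - pad_amount[0] - pad_amount[1] + output_padding)

print(transpose_conv1d_out_width(8, 3, stride=2, pad_amount=(1, 1)))  # 15
```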

TransposeConv2d

Performs the transpose 2D convolution operation. This operation is also known as “Deconvolution”. Application of the filter moves according to the specified strides. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported.

For regular transpose convolution, group is 1. Group field greater than 1 implies a grouped transpose convolution where a group of different filters is applied to each output channel group and the result is concatenated together. Note that channel_out and channel_in must be evenly divisible by group.

Refer to TransposeConv2d backend definition for supported data type and layouts for each backend.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height, width, channel_in]

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_height, filter_width, channel_in, channel_out/group]

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Default: [0]

Parameters

stride

Defines stride for 2D spatial (i.e. height and width) axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] : [height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 2D spatial (i.e. height and width) axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2,2] with format [[height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

group

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

output_padding

Controls the additional size added to the 2D spatial axes (i.e. height and width) of the output shape. Note that output_padding is only used to compute the output shape; it does not add zero-padding to the output.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [2] with format [height_output_padding, width_output_padding]

  • Default: [0, 0]

  • Constraints:

    • Value: must be < corresponding strides dimension

Outputs

out[0]

The output 2D spatial dimensions are functions of the filters, stride, and pad_amount.

height_out = floor(stride[0] * (shape(in[0])[height] - 1) + shape(in[1])[height] - pad_amount[0,0] - pad_amount[0,1] + output_padding[0])
width_out = floor(stride[1] * (shape(in[0])[width] - 1) + shape(in[1])[width] - pad_amount[1,0] - pad_amount[1,1] + output_padding[1])

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, height_out, width_out, channel_out]

  • Constraints:

    • Datatype: Same datatype as in[0]
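The height_out/width_out formulas above can be evaluated with a small helper (the helper name is hypothetical, for illustration only):

```python
def transpose_conv2d_out_shape(in_hw, filter_hw, stride, pad_amount,
                               output_padding=(0, 0)):
    # out[i] = stride[i]*(in[i] - 1) + filter[i]
    #          - pad_before[i] - pad_after[i] + output_padding[i]
    return tuple(stride[i] * (in_hw[i] - 1) + filter_hw[i]
                 - pad_amount[i][0] - pad_amount[i][1] + output_padding[i]
                 for i in range(2))

print(transpose_conv2d_out_shape((8, 8), (3, 3), (2, 2),
                                 ((1, 1), (1, 1))))  # (15, 15)
```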

TransposeConv3d

Performs the transpose 3D convolution operation. This operation is also known as “Deconvolution”. Application of the filter moves according to the specified strides. For backends supporting quantized data types, clients can pass filters which are either quantized per-tensor or per-axis with possible constraints on the axis value that is supported.

For regular transpose convolution, group is 1. Group field greater than 1 implies a grouped transpose convolution where a group of different filters is applied to each output channel group and the result is concatenated together. Note that channel_out and channel_in must be evenly divisible by group.

Refer to TransposeConv3d backend definition for supported data type and layouts for each backend.

References:

Inputs

in[0]

input activation

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth, height, width, channel_in]

in[1]

filters

  • Mandatory: true

  • Data type: backend specific

  • Shape: [filter_depth, filter_height, filter_width, channel_in, channel_out/group]

in[2]

biases

  • Mandatory: false

  • Data type: backend specific

  • Shape: [channel_out]

  • Default: [0]

Parameters

stride

Defines stride for 3D spatial (i.e. depth, height and width) axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [depth_stride, height_stride, width_stride]

  • Constraints:

    • Value: Strides must be > 0

pad_amount

Pad amount to be added to the beginning and end part of 3D spatial (i.e. depth, height, and width) axes of in[0].

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3,2] with format [[depth_pad_before, depth_pad_after], [height_pad_before, height_pad_after], [width_pad_before, width_pad_after]]

dilation

Dilation value along each spatial axis (i.e. depth, height, and width) of the filter.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [depth_dilation, height_dilation, width_dilation]

  • Default: [1, 1, 1]

  • Constraints:

    • Value: Dilations must be > 0

group

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Default: 1

output_padding

Controls the additional size added to the 3D spatial axes (i.e. depth, height, and width) of the output shape. Note that output_padding is only used to compute the output shape; it does not add zero-padding to the output.

  • Mandatory: false

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: [3] : [depth_output_padding, height_output_padding, width_output_padding]

  • Default: [0, 0, 0]

  • Constraints:

    • Value: must be < corresponding strides dimension

Outputs

out[0]

The output 3D spatial dimensions are functions of the filters, stride, and pad_amount.

dilated_filter_depth = (shape(in[1])[depth] - 1) * dilation[0] + 1
dilated_filter_height = (shape(in[1])[height] - 1) * dilation[1] + 1
dilated_filter_width = (shape(in[1])[width] - 1) * dilation[2] + 1
depth_out = floor(stride[0] * (shape(in[0])[depth] - 1) + dilated_filter_depth - pad_amount[0,0] - pad_amount[0,1] + output_padding[0])
height_out = floor(stride[1] * (shape(in[0])[height] - 1) + dilated_filter_height - pad_amount[1,0] - pad_amount[1,1] + output_padding[1])
width_out = floor(stride[2] * (shape(in[0])[width] - 1) + dilated_filter_width - pad_amount[2,0] - pad_amount[2,1] + output_padding[2])

  • Mandatory: true

  • Data type: backend specific

  • Shape: [batch, depth_out, height_out, width_out, channel_out]

  • Constraints:

    • Datatype: Same datatype as in[0]
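The output-shape formulas above, including the dilated filter size, can be evaluated with a small helper (the helper name is hypothetical, for illustration only):

```python
def transpose_conv3d_out_shape(in_dhw, filter_dhw, stride, pad_amount,
                               dilation=(1, 1, 1), output_padding=(0, 0, 0)):
    out = []
    for i in range(3):
        # dilated_filter = (filter - 1) * dilation + 1
        dilated = (filter_dhw[i] - 1) * dilation[i] + 1
        # out = stride*(in - 1) + dilated_filter
        #       - pad_before - pad_after + output_padding
        out.append(stride[i] * (in_dhw[i] - 1) + dilated
                   - pad_amount[i][0] - pad_amount[i][1] + output_padding[i])
    return tuple(out)

print(transpose_conv3d_out_shape((4, 8, 8), (3, 3, 3), (2, 2, 2),
                                 ((1, 1), (1, 1), (1, 1)),
                                 dilation=(2, 2, 2)))  # (9, 17, 17)
```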

UnPack

Unpacks the input tensor along a given axis into shape(in[0])[axis] tensors, each with rank one lower than in[0], by chipping it along the axis dimension.

References:

Inputs

in[0]

Input tensor.

  • Mandatory: true

  • Data type: backend specific

  • Shape: a tensor of rank N

  • Constraints:

    • Shape: Rank > 0

Parameters

axis

Specifies axis to unpack on.

  • Mandatory: true

  • Data type: QNN_DATATYPE_UINT_32

  • Shape: scalar

  • Constraints:

    • Value: must be in range [0, N-1]

Outputs

out[0..m]

Resulting m = shape(in[0])[axis] output data tensors.

This tensor is repeated, meaning the same definition can apply to multiple tensors.

  • Mandatory: true

  • Data type: backend specific

  • Shape: rank(out[0..m]) = rank(in[0]) - 1

  • Constraints:

    • Datatype: Same datatype as in[0]
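The unpack behavior can be sketched with NumPy's split plus a squeeze of the chipped axis. An illustrative sketch only:

```python
import numpy as np

# Unpack a (2, 3) tensor along axis 1 into 3 rank-1 tensors of shape (2,).
x = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
axis = 1
outs = [np.squeeze(t, axis=axis)
        for t in np.split(x, x.shape[axis], axis=axis)]
print([o.tolist() for o in outs])  # [[0, 3], [1, 4], [2, 5]]
```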