Block Ops ONNX Usage

This section describes how to use Block Ops in models written in source frameworks such as ONNX.

Block ops use the special domain name qti_aisw.


Buffer Usage

This section describes how to add a Buffer op in the original ONNX model.

Most models are not written natively in ONNX, so some effort is required to bring a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, can do this. These examples use ONNX Script.

Export ONNX model to ONNX Script

import onnx
from onnxscript.backend import onnx_export

model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
    f.write(model_code)

This produces an ONNX Script representation of the model.

Original model - ONNX Script code

import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15

@script()
def MyModel(data: FLOAT['b','x','y','d']) -> (FLOAT['b','x','y','d']):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)
    return_val = opset15.Sub(oneInt64, data_0)
    return return_val

Add a Block Op (Buffer)

Using the QNN Buffer block op requires constructing a QnnOnnBlockOp object that uses the same ONNX opset version as the original model; here, opset 15 is used. Additionally, we specify an opset version for QNN's block ops (currently, only version 1 is accepted).

from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.buffer import Buffer

QnnBufferBlockOp = Buffer(onnx_opset_version=15, aisw_opset_version=1)
QnnBuffer = QnnBufferBlockOp.getOnnxScriptFunc()

# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT['b','x','y','d']) -> (FLOAT['b','x','y','d']):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)
    return_val = opset15.Sub(oneInt64, data_0)

    # Add an additional buffer op here
    buffer_result = QnnBuffer(return_val, opset15.Constant(value_int=0), 4, 1)

    return buffer_result

This model can be exported to standard ONNX.

Export and save

model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')

The resulting my_model.onnx file can be run natively in ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.


MaskedSoftmax Usage

This section describes how to write and replace a subgraph with an equivalent block op in ONNX.

Most models are not written natively in ONNX, so some effort is required to bring a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, can do this. These examples use ONNX Script.

Export ONNX model to ONNX Script

import onnx
from onnxscript.backend import onnx_export

model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
    f.write(model_code)

This produces an ONNX Script representation of the model.

Original model - ONNX Script code

import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15

@script()
def MyModel(data: FLOAT['b','x','y','d'], mask: INT64['b','x']) -> (FLOAT['b','x','y','d']):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)
    masked_data = opset15.Add(data_0, mask)
    softmax_result = opset15.Softmax(masked_data)
    return_val = opset15.Sub(oneInt64, softmax_result)
    return return_val

Replace with Block Op (MaskedSoftmax)

Using the QNN MaskedSoftmax block op requires constructing a QnnOnnBlockOp object that uses the same ONNX opset version as the original model; here, opset 15 is used. Additionally, we specify an opset version for QNN's block ops (currently, only version 1 is accepted).

from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.masked_softmax import MaskedSoftmax

QnnMaskedSoftmaxBlockOp = MaskedSoftmax(onnx_opset_version=15, aisw_opset_version=1)
QnnMaskedSoftmax = QnnMaskedSoftmaxBlockOp.getOnnxScriptFunc()

# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT['b','x','y','d'], mask: INT64['b','x']) -> (FLOAT['b','x','y','d']):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)

    # Comment out masking and softmax
    # masked_data = opset15.Add(data_0, mask)
    # softmax_result = opset15.Softmax(masked_data)

    # Replace with call to QNN Block Op ONNX function
    softmax_result = QnnMaskedSoftmax(data_0, mask, mode=0)

    return_val = opset15.Sub(oneInt64, softmax_result)
    return return_val
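For reference, the subgraph being replaced is an additive mask followed by a softmax. A NumPy sketch of that reference computation (softmax over the last axis, the ONNX default since opset 13; whether mode=0 corresponds exactly to this additive-mask behaviour is an assumption here):

```python
import numpy as np

def masked_softmax_reference(data, mask):
    # Additive mask, then softmax over the last axis
    x = data + mask
    x = x - x.max(axis=-1, keepdims=True)   # for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

data = np.random.rand(2, 3, 4).astype(np.float32)
mask = np.zeros((2, 3, 4), dtype=np.float32)
mask[..., -1] = -1e9                        # mask out the last position
out = masked_softmax_reference(data, mask)
# each row sums to 1; the masked position gets (near-)zero probability
```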

This model can be exported to standard ONNX.

Export and save

model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')

The resulting my_model.onnx file can be run natively in ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.


StatefulGru Usage

This section describes how to write and replace a standard GRU op with a StatefulGru op in ONNX. When the reset value is False, StatefulGru behaves identically to a standard GRU. When reset is True, the internal hidden state is reset to the input initial_h value at every time step of the GRU layer.
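The reset semantics can be sketched with a simplified recurrent loop. The cell function below is a stand-in (a single tanh update, not the full GRU equations); the point is only how the hidden state is handled between time steps:

```python
import numpy as np

def cell(x, h, W, R):
    # Simplified stand-in for the full GRU cell equations
    return np.tanh(x @ W + h @ R)

def stateful_gru_sketch(xs, initial_h, W, R, reset):
    h = initial_h
    outputs = []
    for x in xs:
        if reset:
            # reset=True: the state is re-initialised from
            # initial_h at every time step
            h = initial_h
        h = cell(x, h, W, R)
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(0)
xs = rng.normal(size=(4, 1, 16))   # 4 time steps, batch 1, 16 features
W = rng.normal(size=(16, 4))
R = rng.normal(size=(4, 4))
h0 = np.zeros((1, 4))
out_reset = stateful_gru_sketch(xs, h0, W, R, reset=True)
# with reset=True, every step sees the same initial state h0
```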

Most models are not written natively in ONNX, so some effort is required to bring a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, can do this. These examples use ONNX Script.

Export ONNX model to ONNX Script

import onnx
from onnxscript.backend import onnx_export

model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
    f.write(model_code)

This produces an ONNX Script representation of the model.

Original model - ONNX Script code

import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15

@script()
def MyModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4]) -> (FLOAT[4, 1, 1, 4]):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)
    sub_val = opset15.Sub(oneInt64, data_0)
    return_val = opset15.GRU(sub_val,
                             W,
                             R,
                             B,
                             4,
                             h)
    return return_val

Add a StatefulGru Op

Using the QNN StatefulGru op requires constructing a QnnOnnBlockOp object that uses the same ONNX opset version as the original model; here, opset 15 is used. Additionally, we specify an opset version for QNN's block ops (currently, only version 1 is accepted).

from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.stateful_gru import StatefulGru

QnnStatefulGruBlockOp = StatefulGru(onnx_opset_version=15, aisw_opset_version=1)
QnnStatefulGru = QnnStatefulGruBlockOp.getOnnxScriptFunc()

# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4]) -> (FLOAT[4, 1, 1, 4]):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)
    sub_val = opset15.Sub(oneInt64, data_0)

    reset = opset15.Constant(value=True)
    gru_result = QnnStatefulGru(sub_val,
                                W,
                                R,
                                4,
                                B,
                                4,
                                h,
                                reset,
                                1.0)
    return gru_result

This model can be exported to standard ONNX.

Export and save

model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')

The resulting my_model.onnx file can be run natively in ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.

This section describes how to replace existing GRU ops with StatefulGru ops in ONNX using a helper function.

Replace GRU ops with StatefulGru ops

import onnx
from qti.aisw.converters.block_ops.onnx.stateful_gru import replaceAllOnnxGruWithBlockOp

model = onnx.load('model.onnx')
new_model = replaceAllOnnxGruWithBlockOp(model)
onnx.save(new_model, 'model_updated.onnx')

Conversion for models including StatefulGru ops

Currently, both the qnn-onnx-converter and qairt-converter commands can be used to convert a serialized ONNX model to an equivalent QNN representation. Note: the --target_backend LPAI option is required for these ops to work. The --multi_time_steps_lstm option will be disabled later.

Example

qnn-onnx-converter --input_network model.onnx --output_path model.cpp --target_backend LPAI
qairt-converter --input_network model.onnx --output_path model.dlc --target_backend LPAI

StatefulLstm Usage

This section describes how to write and replace a standard LSTM op with a StatefulLstm op in ONNX. When the reset value is False, StatefulLstm behaves identically to a standard LSTM. When reset is True, the internal hidden and cell states are reset to the input initial_h and initial_c values at every time step of the LSTM layer.
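As with StatefulGru, the reset semantics can be sketched with a simplified loop; here both states are re-initialised. The cell function is a stand-in (not the full LSTM equations), used only to illustrate the state handling:

```python
import numpy as np

def lstm_cell(x, h, c, W, R):
    # Simplified stand-in for the full LSTM cell equations
    c_next = c + np.tanh(x @ W + h @ R)
    h_next = np.tanh(c_next)
    return h_next, c_next

def stateful_lstm_sketch(xs, initial_h, initial_c, W, R, reset):
    h, c = initial_h, initial_c
    outputs = []
    for x in xs:
        if reset:
            # reset=True: both the hidden state and the cell state are
            # re-initialised from initial_h / initial_c every time step
            h, c = initial_h, initial_c
        h, c = lstm_cell(x, h, c, W, R)
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(1)
xs = rng.normal(size=(4, 1, 16))   # 4 time steps, batch 1, 16 features
W = rng.normal(size=(16, 4))
R = rng.normal(size=(4, 4))
h0 = np.zeros((1, 4))
c0 = np.zeros((1, 4))
out_reset = stateful_lstm_sketch(xs, h0, c0, W, R, reset=True)
# with reset=True, every step starts from the same (h0, c0)
```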

Most models are not written natively in ONNX, so some effort is required to bring a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, can do this. These examples use ONNX Script.

Export ONNX model to ONNX Script

import onnx
from onnxscript.backend import onnx_export

model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
    f.write(model_code)

This produces an ONNX Script representation of the model.

Original model - ONNX Script code

import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15

@script()
def MyModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4], P:FLOAT[1, 12]) -> (FLOAT[4, 1, 1, 4]):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)
    sub_val = opset15.Sub(oneInt64, data_0)
    return_val = opset15.LSTM(sub_val,
                              W,
                              R,
                              B,
                              4,
                              h,
                              h,
                              P)
    return return_val

Add a StatefulLstm Op

Using the QNN StatefulLstm op requires constructing a QnnOnnBlockOp object that uses the same ONNX opset version as the original model; here, opset 15 is used. Additionally, we specify an opset version for QNN's block ops (currently, only version 1 is accepted).

from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.stateful_lstm import StatefulLstm

QnnStatefulLstmBlockOp = StatefulLstm(onnx_opset_version=15, aisw_opset_version=1)
QnnStatefulLstm = QnnStatefulLstmBlockOp.getOnnxScriptFunc()

# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4], P:FLOAT[1, 12]) -> (FLOAT[4, 1, 1, 4]):
    oneInt64 = opset15.Constant(value_int=1)
    data_0 = opset15.Add(data, oneInt64)
    sub_val = opset15.Sub(oneInt64, data_0)

    reset = opset15.Constant(value=True)
    lstm_result = QnnStatefulLstm(sub_val,
                                  W,
                                  R,
                                  4,
                                  B,
                                  4,
                                  h,
                                  h,
                                  P,
                                  reset,
                                  1.0)
    return lstm_result

This model can be exported to standard ONNX.

Export and save

model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')

The resulting my_model.onnx file can be run natively in ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.

This section describes how to replace existing LSTM ops with StatefulLstm ops in ONNX using a helper function.

Replace LSTM ops with StatefulLstm ops

import onnx
from qti.aisw.converters.block_ops.onnx.stateful_lstm import replaceAllOnnxLstmWithBlockOp

model = onnx.load('model.onnx')
new_model = replaceAllOnnxLstmWithBlockOp(model)
onnx.save(new_model, 'model_updated.onnx')

Conversion for models including StatefulLstm ops

Currently, both the qnn-onnx-converter and qairt-converter commands can be used to convert a serialized ONNX model to an equivalent QNN representation. Note: the --target_backend LPAI option is required for these ops to work. The --multi_time_steps_lstm option will be disabled later.

Example

qnn-onnx-converter --input_network model.onnx --output_path model.cpp --target_backend LPAI
qairt-converter --input_network model.onnx --output_path model.dlc --target_backend LPAI