Block Ops ONNX Usage¶
This section describes how to use Block Ops in models written in source frameworks such as ONNX.
Block ops use the special domain name qti_aisw.
Buffer Usage¶
This section describes how to add a Buffer op in the original ONNX model.
Most models are not written natively in ONNX, so some effort is required to get a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, exist for this purpose. These examples use ONNX Script.
Export ONNX model to ONNX Script¶
import onnx
from onnxscript.backend import onnx_export
model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
f.write(model_code)
This produces an ONNX Script representation of the model.
Original model - ONNX Script code¶
import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15
@script()
def MyModel(data: FLOAT['b','x','y','d']) -> (FLOAT['b','x','y','d']):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
return_val = opset15.Sub(oneInt64, data_0)
return return_val
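For orientation, this toy graph computes 1 - (data + 1), i.e. the elementwise negation of data. A quick pure-Python trace of that reference behavior:

```python
# Pure-Python trace of the toy graph: Sub(1, Add(data, 1)) == 1 - (data + 1) == -data.
def my_model_reference(data):
    one = 1
    data_0 = [d + one for d in data]   # Add(data, 1)
    return [one - d for d in data_0]   # Sub(1, data_0)

out = my_model_reference([0.5, -2.0, 3.0])
# out is the elementwise negation of the input: [-0.5, 2.0, -3.0]
```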
Add a Block Op (Buffer)¶
Using the QNN block op (Buffer) requires constructing a block op object that uses the same ONNX opset version as the original model; opset 15 is used here. Additionally, we specify an opset version for QNN’s block ops (currently only version 1 is accepted).
from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.buffer import Buffer
QnnBufferBlockOp = Buffer(onnx_opset_version=15, aisw_opset_version=1)
QnnBuffer = QnnBufferBlockOp.getOnnxScriptFunc()
# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT['b','x','y','d']) -> (FLOAT['b','x','y','d']):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
return_val = opset15.Sub(oneInt64, data_0)
# Add an additional buffer op here
buffer_result = QnnBuffer(return_val, opset15.Constant(value_int=0), 4, 1)
return buffer_result
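Conceptually, a buffer op accumulates its input over time, holding the most recent frames. The sketch below is an illustrative pure-Python analogy only; the buffer depth of 4 mirrors the constant in the call above, but the exact semantics and argument meanings are defined by the QNN Buffer op definition, not by this sketch.

```python
from collections import deque

def rolling_buffer(frames, depth=4):
    """Conceptual rolling buffer (assumed semantics, for intuition only).

    Keeps the most recent `depth` frames, padded with zeros until full.
    """
    buf = deque([0.0] * depth, maxlen=depth)
    snapshots = []
    for f in frames:
        buf.append(f)                  # newest frame pushes out the oldest
        snapshots.append(list(buf))
    return snapshots

snaps = rolling_buffer([1.0, 2.0, 3.0], depth=4)
# snaps[-1] holds the last 4 frames, oldest first: [0.0, 1.0, 2.0, 3.0]
```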
This model can be exported to standard ONNX.
Export and save¶
model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')
The resulting my_model.onnx file can be run natively on ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.
MaskedSoftmax Usage¶
This section describes how to write and replace a subgraph with an equivalent block op in ONNX.
Most models are not written natively in ONNX, so some effort is required to get a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, exist for this purpose. These examples use ONNX Script.
Export ONNX model to ONNX Script¶
import onnx
from onnxscript.backend import onnx_export
model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
f.write(model_code)
This produces an ONNX Script representation of the model.
Original model - ONNX Script code¶
import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15
@script()
def MyModel(data: FLOAT['b','x','y','d'], mask: INT64['b','x']) -> (FLOAT['b','x','y','d']):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
masked_data = opset15.Add(data_0, mask)
softmax_result = opset15.Softmax(masked_data)
return_val = opset15.Sub(oneInt64, softmax_result)
return return_val
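The masking-plus-softmax subgraph above computes softmax(data + mask) along the last axis, so positions where the mask holds a large negative value are effectively excluded. A minimal pure-Python reference for a single row (illustrative only; the exact numerics follow the ONNX Softmax definition):

```python
import math

def masked_softmax(data, mask):
    """Reference softmax(data + mask) over a single 1-D row.

    Mask entries of 0 leave scores unchanged; large negative entries
    (e.g. -1e9) drive those positions to (near-)zero probability.
    """
    scores = [d + m for d, m in zip(data, mask)]
    peak = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

row = masked_softmax([1.0, 2.0, 3.0], [0.0, 0.0, -1e9])
# The masked position gets (near-)zero probability; the rest renormalize to sum to 1.
```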
Replace with Block Op (MaskedSoftmax)¶
Using the QNN block op (MaskedSoftmax) requires constructing a block op object that uses the same ONNX opset version as the original model; opset 15 is used here. Additionally, we specify an opset version for QNN’s block ops (currently only version 1 is accepted).
from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.masked_softmax import MaskedSoftmax
QnnMaskedSoftmaxBlockOp = MaskedSoftmax(onnx_opset_version=15, aisw_opset_version=1)
QnnMaskedSoftmax = QnnMaskedSoftmaxBlockOp.getOnnxScriptFunc()
# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT['b','x','y','d'], mask: INT64['b','x']) -> (FLOAT['b','x','y','d']):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
# Comment out masking and softmax
# masked_data = opset15.Add(data_0, mask)
# softmax_result = opset15.Softmax(masked_data)
# Replace with call to QNN Block Op ONNX function
softmax_result = QnnMaskedSoftmax(data_0, mask, mode=0)
return_val = opset15.Sub(oneInt64, softmax_result)
return return_val
This model can be exported to standard ONNX.
Export and save¶
model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')
The resulting my_model.onnx file can be run natively on ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.
StatefulGru Usage¶
This section describes how to replace a standard GRU op with a StatefulGru op in ONNX. When the ‘reset’ value is False, StatefulGru behaves the same as a standard GRU. When ‘reset’ is True, the internal initial_h state is reset to the input initial_h value at each time step of the GRU layer.
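The reset behavior can be pictured with a toy recurrent loop. Note that `step` below is a hypothetical stand-in for the real GRU cell update; only the state-handling logic is the point:

```python
def run_recurrent(inputs, initial_h, reset):
    """Toy loop showing how the reset flag changes state handling.

    step() is a hypothetical placeholder, not the actual GRU math.
    """
    def step(x, h):
        return 0.5 * h + x  # stand-in for the real gated update

    outputs = []
    h = initial_h
    for x in inputs:
        if reset:
            h = initial_h   # reset=True: state restarts from initial_h every step
        h = step(x, h)
        outputs.append(h)
    return outputs

# With reset=False, state accumulates across time steps.
stateful = run_recurrent([1.0, 1.0, 1.0], initial_h=0.0, reset=False)
# With reset=True, every step starts from initial_h, so outputs repeat.
reset_each = run_recurrent([1.0, 1.0, 1.0], initial_h=0.0, reset=True)
```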
Most models are not written natively in ONNX, so some effort is required to get a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, exist for this purpose. These examples use ONNX Script.
Export ONNX model to ONNX Script¶
import onnx
from onnxscript.backend import onnx_export
model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
f.write(model_code)
This produces an ONNX Script representation of the model.
Original model - ONNX Script code¶
import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15
@script()
def MyModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4]) -> (FLOAT[4, 1, 1, 4]):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
sub_val = opset15.Sub(oneInt64, data_0)
return_val = opset15.GRU(sub_val,
W,
R,
B,
4,
h)
return return_val
Add a StatefulGru Op¶
Using the QNN StatefulGru op requires constructing a block op object that uses the same ONNX opset version as the original model; opset 15 is used here. Additionally, we specify an opset version for QNN’s block ops (currently only version 1 is accepted).
from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.stateful_gru import StatefulGru
QnnStatefulGruBlockOp = StatefulGru(onnx_opset_version=15, aisw_opset_version=1)
QnnStatefulGru = QnnStatefulGruBlockOp.getOnnxScriptFunc()
# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4]) -> (FLOAT[4, 1, 1, 4]):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
sub_val = opset15.Sub(oneInt64, data_0)
reset = opset15.Constant(value=True)
gru_result = QnnStatefulGru(sub_val,
W,
R,
4,
B,
4,
h,
reset,
1.0)
return gru_result
This model can be exported to standard ONNX.
Export and save¶
model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')
The resulting my_model.onnx file can be run natively on ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.
This section describes how to replace all standard GRU ops in an ONNX model with StatefulGru ops.
Replace GRU ops with StatefulGru ops¶
import onnx
from qti.aisw.converters.block_ops.onnx.stateful_gru import replaceAllOnnxGruWithBlockOp
model = onnx.load('model.onnx')
new_model = replaceAllOnnxGruWithBlockOp(model)
onnx.save(new_model, 'model_updated.onnx')
Conversion for models including StatefulGru ops¶
Currently, either the qnn-onnx-converter or the qairt-converter command can be used to convert a serialized ONNX model to an equivalent QNN representation.
Note: The --target_backend LPAI option is required for these ops to work; the --multi_time_steps_lstm option will be disabled later.
Example¶
qnn-onnx-converter --input_network model.onnx --output_path model.cpp --target_backend LPAI
qairt-converter --input_network model.onnx --output_path model.dlc --target_backend LPAI
StatefulLstm Usage¶
This section describes how to replace a standard LSTM op with a StatefulLstm op in ONNX. When the ‘reset’ value is False, StatefulLstm behaves the same as a standard LSTM. When ‘reset’ is True, the internal initial_h and initial_c states are reset to the input initial_h and initial_c values at each time step of the LSTM layer.
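As with StatefulGru, the reset flag controls whether state carries across time steps, but here two internal states (h and c) are involved. A toy sketch of the state handling, where `step` is a hypothetical stand-in for the real LSTM cell update:

```python
def run_lstm_like(inputs, initial_h, initial_c, reset):
    """Toy loop: with reset=True both states restart from their inputs each step.

    step() is a hypothetical placeholder, not the actual LSTM math.
    """
    def step(x, h, c):
        c = 0.5 * c + x      # stand-in for the cell-state update
        h = 0.5 * (h + c)    # stand-in for the hidden-state update
        return h, c

    outputs = []
    h, c = initial_h, initial_c
    for x in inputs:
        if reset:
            h, c = initial_h, initial_c  # reset both internal states each step
        h, c = step(x, h, c)
        outputs.append(h)
    return outputs

# reset=False: h and c accumulate across steps.
carried = run_lstm_like([1.0, 1.0], initial_h=0.0, initial_c=0.0, reset=False)
# reset=True: each step restarts from initial_h/initial_c, so outputs repeat.
restarted = run_lstm_like([1.0, 1.0], initial_h=0.0, initial_c=0.0, reset=True)
```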
Most models are not written natively in ONNX, so some effort is required to get a model into a human-readable format. Various publicly available tools, such as ONNX Script and the ONNX Python API, exist for this purpose. These examples use ONNX Script.
Export ONNX model to ONNX Script¶
import onnx
from onnxscript.backend import onnx_export
model = onnx.load('my_model.onnx')
model_code = onnx_export.export2python(model)
with open('my_model.py', 'w') as f:
f.write(model_code)
This produces an ONNX Script representation of the model.
Original model - ONNX Script code¶
import numpy
from onnx import TensorProto
from onnx.helper import make_tensor
from onnxscript import script, external_tensor
from onnxscript.values import Opset
from onnxscript.onnx_types import FLOAT, INT64
from onnxscript.onnx_opset import opset15
@script()
def MyModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4], P:FLOAT[1, 12]) -> (FLOAT[4, 1, 1, 4]):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
sub_val = opset15.Sub(oneInt64, data_0)
return_val = opset15.LSTM(sub_val,
W,
R,
B,
4,
h,
h,
P)
return return_val
Add a StatefulLstm Op¶
Using the QNN StatefulLstm op requires constructing a block op object that uses the same ONNX opset version as the original model; opset 15 is used here. Additionally, we specify an opset version for QNN’s block ops (currently only version 1 is accepted).
from onnxscript.onnx_opset import opset15
from qti.aisw.converters.block_ops.onnx.stateful_lstm import StatefulLstm
QnnStatefulLstmBlockOp = StatefulLstm(onnx_opset_version=15, aisw_opset_version=1)
QnnStatefulLstm = QnnStatefulLstmBlockOp.getOnnxScriptFunc()
# This can then be used directly in the ONNX Script model definition:
@script()
def MyQnnModel(data: FLOAT[4, 1, 16], W: FLOAT[1, 12, 16], R: FLOAT[1, 12, 4], B: FLOAT[1, 24], h:FLOAT[1, 1, 4], P:FLOAT[1, 12]) -> (FLOAT[4, 1, 1, 4]):
oneInt64 = opset15.Constant(value_int=1)
data_0 = opset15.Add(data, oneInt64)
sub_val = opset15.Sub(oneInt64, data_0)
reset = opset15.Constant(value=True)
lstm_result = QnnStatefulLstm(sub_val,
W,
R,
4,
B,
4,
h,
h,
P,
reset,
1.0)
return lstm_result
This model can be exported to standard ONNX.
Export and save¶
model = MyQnnModel.to_model_proto()
import onnx
onnx.save_model(model, 'my_model.onnx')
The resulting my_model.onnx file can be run natively on ONNX Runtime or fed to the ONNX converter (see Onnx Conversion) as part of the QAIRT workflow.
This section describes how to replace all standard LSTM ops in an ONNX model with StatefulLstm ops.
Replace LSTM ops with StatefulLstm ops¶
import onnx
from qti.aisw.converters.block_ops.onnx.stateful_lstm import replaceAllOnnxLstmWithBlockOp
model = onnx.load('model.onnx')
new_model = replaceAllOnnxLstmWithBlockOp(model)
onnx.save(new_model, 'model_updated.onnx')
Conversion for models including StatefulLstm ops¶
Currently, either the qnn-onnx-converter or the qairt-converter command can be used to convert a serialized ONNX model to an equivalent QNN representation.
Note: The --target_backend LPAI option is required for these ops to work; the --multi_time_steps_lstm option will be disabled later.
Example¶
qnn-onnx-converter --input_network model.onnx --output_path model.cpp --target_backend LPAI
qairt-converter --input_network model.onnx --output_path model.dlc --target_backend LPAI