Avoid Low-Depth Activations (Network Examples)

Below are a few examples in which the original graph can be modified to achieve better performance by fusing Conv ops to achieve higher channel depth:

  1. The following network has 2 inputs, each feeding a separate Conv op

../../_static/resources/htp_guidelines_fig_19.png

Figure 1

The Conv on the left has a 1x1 filter and the Conv on the right has a 2x2 filter with a 2x2 stride.

The Conv with the 2x2 filter and 2x2 stride can be converted into a Conv with a 1x1 filter by applying a 2x2 S2D (space-to-depth) transformation to its input. Once this is done, the two Convs can be merged into a single convolution whose number of output channels equals the sum of the output channel counts of the original convolutions.
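The rewrite above can be sketched in NumPy. This is a minimal illustration, not QNN API code: the input shapes, channel counts, and helper functions (`space_to_depth`, `conv2d`) are all assumptions chosen so that a 2x2-filter/2x2-stride Conv becomes a 1x1 Conv after S2D, and so that the two branches can then be merged into one 1x1 Conv with a block-diagonal weight.

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange each block x block spatial tile into channels.
    (H, W, C) -> (H/block, W/block, block*block*C)."""
    h, w, c = x.shape
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(h // block, w // block, block * block * c)

def conv2d(x, w, stride=1):
    """Naive 'valid' convolution. x: (H, W, Cin), w: (kh, kw, Cin, Cout)."""
    kh, kw, _, cout = w.shape
    h, wd, _ = x.shape
    oh = (h - kh) // stride + 1
    ow = (wd - kw) // stride + 1
    out = np.zeros((oh, ow, cout))
    for i in range(oh):
        for j in range(ow):
            patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)

# right branch (assumed shapes): 2x2 filter, 2x2 stride on an 8x8x3 input
x_right = rng.standard_normal((8, 8, 3))
w_right = rng.standard_normal((2, 2, 3, 5))
ref_right = conv2d(x_right, w_right, stride=2)            # (4, 4, 5)

# same result as a 1x1 Conv on the space-to-depth input:
# the (2, 2, 3) filter taps flatten into 12 input channels
w_right_1x1 = w_right.reshape(1, 1, 2 * 2 * 3, 5)
out_right = conv2d(space_to_depth(x_right, 2), w_right_1x1)
assert np.allclose(ref_right, out_right)

# left branch (assumed shapes): 1x1 Conv whose output already has
# the matching 4x4 spatial size
x_left = rng.standard_normal((4, 4, 6))
w_left = rng.standard_normal((1, 1, 6, 4))
ref_left = conv2d(x_left, w_left)                         # (4, 4, 4)

# merge: concatenate inputs along channels and use a block-diagonal
# weight so the branches stay independent; output channels = 4 + 5
x_cat = np.concatenate([x_left, space_to_depth(x_right, 2)], axis=-1)
w_cat = np.zeros((1, 1, 6 + 12, 4 + 5))
w_cat[:, :, :6, :4] = w_left
w_cat[:, :, 6:, 4:] = w_right_1x1
out_cat = conv2d(x_cat, w_cat)
assert np.allclose(out_cat[..., :4], ref_left)
assert np.allclose(out_cat[..., 4:], ref_right)
```

The key point is that the S2D-plus-1x1 form computes exactly the same values as the original strided Conv; the merged Conv then has 9 output channels instead of two shallow branches of 4 and 5.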

The modified network is shown below

../../_static/resources/htp_guidelines_fig_20.png

Figure 2

It is best to modify the graph before training; the change does not affect training or inference results, while achieving the best possible performance on QNN HTP.

  2. Here is another example from a network where the output of the Conv at the top fans out into a large number of branches

../../_static/resources/htp_guidelines_fig_21.png

Figure 3

In this example the outputs of the last set of Convs have very low channel depth, which is inefficient. Additionally, many of these Conv ops share the same activation function.

In this type of network, the Convs can be grouped by activation function (ReLU, Softmax, etc.). The Convs within each group can then be concatenated into a single Conv with more output channels.
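A small NumPy sketch of this grouping, under assumed shapes: three low-depth 1x1 Convs fan out from a shared input and all use ReLU, so their weights can be concatenated along the output-channel axis into one deeper Conv. (A 1x1 Conv is just a matmul over the channel dimension; the branch count and channel sizes here are illustrative, not taken from the figure.)

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0)

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 4, 8))          # shared input feature map

# three low-depth 1x1 Convs fanning out from the same input, all using ReLU
ws = [rng.standard_normal((8, 2)) for _ in range(3)]
branch_outs = [relu(x @ w) for w in ws]     # each (4, 4, 2)
ref = np.concatenate(branch_outs, axis=-1)  # (4, 4, 6)

# equivalent single Conv: concatenate the weights along the output
# channels, then apply the shared activation once; ReLU is elementwise,
# so it commutes with channel concatenation
w_fused = np.concatenate(ws, axis=-1)       # (8, 6)
fused = relu(x @ w_fused)
assert np.allclose(ref, fused)
```

Because any elementwise activation commutes with concatenation along channels, this rewrite is exact for each activation group; branches with different activations simply form separate fused Convs.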

The modified network is shown below

../../_static/resources/htp_guidelines_fig_22.png

Figure 4

The output ‘Out_0 and Out_7’ is the combined (concatenated) output of the two branches containing the Softmax activation.