Tutorial - Preparing and Executing a Model with TFLite Delegate

The workflow for preparing and executing a model on a target device with the TFLite Delegate mirrors the QNN workflow, with three changes:

  1. In addition to the generic QNN backend library (e.g. libQnnHtp.so), we need to push the TFLite Delegate library (libQnnTFLiteDelegate.so).

  2. To run the model, we push and run qtld-net-run instead of qnn-net-run.

  3. For Android target devices, we also need to push the qtld-release.aar file.

These changes affect only the end of the QNN workflow: once you reach the point of transferring files to the target device, there are a few additional steps.

Step 1: Complete most of the CNN to QNN tutorial

  1. Follow most of the CNN to QNN tutorial, but stop when you reach a step that uses qnn-net-run. This will be one of the last commands in the tutorial, located in the sections named after the target backends (e.g. HTP, GPU, or DSP).

    1. The CNN to QNN tutorial will take you through:

      1. Installing the QAIRT SDK (which includes both QNN and TFLite Delegate files)

      2. Converting your model into a QNN format for your target device

      3. Moving over the proper files to the target device

      4. Running your model using qnn-net-run (which the steps below replace with qtld-net-run once you get there).

  2. Once you reach the qnn-net-run step, return to this page and follow Step 2 onwards.

Step 2: Running an inference with qtld-net-run

  1. Based on the target device’s OS, architecture, and (where applicable) GCC version, choose the corresponding paths to qtld-net-run and libQnnTFLiteDelegate.so from the table below. For the HTP backend example, set $QNN_TARGET_ARCH to aarch64-android and $HEXAGON_TARGET_ARCH to hexagon-v75.

    Target OS            Architecture  GCC Version  qtld-net-run & libQnnTFLiteDelegate.so Path
    -------------------  ------------  -----------  -----------------------------------------------------------------
    Ubuntu               aarch64       gcc9.4       ${QNN_SDK_ROOT}/bin/aarch64-ubuntu-gcc9.4/qtld-net-run
                                                    ${QNN_SDK_ROOT}/lib/aarch64-ubuntu-gcc9.4/libQnnTFLiteDelegate.so
    Android              aarch64       -            ${QNN_SDK_ROOT}/bin/aarch64-android/qtld-net-run
                                                    ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnTFLiteDelegate.so
    OpenEmbedded Linux   aarch64       gcc9.3       ${QNN_SDK_ROOT}/bin/aarch64-oe-linux-gcc9.3/qtld-net-run
                                                    ${QNN_SDK_ROOT}/lib/aarch64-oe-linux-gcc9.3/libQnnTFLiteDelegate.so
    OpenEmbedded Linux   aarch64       gcc11.2      ${QNN_SDK_ROOT}/bin/aarch64-oe-linux-gcc11.2/qtld-net-run
                                                    ${QNN_SDK_ROOT}/lib/aarch64-oe-linux-gcc11.2/libQnnTFLiteDelegate.so
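The table above can be captured in a small helper. This is a sketch with a hypothetical function name (`qtld_target_dir` is ours, not an SDK tool) that maps a table row to the SDK directory containing both binaries:

```shell
# Hypothetical helper (not part of the SDK): map a target OS/arch/GCC
# combination from the table above to the SDK target directory that
# holds qtld-net-run and libQnnTFLiteDelegate.so.
qtld_target_dir() {
  local os="$1" arch="$2" gcc="$3"
  case "${os}-${arch}-${gcc}" in
    ubuntu-aarch64-gcc9.4)    echo "aarch64-ubuntu-gcc9.4" ;;
    android-aarch64-*)        echo "aarch64-android" ;;
    oe-linux-aarch64-gcc9.3)  echo "aarch64-oe-linux-gcc9.3" ;;
    oe-linux-aarch64-gcc11.2) echo "aarch64-oe-linux-gcc11.2" ;;
    *) echo "unsupported target: ${os} ${arch} ${gcc}" >&2; return 1 ;;
  esac
}

# Example: resolve the Android HTP paths used in the rest of this tutorial.
QNN_TARGET_ARCH="$(qtld_target_dir android aarch64 -)"
echo "${QNN_SDK_ROOT}/bin/${QNN_TARGET_ARCH}/qtld-net-run"
echo "${QNN_SDK_ROOT}/lib/${QNN_TARGET_ARCH}/libQnnTFLiteDelegate.so"
```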

  2. From your host machine, transfer qtld-net-run & libQnnTFLiteDelegate.so to your target device by running:

    $ adb shell mkdir -p /data/local/tmp/inception_v3
    $ # Push the QNN Delegate library
    $ adb push $QNN_SDK_ROOT/lib/$QNN_TARGET_ARCH/libQnnTFLiteDelegate.so /data/local/tmp/inception_v3
    $ adb push $QNN_SDK_ROOT/bin/$QNN_TARGET_ARCH/qtld-net-run /data/local/tmp/inception_v3
    
  3. From your host machine, transfer Qualcomm® AI Engine Direct backend libraries to your target device by running:

    $ # Push QNN backend ARM libraries
    $ adb push $QNN_SDK_ROOT/lib/$QNN_TARGET_ARCH/libQnnHtp.so /data/local/tmp/inception_v3
    $ adb push $QNN_SDK_ROOT/lib/$QNN_TARGET_ARCH/libQnnHtpPrepare.so /data/local/tmp/inception_v3
    $ adb push $QNN_SDK_ROOT/lib/$QNN_TARGET_ARCH/libQnnSystem.so /data/local/tmp/inception_v3
    $ adb push $QNN_SDK_ROOT/lib/$QNN_TARGET_ARCH/libQnnHtpV75Stub.so /data/local/tmp/inception_v3
    $ # Push the Hexagon (e.g. QNN HTP) Skel library:
    $ adb push $QNN_SDK_ROOT/lib/$HEXAGON_TARGET_ARCH/unsigned/libQnnHtpV75Skel.so /data/local/tmp/inception_v3
    
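At this point every runtime dependency should be on the device. A sketch for double-checking, using a helper name of our own (`required_device_files` is not an SDK tool) and the HTP hexagon-v75 file set from the steps above:

```shell
# Hypothetical helper (ours, not an SDK tool): the files that should now
# be present on the device for the HTP (hexagon-v75) example.
required_device_files() {
  cat <<'EOF'
qtld-net-run
libQnnTFLiteDelegate.so
libQnnHtp.so
libQnnHtpPrepare.so
libQnnSystem.so
libQnnHtpV75Stub.so
libQnnHtpV75Skel.so
EOF
}

# Verify each file landed on the device (skipped when adb is unavailable).
if command -v adb >/dev/null 2>&1; then
  for f in $(required_device_files); do
    adb shell "test -f /data/local/tmp/inception_v3/${f}" \
      || echo "missing on device: ${f}"
  done
fi
```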
  4. From your host machine, transfer model, data and input list to your target device by running:

    $ # Set the environment variable TENSORFLOW_HOME to point to the location where the TensorFlow package is installed
    $ # TensorFlow 2.10.1 and 2.17.0 have been tested and are compatible with this tutorial
    $
    $ # Generate and convert inceptionV3 model to .tflite model
    $ python3 $QNN_SDK_ROOT/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d
    $ python3 $QNN_SDK_ROOT/examples/QNN/TFLiteDelegate/Models/InceptionV3Quant/scripts/convert_inceptionv3_tflite.py
    $ # Push model, data and input list
    $ adb push $QNN_SDK_ROOT/examples/QNN/TFLiteDelegate/Models/InceptionV3Quant/inception_v3_quant.tflite /data/local/tmp/inception_v3
    $ adb push $QNN_SDK_ROOT/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
    $ adb push $QNN_SDK_ROOT/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
    
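For reference, our working assumption about the input list: it is a plain text file with one device path to a preprocessed .raw input per line (the pushed target_raw_list.txt already has this form; check its contents if your run reports missing inputs). A hypothetical helper (`make_input_list` is ours, not an SDK tool) that regenerates such a list:

```shell
# Assumed format (ours): one device path to a preprocessed .raw input per
# line. Hypothetical helper to rebuild an input list for pushed inputs.
make_input_list() {
  local device_dir="$1"; shift
  local f
  for f in "$@"; do
    echo "${device_dir}/$(basename "${f}")"
  done
}

# Example: list two raw inputs as the device will see them.
make_input_list /data/local/tmp/inception_v3/cropped img_001.raw img_002.raw
```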
  5. From your host machine, execute the model on your target device by running:

    $ adb shell 'export LD_LIBRARY_PATH=/data/local/tmp/inception_v3/:$LD_LIBRARY_PATH && export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3/" &&
                cd /data/local/tmp/inception_v3/ && /data/local/tmp/inception_v3/qtld-net-run \
                --model /data/local/tmp/inception_v3/inception_v3_quant.tflite \
                --input /data/local/tmp/inception_v3/target_raw_list.txt  \
                --output /data/local/tmp/inception_v3/tflite-output \
                --backend htp'
    
  6. On the device, you should see a folder named tflite-output, which contains the results of your run.
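To inspect the results on the host, you can pull the output folder and decode the tensors. This sketch assumes the outputs are raw little-endian float32 tensors, as with qnn-net-run; the exact file layout under tflite-output may differ, and the `top1` helper is ours, not an SDK tool:

```shell
# Pull the results back to the host (skipped when adb is unavailable).
if command -v adb >/dev/null 2>&1; then
  adb pull /data/local/tmp/inception_v3/tflite-output ./tflite-output || true
fi

# top1 FILE: print the index of the largest float32 value in a .raw tensor,
# i.e. the predicted class for a classifier output. Assumes little-endian
# float32 data, as produced by qnn-net-run-style runs.
top1() {
  python3 - "$1" <<'EOF'
import struct
import sys

data = open(sys.argv[1], 'rb').read()
vals = struct.unpack('<%df' % (len(data) // 4), data)
print(max(range(len(vals)), key=vals.__getitem__))
EOF
}
```

For example, `find ./tflite-output -name '*.raw' | while read f; do echo "$f: class $(top1 "$f")"; done` prints the top-1 class index for each output tensor found.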