Basic Dialog¶
This section contains tutorials that pertain to running the Llama 2 7B model on HTP within a basic dialog.
HTP Backend Example Model Config and Backend Extension Config¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-htp.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps.
LLaMA model inference on Android
To run on QNN HTP backend, open a command shell on android and run the following. This assumes that the HTP architecture has been set (e.g., ARCH=75).
adb shell mkdir -p /data/local/tmp/
adb push ${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2t-run /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnSystem.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpNetRunExtensions.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV${ARCH}Stub.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/hexagon-v${ARCH}/unsigned/libQnnHtpV${ARCH}Skel.so /data/local/tmp/
adb push <path to htp_backend_ext_config.json> /data/local/tmp/
adb push <path to llama2-7b-htp.json> /data/local/tmp/
adb push <path to tokenizer.json> /data/local/tmp/
adb push <path to model bin files> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-htp.json>
-p "What is the most popular cookie in the world?"