KV Share Dialog
See Genie Dialog JSON configuration string for details on the fields and their
meanings. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b-genaitransformer-htp-kv-share.json. Update the tokenizer path and
context binary fields in this file to match the output of your own preparation steps.
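Before pushing the config, it can help to list every value in it that looks like a file reference and confirm each one by hand. The helper below is a minimal sketch, not part of the SDK: `list_config_paths` is a hypothetical name, and the `.json`, `.bin`, and `.so` extensions it searches for are assumptions about what the config refers to. It matches on text patterns only and does not rely on any particular field names.

```shell
# Hypothetical helper (not part of the QNN SDK): print, with line numbers,
# every line of a config file that contains a quoted value ending in
# .json, .bin, or .so, so the referenced files can be reviewed by hand.
# The extension list is an assumption; adjust it to your own artifacts.
list_config_paths() {
    config="$1"
    if [ ! -f "$config" ]; then
        echo "config not found: $config" >&2
        return 1
    fi
    # grep exits non-zero when nothing matches, which the caller can test.
    grep -nE '"[^"]*\.(json|bin|so)"' "$config"
}
```

On the host, this could be run as `list_config_paths llama2-7b-genaitransformer-htp-kv-share.json`, after which each listed file name can be compared against the files you actually prepared.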
Note
Use the LLaMA-2-7b notebooks to generate AR-N models.
KV-SHARE uses the QNN HTP backend for prompt processing and the QNN Gen AI Transformer backend for token generation. To run with the KV-SHARE dialog, open a command shell on Android and run the following. Use the steps mentioned above to set up the libraries, binaries, tokenizer, and backend_ext_config.
Note
When the primary and secondary engines use different backends, ensure that the libraries and binaries for both engines are present.
adb shell mkdir -p /data/local/tmp/
adb push <path to llama2-7b-genaitransformer-htp-kv-share.json> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
export ADSP_LIBRARY_PATH=$LD_LIBRARY_PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-genaitransformer-htp-kv-share.json> \
    -p "What is the most popular cookie in the world?"
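If the run fails while loading a backend, one quick check is whether the files for both engines actually reached the device directory. The function below is a small sketch under stated assumptions: `check_files` is a hypothetical helper, not an SDK tool, and it only tests that each named file exists in the given directory.

```shell
# Hypothetical helper (not part of the QNN SDK): report which of the named
# files are absent from a directory.
# Usage: check_files DIR FILE...
# Prints one "missing: ..." line per absent file and returns non-zero if
# any file is missing.
check_files() {
    dir="$1"
    shift
    status=0
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name"
            status=1
        fi
    done
    return "$status"
}
```

Inside the adb shell, after pasting the function, a call might look like `check_files /data/local/tmp genie-t2t-run libGenie.so libQnnHtp.so libQnnGenAiTransformer.so`. The library names in that call are illustrative assumptions; substitute the full list of libraries and binaries from the setup steps for both backends.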