Eaglet Dialog - LoRA & Draft Switching

See Genie Dialog JSON configuration string for details on the fields and what they mean. An example model_config can be found at ${QNN_SDK_ROOT}/examples/Genie/configs/llama3-3b/llama3-3b-eaglet-htp.json. Note that the tokenizer path and context binary fields will need to be updated based on your actual preparation steps.

Note

Use the LLaMA-3-3b notebooks to generate AR-N models.

To run Eaglet with LoRA switching and draft engine switching on the QNN HTP backend, open a command shell on Android and run the following. This assumes that the HTP architecture has been set (e.g., ARCH=79). Follow the steps described above to set up the libraries, binaries, tokenizer, and backend_ext_config.

adb shell mkdir -p /data/local/tmp/
adb push <path to llama3-3b-htp.json> /data/local/tmp/
adb push <path to standalone-engine.json> /data/local/tmp/
adb push <path to lora bin files> /data/local/tmp/

# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH

cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama3-3b-htp.json> \
                -p "How to make an apple pie?" \
                --engine_role target \
                -l elementary,alpha0,0.5,alpha1,0.2 \
                --allow_engine_switch draft,<path to standalone-engine.json>
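
The run above depends on several files having been pushed to the device, so it can help to confirm on the host that each local file exists before calling adb push. The following is a minimal sketch only; the function name require_files is hypothetical and is not part of the QNN SDK or of adb.

```shell
# Hypothetical host-side helper (not part of the SDK):
# print every path that is not a regular file and
# return non-zero if at least one is missing.
require_files() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use (paths are placeholders for your own files):
#   require_files <path to llama3-3b-htp.json> <path to standalone-engine.json> \
#       && echo "all files present, safe to adb push"
```

Reporting every missing path in one pass, rather than stopping at the first, keeps the fix-and-retry loop short when several files are misplaced.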

Eaglet Dialog - LoRA Switching

To run Eaglet with LoRA switching only on the QNN HTP backend, open a command shell on Android and run the following. This assumes that the HTP architecture has been set (e.g., ARCH=79). Follow the steps described above to set up the libraries, binaries, tokenizer, and backend_ext_config.

adb shell mkdir -p /data/local/tmp/
adb push <path to llama3-3b-htp.json> /data/local/tmp/
adb push <path to lora bin files> /data/local/tmp/

# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH

cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama3-3b-htp.json> \
                -p "How to make an apple pie?" \
                --engine_role target \
                -l elementary,alpha0,0.5,alpha1,0.2
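
The -l value in the command above reads as an adapter name followed by alpha name and value pairs, all separated by commas. Assuming that layout (it is inferred from the example, not confirmed here against the tool's help text), the string can be assembled from its parts as sketched below. The function name build_lora_arg is hypothetical.

```shell
# Hypothetical helper (not part of the SDK).
# Assumes the -l value has the form:
#   <adapter>,<alpha_name>,<value>[,<alpha_name>,<value>...]
build_lora_arg() {
    adapter=$1
    shift
    # Alpha names and values must arrive in pairs.
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "expected alpha name/value pairs" >&2
        return 1
    fi
    arg=$adapter
    while [ $# -gt 0 ]; do
        arg="$arg,$1,$2"
        shift 2
    done
    printf '%s\n' "$arg"
}

# Example use:
#   build_lora_arg elementary alpha0 0.5 alpha1 0.2
```

Building the string from separate arguments makes an unpaired alpha name fail early on the host instead of surfacing as a runtime error on the device.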