Preparation (LoRA)

Note

This section demonstrates the Llama-2-7b model workflow with a LoRA adapter via genie-t2t-run.

Adapter Download

Download the Llama-2-7b LoRA adapter (a French-to-English translator) from https://huggingface.co/kaitchup/Llama-2-7b-mt-French-to-English.

Preparation

The following demonstrates converting a LoRA adapter using qnn-genai-transformer-composer.

Open a command shell on a Linux host and run:

# Make sure the environment is set up per the instructions, or cd into the bin folder on the Linux host.
# Additionally, do the following:
export LD_LIBRARY_PATH=${QNN_SDK_ROOT}/lib/x86_64-linux-clang:$LD_LIBRARY_PATH
export PYTHONPATH=${QNN_SDK_ROOT}/lib/python/qti/aisw/genai:$PYTHONPATH

# LoRA adapter conversion command
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/
./qnn-genai-transformer-composer --model <path-to-downloaded-Llama-model-directory> \
                                 --outfile <output filename with complete path>.bin \
                                 --lora <path-to-downloaded-LoRA-adapter-directory>

Dialog JSON Configuration

See the Genie Dialog JSON configuration string documentation for details on the fields and their meanings. An example model config is provided at ${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-genaitransformer-lora.json. Note that the tokenizer path and model bin fields must be updated to match your actual preparation steps.
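As a rough illustration, the two fields that typically need editing sit under the dialog's tokenizer and engine/model sections. The fragment below is a sketch only; key names and nesting can vary by SDK version, so use the shipped example JSON above as the authoritative template:

```json
{
  "dialog": {
    "tokenizer": {
      "path": "<path-to-your-tokenizer.json>"
    },
    "engine": {
      "model": {
        "_comment": "Point the model binary field here at the .bin produced by qnn-genai-transformer-composer"
      }
    }
  }
}
```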

Inference

Choose your target platform for inference:
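Whichever platform you choose, the launch typically passes the dialog config and a prompt to genie-t2t-run via -c and -p. The sketch below only echoes the command, since the binary runs on the target rather than the host; the config path and the Llama-2 [INST] prompt wrapping are illustrative:

```shell
# Paths and prompt are illustrative; adjust for your target setup.
GENIE_CONFIG=llama2-7b-genaitransformer-lora.json
PROMPT="<s>[INST] Bonjour le monde. [/INST]"

# Dry run: print the invocation. On the target, drop the 'echo' to execute it.
echo ./genie-t2t-run -c "$GENIE_CONFIG" -p "$PROMPT"
```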