Preparation on Linux
The QNN Gen AI Transformer uses the qnn-genai-transformer-composer utility to prepare models for inference.
Preparation
Open a command shell on the Linux host and run:
# Make sure the environment is set up as per the instructions, or cd into the bin folder on the Linux host
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/
./qnn-genai-transformer-composer --quantize Z4 \
    --outfile <output filename with complete path>.bin \
    --model <path-to-downloaded-LLama-model-directory>
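As a minimal sketch, the command above can be wrapped in a small shell function that checks its inputs before invoking the composer. The function name run_composer and the DRY_RUN variable are hypothetical helpers introduced only for illustration; they are not part of the SDK. Only the flags shown above (--quantize, --outfile, --model) are used.

```shell
# Sketch only: run_composer and DRY_RUN are hypothetical, not SDK features.
# Validates inputs, then runs (or, with DRY_RUN=1, only prints) the composer command.
run_composer() {
  local model_dir="$1" out_bin="$2"

  if [ -z "${QNN_SDK_ROOT:-}" ]; then
    echo "error: QNN_SDK_ROOT is not set" >&2
    return 1
  fi
  local composer="${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-genai-transformer-composer"

  # The model directory must already exist.
  if [ ! -d "$model_dir" ]; then
    echo "error: model directory not found: $model_dir" >&2
    return 1
  fi
  # The directory that will hold the output .bin must already exist.
  if [ ! -d "$(dirname "$out_bin")" ]; then
    echo "error: output directory does not exist: $(dirname "$out_bin")" >&2
    return 1
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$composer --quantize Z4 --outfile $out_bin --model $model_dir"
    return 0
  fi
  "$composer" --quantize Z4 --outfile "$out_bin" --model "$model_dir"
}
```

Calling the function with DRY_RUN=1 prints the command that would be executed, which is a cheap way to confirm the paths before starting a long quantization run.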
Dialog JSON Configuration
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-genaitransformer.json. The tokenizer path and
model bin fields in that file must be updated to match your actual preparation steps.
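One way to approach this, sketched under assumptions, is to copy the example config into a working directory and list the lines that most likely hold the paths to edit. The function name copy_and_list_paths and the working directory are hypothetical. The search pattern is only a heuristic that matches lines mentioning a tokenizer or a .bin file; it does not rely on any particular field names.

```shell
# Sketch only: copy_and_list_paths is a hypothetical helper, not part of the SDK.
# Copies a config into a working directory, then prints the numbered lines
# that mention the tokenizer or a .bin file so they can be edited by hand.
copy_and_list_paths() {
  local src="$1" work_dir="$2"
  local dst="$work_dir/$(basename "$src")"
  cp "$src" "$dst" || return 1
  grep -n -i -E 'tokenizer|\.bin' "$dst"
}
```

For example, `copy_and_list_paths "${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-genaitransformer.json" "$HOME/genie-work"` leaves the SDK copy untouched and shows which lines of the working copy to edit, assuming the directory $HOME/genie-work already exists.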