Tutorials¶
This section contains tutorials that help users become familiar with the Genie workflow. The tutorials cover the QNN GenAITransformer, QNN GPU, and QNN HTP backend workflows.
Note
Please refer to Setup before starting any of the tutorials.
QNN GenAITransformer backend workflow¶
The Genie-provided QNN GenAITransformer backend leverages the QNN op package interface to represent an entire LLaMA
model as a single op. The model execution engine is provided via the QnnGenAiTransformerCpuOpPkg op package library.
Genie packages a prebuilt QnnGenAiTransformerModel model library; the corresponding source for this model library
can be found at ${SDK_ROOT}/examples/Genie/Model/model.cpp. Because the QNN GenAITransformer backend model and op
package are prebuilt, this backend uses the qnn-genai-transformer-composer tool for preparation.
Model download¶
Download Llama-2-7b-chat-hf from https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/tree/main
Model conversion¶
The following section demonstrates converting a model using qnn-genai-transformer-composer.
Model conversion on Linux and Android
Open a command shell on Linux host and run:
# Make sure the environment is set up per the Setup instructions, then cd into the bin folder on the Linux host
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/
./qnn-genai-transformer-composer --quantize Z4 \
    --outfile <output filename with complete path>.bin \
    --model <path-to-downloaded-LLama-model-directory>
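The composer writes the quantized model to the path given to --outfile. A quick sanity check that a non-empty binary was produced (the filename below is only an example) could look like:

```shell
# Sanity check for the composer output. The filename is an assumption --
# use whatever path you passed to --outfile.
check_model_bin() {
  if [ -s "$1" ]; then
    echo "model binary generated: $(wc -c < "$1") bytes"
  else
    echo "composer did not produce $1" >&2
    return 1
  fi
}
check_model_bin llama2-7b.bin || true
```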
Model conversion on Windows
Open Developer PowerShell for VS2022 on Windows host and run:
# Make sure the environment is set up per the Setup instructions, then cd into the bin folder on the Windows host
cd <QNN_SDK_ROOT>\bin\x86_64-windows-msvc
python .\qnn-genai-transformer-composer --quantize Z4 `
    --outfile <output filename with complete path>.bin `
    --model <path-to-downloaded-LLama-model-directory>
Model configuration¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-genaitransformer.json. Note that the tokenizer path and
model bin fields will need to be updated based on your actual preparation steps.
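Since the config must be edited before use, the substitution can be scripted. The sketch below is self-contained and uses invented key names purely to show the mechanics; open the shipped example JSON and edit its actual fields instead:

```shell
# Self-contained sketch: the key names below are invented to illustrate
# the substitution -- the shipped example JSON uses its own field names.
CONFIG=my-genie-config.json
cat > "$CONFIG" <<'EOF'
{ "tokenizer-path": "<path-to-tokenizer>", "model-bin": "<path-to-model-bin>" }
EOF
# Replace the placeholders with the on-device paths used later in this tutorial
sed -i.bak \
  -e 's|<path-to-tokenizer>|/data/local/tmp/tokenizer.json|' \
  -e 's|<path-to-model-bin>|/data/local/tmp/model.bin|' \
  "$CONFIG"
grep -q '/data/local/tmp/tokenizer.json' "$CONFIG" && echo "config updated"
```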
Model execution¶
The following section demonstrates running a model on the QNN GenAITransformer backend using genie-t2t-run.
Model execution on Linux
Open a command shell on Linux host and run:
# Make sure the environment is set up per the Setup instructions, then cd into the bin folder on the Linux host
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang
./genie-t2t-run -c <path to cpu_model_config.json> \
    -p "Tell me about Qualcomm"
Model execution on Android
Open a command shell on Linux host and run:
# make sure a test device is connected
adb devices
# push artifacts to device
adb push ${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2t-run /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformer.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformerCpuOpPkg.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformerModel.so /data/local/tmp/
adb push <path to tokenizer.json> /data/local/tmp/
adb push <path to cpu-model-config.json> /data/local/tmp/
adb push <path to model bin file, e.g. <path-to-downloaded-LLama-model-directory>/model.bin> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to cpu-model-config.json> \
    -p "Tell me about Qualcomm"
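Before pushing, it can help to confirm that every artifact in the list above actually exists in the SDK. A minimal sketch, assuming QNN_SDK_ROOT is set:

```shell
# Pre-push sanity check for the GenAiTransformer artifact list above.
# Assumes QNN_SDK_ROOT is already set.
check_genai_artifacts() {
  set -- \
    bin/aarch64-android/genie-t2t-run \
    lib/aarch64-android/libGenie.so \
    lib/aarch64-android/libQnnGenAiTransformer.so \
    lib/aarch64-android/libQnnGenAiTransformerCpuOpPkg.so \
    lib/aarch64-android/libQnnGenAiTransformerModel.so
  missing=0
  for f in "$@"; do
    [ -f "${QNN_SDK_ROOT}/$f" ] || { echo "missing: $f" >&2; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "all artifacts present"
}
check_genai_artifacts || true
```

Run the check before the adb push steps; any missing file is reported on stderr.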
Model execution on Windows
Open Developer PowerShell for VS2022 on Windows on Snapdragon host and run:
# Make sure environment is setup as per instructions, or can cd into bin folder on Windows host
cd <QNN_SDK_ROOT>\bin\aarch64-windows-msvc
.\genie-t2t-run.exe -c <path to cpu-model-config.json> `
    -p "Tell me about Qualcomm"
BGE-large model inference using GenAiTransformer on Android¶
See Genie Embedding JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/bge-large-genaitransformer.json. Note that the tokenizer path and
model bin fields will need to be updated based on your actual preparation steps.
Note
Use qnn-genai-transformer-composer without the --quantize Z4 option to generate the model binary.
To run on the QNN GenAiTransformer backend, open a command shell on the Linux host and run the following:
Note
Results will be saved to an output.raw file in the working directory.
# make sure a test device is connected
adb devices
# push artifacts to device
adb push ${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2e-run /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformer.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformerCpuOpPkg.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformerModel.so /data/local/tmp/
adb push <path to tokenizer.json> /data/local/tmp/
adb push <path to cpu-model-config.json> /data/local/tmp/
adb push <path to model bin file, e.g. <path-to-downloaded-BGE-model-directory>/model.bin> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp
cd $LD_LIBRARY_PATH
./genie-t2e-run -c <path to bge-large-genaitransformer.json> \
    -p "Tell me about Qualcomm"
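Since this run saves its embedding to output.raw on the device, a quick way to inspect it after `adb pull /data/local/tmp/output.raw .` is to decode the raw buffer. This sketch assumes the output is little-endian float32; verify against your model's actual output datatype:

```shell
# Sketch: inspect a pulled embedding file. Assumes little-endian float32
# values -- check your model's actual output datatype before relying on this.
inspect_raw() {
  python3 - "$1" <<'EOF'
import struct, sys

# Read the raw buffer and decode it as little-endian float32
data = open(sys.argv[1], "rb").read()
n = len(data) // 4
values = struct.unpack("<%df" % n, data[: n * 4])
print("%d float32 values; first 3: %s" % (n, values[:3]))
EOF
}
```

For example, `inspect_raw output.raw` prints the element count and the first few values.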
Building the example model library¶
Building an example model library is optional.
In the case of the GenAITransformer backend, the model is composed of a single custom op implemented by the pre-built
libQnnGenAiTransformerCpuOpPkg.so and QnnGenAiTransformerCpuOpPkg.dll op packages.
Genie provides pre-built libQnnGenAiTransformerModel.so and QnnGenAiTransformerModel.dll libraries as
described in the Introduction. The source for these libraries is provided at
${QNN_SDK_ROOT}/examples/Genie/Model/model.cpp. This section shows how to compile this source into a model
library consumable by Genie.
Model build on Linux host
Open a command shell on Linux host and run:
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator \
-c ${QNN_SDK_ROOT}/examples/GenAiTransformer/Model/model.cpp \
-o ${QNN_SDK_ROOT}/examples/GenAiTransformer/Model/model_libs # This can be any path
This will produce the following artifacts:
${QNN_SDK_ROOT}/examples/GenAiTransformer/Model/model_libs/aarch64-android/libqnn_model.so
${QNN_SDK_ROOT}/examples/GenAiTransformer/Model/model_libs/x86_64-linux-clang/libqnn_model.so
By default, libraries are built for all targets. To compile for a specific target, use the
-t <target> option with qnn-model-lib-generator. Choices of <target> are aarch64-android and x86_64-linux-clang.
QNN GPU backend workflow¶
The following tutorial demonstrates running a model on the QNN GPU backend using genie-t2t-run.
Note
This section assumes that the QNN GPU context binaries have been obtained via the QNN workflow.
GPU Backend Example Model Config¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-gpu.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps.
LLaMA model inference on Android
To run on the QNN GPU backend, open a command shell on the Linux host and run the following:
adb shell mkdir -p /data/local/tmp/
adb push ${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2t-run /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGpu.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnSystem.so /data/local/tmp/
adb push <path to llama2-7b-gpu.json> /data/local/tmp/
adb push <path to tokenizer.json> /data/local/tmp/
adb push <path to model bin file> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-gpu.json> \
    -p "Tell me about Qualcomm"
QNN HTP backend workflow¶
The following tutorial demonstrates running a model on the QNN HTP backend using genie-t2t-run.
Note
This section assumes that the QNN HTP context binaries have been obtained via the QNN workflow.
HTP Backend Example Model Config and Backend Extension Config¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-htp.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps. There is also a Windows-specific
configuration file located at ${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-htp-windows.json.
An example backend_ext_config.json can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/htp_backend_ext_config.json.
For more information on the QNN HTP backend extension configuration options, please refer to
${QNN_SDK_ROOT}/docs/QNN/general/htp/htp_backend.html.
LLaMA model inference on Android
To run on the QNN HTP backend, open a command shell on the Linux host and run the following. This assumes the HTP architecture has been set (e.g., ARCH=75).
adb shell mkdir -p /data/local/tmp/
adb push ${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2t-run /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnSystem.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpNetRunExtensions.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV${ARCH}Stub.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/hexagon-v${ARCH}/unsigned/libQnnHtpV${ARCH}Skel.so /data/local/tmp/
adb push <path to htp_backend_ext_config.json> /data/local/tmp/
adb push <path to llama2-7b-htp.json> /data/local/tmp/
adb push <path to tokenizer.json> /data/local/tmp/
adb push <path to model bin files> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-htp.json> \
    -p "What is the most popular cookie in the world?"
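The HTP push list above is longer and depends on ARCH, so a pre-push existence check can catch a wrong ARCH or missing SDK files early. A minimal sketch, assuming QNN_SDK_ROOT and ARCH are set:

```shell
# Pre-push sanity check for the HTP artifact list above.
# Assumes QNN_SDK_ROOT and ARCH (e.g., 75) are already set.
check_htp_artifacts() {
  set -- \
    bin/aarch64-android/genie-t2t-run \
    lib/aarch64-android/libGenie.so \
    lib/aarch64-android/libQnnHtp.so \
    lib/aarch64-android/libQnnSystem.so \
    lib/aarch64-android/libQnnHtpPrepare.so \
    lib/aarch64-android/libQnnHtpNetRunExtensions.so \
    "lib/aarch64-android/libQnnHtpV${ARCH}Stub.so" \
    "lib/hexagon-v${ARCH}/unsigned/libQnnHtpV${ARCH}Skel.so"
  missing=0
  for f in "$@"; do
    [ -f "${QNN_SDK_ROOT}/$f" ] || { echo "missing: $f" >&2; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "all HTP artifacts present"
}
check_htp_artifacts || true
```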
LLaMA model inference on Windows
Open Developer PowerShell for VS2022 on Windows on Snapdragon host and run:
# Make sure environment is setup as per instructions, or can cd into bin folder on Windows host
cd <QNN_SDK_ROOT>\bin\aarch64-windows-msvc
.\genie-t2t-run.exe -c <path to llama2-7b-htp.json> `
    -p "Tell me about Qualcomm"
LLaMA-2-7b model inference using SSD-Q1 on Android¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b-htp-ssd.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps.
Note
Use the LLaMA-2-7b notebooks to generate AR-N models.
To run using SSD on the QNN HTP backend, open a command shell on the Linux host and run the following. This assumes the HTP architecture has been set (e.g., ARCH=79). Use the steps above to push the libraries, binaries, tokenizer, and backend_ext_config.
adb shell mkdir -p /data/local/tmp/
adb push <path to llama2-7b-htp-ssd.json> /data/local/tmp/
adb push <path to forecast-prefix-dir> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-htp-ssd.json> \
    -p "What is the most popular cookie in the world?"
LLaMA-2-7b model inference using LADE on Android¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b-htp-lade.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps.
Note
Use the LLaMA-2-7b notebooks to generate AR-N models.
To run using LADE on the QNN HTP backend, open a command shell on the Linux host and run the following. This assumes the HTP architecture has been set (e.g., ARCH=79). Use the steps above to push the libraries, binaries, tokenizer, and backend_ext_config.
adb shell mkdir -p /data/local/tmp/
adb push <path to llama2-7b-htp-lade.json> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-htp-lade.json> \
    -p "What is the most popular cookie in the world?"
LLaMA-2-7b model LoRA inference using HTP on Android¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b-htp-lora.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps.
Note
Use the LLaMA-2-7b notebooks to generate AR-N models.
To run using LoRA on the QNN HTP backend, open a command shell on the Linux host and run the following. This assumes the HTP architecture has been set (e.g., ARCH=79). Use the steps above to push the libraries, binaries, tokenizer, and backend_ext_config.
adb shell mkdir -p /data/local/tmp/
adb push <path to llama2-7b-htp-lora.json> /data/local/tmp/
adb push <path to lora bin files> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-htp-lora.json> \
    -p "What is the most popular cookie in the world?" \
    -l lora1,alpha,0.5
Model download¶
Download the Llama-2-7b LoRA adapter (a French-to-English language translator) from https://huggingface.co/kaitchup/Llama-2-7b-mt-French-to-English
Model conversion¶
The following section demonstrates converting a LoRA adapter using qnn-genai-transformer-composer.
Model conversion on Linux
Open a command shell on Linux host and run:
# Make sure the environment is set up per the Setup instructions, then cd into the bin folder on the Linux host
# Additionally, set the following
export LD_LIBRARY_PATH=${QNN_SDK_ROOT}/lib/x86_64-linux-clang:$LD_LIBRARY_PATH
export PYTHONPATH=${QNN_SDK_ROOT}/lib/python/qti/aisw/genai:$PYTHONPATH
# LoRA adapter conversion command
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/
./qnn-genai-transformer-composer --model <path-to-downloaded-LLama-model-directory> \
    --outfile <output filename with complete path>.bin \
    --lora <path-to-downloaded-Lora-adapter-directory>
Model configuration¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-genaitransformer-lora.json. Note that the tokenizer path and
model bin fields will need to be updated based on your actual preparation steps.
Model execution¶
The following section demonstrates running a model on the QNN GenAiTransformer backend using genie-t2t-run.
Model execution on Linux
Open a command shell on Linux host and run:
# Make sure the environment is set up per the Setup instructions, then cd into the bin folder on the Linux host
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang
./genie-t2t-run -c <path to llama2-7b-genaitransformer-lora.json> \
    -p "Le certificat peut être imprimé dans une ou plusieurs langues de la convention et doit être complété dans l'une de ces langues." \
    --lora lora1,alpha,1
Model execution on Android
Open a command shell on Linux host and run:
# make sure a test device is connected
adb devices
# push artifacts to device
adb push ${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2t-run /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformer.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformerCpuOpPkg.so /data/local/tmp/
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGenAiTransformerModel.so /data/local/tmp/
adb push <path to tokenizer.json> /data/local/tmp/
adb push <path to llama2-7b-genaitransformer-lora.json> /data/local/tmp/
adb push <path to model bin file, e.g. <path-to-converted-LLama-model-directory>/model.bin> /data/local/tmp/
adb push <path to lora adapter bin file, e.g. <path-to-converted-lora-adapter-directory>/lora_adapter.bin> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-genaitransformer-lora.json> \
    -p "Le certificat peut être imprimé dans une ou plusieurs langues de la convention et doit être complété dans l'une de ces langues." \
    --lora lora1,alpha,1
LLaMA-2-7b model inference using SPD on Android¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b-draft-htp-target-htp-spd.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps.
Note
Use the LLaMA-2-7b notebooks to generate AR-N models for the target.
Note
Use small-sized LLM models (e.g., 115M) for the draft.
To run using SPD on the QNN HTP backend, open a command shell on the Linux host and run the following. This assumes the HTP architecture has been set (e.g., ARCH=79). Use the steps above to push the libraries, binaries, tokenizer, and backend_ext_config.
Note
If using different backends for the target and draft models, ensure the libraries and binaries for both engines are present.
adb shell mkdir -p /data/local/tmp/
adb push <path to llama2-7b-draft-htp-target-htp-spd.json> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-draft-htp-target-htp-spd.json> \
    -p "What is the most popular cookie in the world?"
BGE-large model inference using HTP on Android¶
See Genie Embedding JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/bge-large-htp.json. Note that the tokenizer path and
context binary fields will need to be updated based on your actual preparation steps.
Note
Use the regular QNN workflow to obtain the required context binaries for the BGE model.
To run on the QNN HTP backend, open a command shell on the Linux host and run the following. This assumes the HTP architecture has been set (e.g., ARCH=79). Use the steps above to push the libraries, binaries, tokenizer, and backend_ext_config.
Note
Results will be saved to an output.raw file in the working directory.
adb shell mkdir -p /data/local/tmp/
adb push <path to bge-large-htp.json> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
cd $LD_LIBRARY_PATH
./genie-t2e-run -c <path to bge-large-htp.json> \
    -p "What is the most popular cookie in the world?"
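Embeddings from two genie-t2e-run invocations (two output.raw files pulled from the device) can be compared with cosine similarity. A sketch, again assuming little-endian float32 output:

```shell
# Sketch: cosine similarity between two pulled embedding files.
# Assumes little-endian float32 values of equal length; verify against
# your model's actual output datatype.
cosine_sim() {
  python3 - "$1" "$2" <<'EOF'
import struct, sys, math

def load(path):
    data = open(path, "rb").read()
    n = len(data) // 4
    return struct.unpack("<%df" % n, data[: n * 4])

a, b = load(sys.argv[1]), load(sys.argv[2])
dot = sum(x * y for x, y in zip(a, b))
na = math.sqrt(sum(x * x for x in a))
nb = math.sqrt(sum(y * y for y in b))
print("%.4f" % (dot / (na * nb)))
EOF
}
```

For example, `cosine_sim query.raw doc.raw` prints a value between -1 and 1; values near 1 indicate semantically similar inputs.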
Genie sample tutorial¶
Warning
libGenie.so is subject to change without notice.
Genie sample pre-requisites¶
Building libGenie.so has three external dependencies:
clang compiler
ndk-build (for Android targets only)
Rust
If the clang compiler is not available in your system PATH, the script ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh provided with the SDK can be used to install and prepare your environment. Alternatively, you could install these dependencies and make them available in your PATH.
Command to automatically install required dependencies:
$ sudo bash ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh
For the second dependency, ndk-build needs to be on your PATH, which you can check with:
$ ${QNN_SDK_ROOT}/bin/envcheck -n
Note: libGenie.so has been verified to work with Android NDK version r26c and clang14.
For the third dependency, Rust, run the following commands in a terminal:
$ export RUSTUP_HOME=</path/for/rustup>
$ mkdir -p ${RUSTUP_HOME}
$ export CARGO_HOME=</path/for/cargo>
$ mkdir -p ${CARGO_HOME}
$ curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh
$ source ${CARGO_HOME}/env
$ rustup target add aarch64-linux-android
Building libGenie.so¶
x86
$ cd ${SDK_ROOT}/examples/Genie/Genie
$ make x86
After executing make as shown above, you should be able to see libGenie.so in lib/x86_64-linux-clang.
Android
$ cd ${SDK_ROOT}/examples/Genie/Genie
$ make android
After executing make as shown above, you should be able to see libGenie.so in lib/aarch64-android. You can now link this library into your app and call the APIs exposed by libGenie.so.
Sample genie-t2t-run tutorial¶
genie-t2t-run sample pre-requisites¶
genie-t2t-run depends on libGenie.so. Please follow the instructions above to build it.
Building sample genie-t2t-run¶
x86
$ cd ${SDK_ROOT}/examples/Genie/genie-t2t-run
$ make x86
After executing make as shown above, you should be able to see genie-t2t-run in bin/x86_64-linux-clang.
Android
$ cd ${SDK_ROOT}/examples/Genie/genie-t2t-run
$ make android
After executing make as shown above, you should be able to see genie-t2t-run in bin/aarch64-android.
Executing sample genie-t2t-run¶
You can follow QNN GenAITransformer backend workflow and QNN HTP backend workflow for the instructions to run on CPU and HTP respectively.
Model inference using token to token feature on Android¶
See Genie Dialog JSON configuration string for details on the fields and what
they mean. An example model_config can be found at
${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b-genaitransformer.json.
Note
Use the LLaMA-2-7b notebooks to generate AR-N models.
adb shell mkdir -p /data/local/tmp/
adb push <path to llama2-7b-genaitransformer.json> /data/local/tmp/
adb push <path to token file(.txt)> /data/local/tmp/
# open adb shell
adb shell
export LD_LIBRARY_PATH=/data/local/tmp/
export PATH=$LD_LIBRARY_PATH:$PATH
export ADSP_LIBRARY_PATH=$LD_LIBRARY_PATH
cd $LD_LIBRARY_PATH
./genie-t2t-run -c <path to llama2-7b-genaitransformer.json> \
    -tok <path to token file(.txt)>
# Example tokenfile.txt
24948 592 1048 15146 2055
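The token file is plain text containing whitespace-separated integer token IDs, as in the example above. A small sketch that writes and validates such a file:

```shell
# Sketch: write a token file and check it contains only integer token IDs
# separated by spaces, matching the example format above.
cat > tokenfile.txt <<'EOF'
24948 592 1048 15146 2055
EOF
if grep -Eq '^[0-9]+( [0-9]+)*$' tokenfile.txt; then
  echo "token file looks valid"
else
  echo "token file contains non-numeric entries" >&2
fi
```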
Model inference using token to token feature on Windows¶
Open Developer PowerShell for VS2022 on Windows on Snapdragon host and run:
# Make sure environment is setup as per instructions, or can cd into bin folder on Windows host
cd <QNN_SDK_ROOT>\bin\aarch64-windows-msvc
.\genie-t2t-run.exe -c <path to cpu-model-config.json> `
    -tok <path to token file(.txt)>
# Example tokenfile.txt
24948 592 1048 15146 2055
Update sampler params tutorial¶
Note
Please refer to ${SDK_ROOT}/examples/Genie/configs/sampler.json for the parameters that can be updated.
Genie supports updating a single parameter or multiple parameters in one API call.
The APIs used for this exercise are:
GenieSamplerConfig_createFromJson
GenieSamplerConfig_setParam
GenieDialog_getSampler
GenieDialogSampler_applyConfig
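As a concrete illustration, a sampler config JSON passed to GenieSamplerConfig_createFromJson might look like the fragment below. Only top-k and top-p are confirmed by the update example in this tutorial; the remaining field names (version, seed, temp) are assumptions and should be checked against ${SDK_ROOT}/examples/Genie/configs/sampler.json:

```json
{
  "sampler": {
    "version": 1,
    "seed": 42,
    "temp": 0.8,
    "top-k": 40,
    "top-p": 0.95
  }
}
```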
Example on how to update sampler parameters in between queries¶
// Create dialog config
GenieDialogConfig_Handle_t dialogConfigHandle = NULL;
GenieDialogConfig_createFromJson(dialogConfigStr, &dialogConfigHandle);

// Create dialog
GenieDialog_Handle_t dialogHandle = NULL;
GenieDialog_create(dialogConfigHandle, &dialogHandle);

// Query with original config
GenieDialog_query(dialogHandle, promptStr, GenieDialog_SentenceCode_t::GENIE_DIALOG_SENTENCE_COMPLETE, queryCallback);

// Get dialog sampler handle
GenieDialogSampler_Handle_t samplerHandle = NULL;
GenieDialog_getSampler(dialogHandle, &samplerHandle);

// Create a sampler config handle from a new sampler config JSON
GenieSamplerConfig_Handle_t samplerConfigHandle = NULL;
GenieSamplerConfig_createFromJson(samplerConfigStr, &samplerConfigHandle);

// Apply the new sampler config
GenieDialogSampler_applyConfig(samplerHandle, samplerConfigHandle);

// Query with updated config
GenieDialog_query(dialogHandle, promptStr, GenieDialog_SentenceCode_t::GENIE_DIALOG_SENTENCE_COMPLETE, queryCallback);

// Update single parameters
GenieSamplerConfig_setParam(samplerConfigHandle, "top-p", "0.8");
GenieSamplerConfig_setParam(samplerConfigHandle, "top-k", "30");

// Apply the new sampler config
GenieDialogSampler_applyConfig(samplerHandle, samplerConfigHandle);

// Query with updated config
GenieDialog_query(dialogHandle, promptStr, GenieDialog_SentenceCode_t::GENIE_DIALOG_SENTENCE_COMPLETE, queryCallback);

// Update multiple parameters (top-k and top-p)
std::string valueStr = "\"sampler\" : {\n \"top-k\" : 20,\n \"top-p\" : 0.75\n } ";
GenieSamplerConfig_setParam(samplerConfigHandle, "", valueStr.c_str());

// Apply the new sampler config
GenieDialogSampler_applyConfig(samplerHandle, samplerConfigHandle);

// Query with updated config
GenieDialog_query(dialogHandle, promptStr, GenieDialog_SentenceCode_t::GENIE_DIALOG_SENTENCE_COMPLETE, queryCallback);