Function GenieTokenizer_encode

Function Documentation

Genie_Status_t GenieTokenizer_encode(const GenieTokenizer_Handle_t tokenizerHandle, const char *inputString, const Genie_AllocCallback_t callback, const int32_t **tokenIds, uint32_t *numTokenIds)

Encodes input text into token ids.

Parameters
  • tokenizerHandle[in] A handle to the tokenizer. Must not be NULL.

  • inputString[in] Null-terminated input string. Must not be NULL.

  • callback[in] A client-defined callback function used to allocate the tokenIds buffer. Must not be NULL.

  • tokenIds[out] The encoded token ids. The associated buffer is allocated through the client-defined allocation callback, and the client is responsible for managing that memory, including releasing it.

  • numTokenIds[out] The number of encoded token ids.

Returns

Status code:

  • GENIE_STATUS_SUCCESS: API call was successful.

  • GENIE_STATUS_ERROR_INVALID_HANDLE: Tokenizer handle is invalid.

  • GENIE_STATUS_ERROR_INVALID_ARGUMENT: At least one argument is invalid.

  • GENIE_STATUS_ERROR_MEM_ALLOC: Memory allocation failure.

  • GENIE_STATUS_ERROR_GENERAL: Tokenizer encode failure.
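
Example

The ownership contract described above (the library requests memory through the client's callback, and the client later releases it) can be sketched in C roughly as follows. This is a minimal illustration, not SDK code. The callback shape (a byte count plus an out-pointer) is an assumption and should be checked against the definition of Genie_AllocCallback_t in the SDK headers. Sketch_AllocCallback_t, client_alloc, sketch_encode, and encode_and_release are hypothetical names. sketch_encode is a stand-in for GenieTokenizer_encode that omits the tokenizer handle, emits one fake id per input byte, and returns placeholder nonzero codes rather than real Genie_Status_t values.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* ASSUMPTION: the allocation callback receives a byte count and returns the
   buffer through an out-pointer. Verify against Genie_AllocCallback_t. */
typedef void (*Sketch_AllocCallback_t)(const size_t size,
                                       const char **allocatedData);

/* Client-defined allocator. Whatever it hands out is owned by the client. */
static void client_alloc(const size_t size, const char **allocatedData) {
  *allocatedData = (const char *)malloc(size);
}

/* HYPOTHETICAL stand-in for GenieTokenizer_encode (no handle, no real
   tokenizer): writes one fake token id per input byte.
   Returns 0 on success, 1 for an invalid argument, 2 if allocation fails. */
static int sketch_encode(const char *inputString,
                         const Sketch_AllocCallback_t callback,
                         const int32_t **tokenIds,
                         uint32_t *numTokenIds) {
  if (!inputString || !callback || !tokenIds || !numTokenIds) return 1;

  const size_t n = strlen(inputString);
  const char *raw = NULL;
  callback(n * sizeof(int32_t), &raw); /* memory comes from the client */
  if (n > 0 && raw == NULL) return 2;

  int32_t *ids = (int32_t *)raw;
  for (size_t i = 0; i < n; ++i) {
    ids[i] = (int32_t)(unsigned char)inputString[i];
  }
  *tokenIds = ids;
  *numTokenIds = (uint32_t)n;
  return 0;
}

/* Client-side pattern: encode, check the result, use the ids, then free the
   buffer. Returns the number of token ids, or -1 if encoding failed. */
static long encode_and_release(const char *text) {
  const int32_t *tokenIds = NULL;
  uint32_t numTokenIds = 0;

  if (sketch_encode(text, client_alloc, &tokenIds, &numTokenIds) != 0) {
    return -1;
  }
  for (uint32_t i = 0; i < numTokenIds; ++i) {
    printf("%d ", (int)tokenIds[i]);
  }
  printf("\n");

  free((void *)tokenIds); /* allocated in client_alloc, so freed by the client */
  return (long)numTokenIds;
}
```

With the real API, the same pattern would apply: pass a valid tokenizer handle as the first argument, compare the returned status against GENIE_STATUS_SUCCESS before reading tokenIds or numTokenIds, and release the buffer with whatever deallocator matches the allocator used inside the callback.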