Struct Qnn_VectorEncoding_t

Struct Documentation

struct Qnn_VectorEncoding_t

A struct to express vector quantization parameters.

Note

This quantization encoding is a specific case of per-channel quantization in which the weights and parameters are crafted to allow for compression and codebook generation. For each group of rowsPerBlock * columnsPerBlock weights, there are 2^indexBitwidth unique vectorDimension-tuples of weights.
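The counts implied by this note can be sketched in a few lines of C. The helper names below are hypothetical and are not part of the QNN API. The index count assumes that each index decodes to exactly vectorDimension weights and that the block size is a multiple of vectorDimension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the QNN API): number of distinct
 * vectorDimension-tuples available to one block, i.e. 2^indexBitwidth. */
static uint32_t vq_codebook_entries(uint8_t indexBitwidth) {
    assert(indexBitwidth < 32); /* keep the shift well defined for uint32_t */
    return (uint32_t)1u << indexBitwidth;
}

/* Hypothetical helper: number of indices needed to cover one block.
 * ASSUMPTION: each index decodes to exactly vectorDimension weights and
 * rowsPerBlock * columnsPerBlock is a multiple of vectorDimension. */
static uint32_t vq_indices_per_block(uint32_t rowsPerBlock,
                                     uint32_t columnsPerBlock,
                                     uint8_t vectorDimension) {
    uint32_t weights = rowsPerBlock * columnsPerBlock;
    assert(vectorDimension != 0);
    assert(weights % vectorDimension == 0);
    return weights / vectorDimension;
}
```

For example, with rowsPerBlock = 4, columnsPerBlock = 16, vectorDimension = 2 and indexBitwidth = 4, a block of 64 weights would be covered by 32 indices into a codebook of 16 two-weight tuples.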

Note

This quantization encoding must not be used with dynamically shaped tensors.

Public Members

Qnn_BwAxisScaleOffset_t bwAxisScaleOffset

Vector quantization can be thought of as per-channel quantization with specifically crafted weights and encoding parameters that allow for codebook generation. Each weight within the codebook is bwAxisScaleOffset.bitwidth bits wide.

uint32_t rowsPerBlock

Number of rows in the block of decoded weight coordinates.

uint32_t columnsPerBlock

Number of columns in the block of decoded weight coordinates.

uint8_t vectorDimension

The dimension of the vector encoding, e.g. 1D, 2D, 3D, ... for 1, 2, or 3 weights per index, respectively.
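To illustrate "weights per index", the following is a minimal lookup sketch. The function name, the entry-major codebook layout, and the one-index-per-byte input are assumptions made for illustration and do not describe the runtime's actual storage format. How the decoded weights are placed within the block (see vectorStride) is not modeled; they are simply written out contiguously.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decode helper (not part of the QNN API).
 * codebook: entries stored back to back, vectorDimension weights each
 *           (ASSUMED layout, with weights no wider than 8 bits).
 * indices:  numIndices codebook indices, ASSUMED already unpacked to
 *           one index per byte.
 * out:      receives numIndices * vectorDimension weights, contiguously.
 * Placement governed by vectorStride is NOT modeled here. */
static void vq_lookup(const uint8_t *codebook, uint8_t vectorDimension,
                      const uint8_t *indices, size_t numIndices,
                      uint8_t *out) {
    assert(codebook != NULL && indices != NULL && out != NULL);
    for (size_t i = 0; i < numIndices; ++i) {
        /* Each index selects one vectorDimension-tuple from the codebook. */
        const uint8_t *entry = codebook + (size_t)indices[i] * vectorDimension;
        for (uint8_t d = 0; d < vectorDimension; ++d) {
            out[i * vectorDimension + d] = entry[d];
        }
    }
}
```

With vectorDimension = 2, every index expands to two weights, so three indices produce six decoded weights.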

uint8_t vectorStride

A value describing how the weights from a given lookup will be unpacked.

uint8_t indexBitwidth

The bitwidth of each index into the codebook.
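As a rough illustration of how the members fit together, the sketch below fills in a local stand-in for the struct and estimates the bits needed for one block. The Example* types and functions are hypothetical and exist only so the snippet compiles without the SDK headers; only the bitwidth field of Qnn_BwAxisScaleOffset_t is mirrored, and its type here is chosen for illustration. The estimate assumes one codebook per block and one index per vectorDimension weights, and it ignores scales, offsets, padding and any packing rules the runtime may apply.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in: mirrors only the `bitwidth` field mentioned above. */
typedef struct {
    uint32_t bitwidth;
} ExampleBwAxisScaleOffset;

/* Local stand-in with the members documented on this page. */
typedef struct {
    ExampleBwAxisScaleOffset bwAxisScaleOffset;
    uint32_t rowsPerBlock;
    uint32_t columnsPerBlock;
    uint8_t vectorDimension;
    uint8_t vectorStride;
    uint8_t indexBitwidth;
} ExampleVectorEncoding;

/* Illustrative values only; vectorStride is a placeholder whose
 * semantics are not modeled in this sketch. */
static ExampleVectorEncoding example_encoding(void) {
    ExampleVectorEncoding e = {
        .bwAxisScaleOffset = {.bitwidth = 8},
        .rowsPerBlock = 4,
        .columnsPerBlock = 16,
        .vectorDimension = 2,
        .vectorStride = 1,
        .indexBitwidth = 4,
    };
    return e;
}

/* Rough estimate of bits per block: codebook bits plus index bits.
 * ASSUMPTIONS: one codebook per block, one index per vectorDimension
 * weights, no padding. */
static uint64_t example_bits_per_block(const ExampleVectorEncoding *e) {
    assert(e->vectorDimension != 0 && e->indexBitwidth < 32);
    uint64_t entries = (uint64_t)1 << e->indexBitwidth;
    uint64_t codebookBits =
        entries * e->vectorDimension * e->bwAxisScaleOffset.bitwidth;
    uint64_t weights = (uint64_t)e->rowsPerBlock * e->columnsPerBlock;
    uint64_t indexBits = (weights / e->vectorDimension) * e->indexBitwidth;
    return codebookBits + indexBits;
}
```

Under these assumptions the example block needs 256 codebook bits plus 128 index bits, 384 bits in total, compared with 512 bits for the same 64 weights stored directly at 8 bits each.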