Struct Qnn_VectorEncoding_t
Defined in File QnnTypes.h
Struct Documentation
-
struct Qnn_VectorEncoding_t
A struct to express vector quantization parameters.
Note
This quantization encoding is a specific case of per-channel quantization in which the weights and parameters are crafted to allow for compression and codebook generation. For each group of rowsPerBlock*columnsPerBlock weights, there will be 2^indexBitwidth unique vectorDimension-tuples of weights.
Note
This quantization encoding must not be used with dynamically shaped tensors.
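As a rough illustration of the counts in the first note, the sketch below computes the number of codebook entries for one block and the number of indices needed to cover it. The helper names are hypothetical and are not part of QnnTypes.h. The index count rests on an assumption that this page does not state outright: each index decodes to exactly vectorDimension weights, and the block size is a multiple of vectorDimension.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct vectorDimension-tuples in one block's codebook:
 * 2^indexBitwidth, as described in the note above. */
static uint32_t codebook_entries(uint8_t index_bitwidth) {
    assert(index_bitwidth < 32); /* keep the shift well defined */
    return UINT32_C(1) << index_bitwidth;
}

/* ASSUMPTION (not stated on this page): each index decodes to
 * vector_dimension weights, and the block size is a multiple of
 * vector_dimension. */
static uint32_t indices_per_block(uint32_t rows_per_block,
                                  uint32_t columns_per_block,
                                  uint8_t vector_dimension) {
    uint32_t weights = rows_per_block * columns_per_block;
    assert(vector_dimension != 0 && weights % vector_dimension == 0);
    return weights / vector_dimension;
}
```

For example, with indexBitwidth = 4 the codebook holds 16 tuples, and under the assumption above an 8x8 block with vectorDimension = 2 would be covered by 32 indices.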
Public Members
-
Qnn_BwAxisScaleOffset_t bwAxisScaleOffset
Vector quantization can be thought of as per-channel quantization with specifically crafted weights and encoding parameters that allow for codebook generation. Each weight within the codebook is bwAxisScaleOffset.bitwidth bits wide.
-
uint32_t rowsPerBlock
Number of rows in the block of decoded weight coordinates.
-
uint32_t columnsPerBlock
Number of columns in the block of decoded weight coordinates.
-
uint8_t vectorDimension
The dimension of the vector encoding, e.g., 1D, 2D, 3D, … for 1, 2, or 3 weights per index, respectively.
-
uint8_t vectorStride
A value describing how the weights from a given lookup will be unpacked.
-
uint8_t indexBitwidth
The bitwidth of each index into the codebook.
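The following sketch shows how the members documented above might fit together. It deliberately does not include QnnTypes.h: `Example_BwAxisScaleOffset` and `Example_VectorEncoding` are hypothetical stand-ins that mirror only the member names listed on this page, and the `uint32_t` type chosen for `bitwidth` is an assumption. All field values are made-up illustration values, and `vectorStride` is set to a placeholder because its exact semantics are not specified here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Qnn_BwAxisScaleOffset_t. Only the `bitwidth`
 * field named on this page is modeled; its type is an assumption. */
typedef struct {
    uint32_t bitwidth;
} Example_BwAxisScaleOffset;

/* Hypothetical mirror of the members documented for Qnn_VectorEncoding_t. */
typedef struct {
    Example_BwAxisScaleOffset bwAxisScaleOffset;
    uint32_t rowsPerBlock;
    uint32_t columnsPerBlock;
    uint8_t  vectorDimension;
    uint8_t  vectorStride;
    uint8_t  indexBitwidth;
} Example_VectorEncoding;

/* Build an encoding with illustrative values only. */
static Example_VectorEncoding example_encoding(void) {
    Example_VectorEncoding enc = {0};
    enc.bwAxisScaleOffset.bitwidth = 8; /* each codebook weight is 8 bits wide */
    enc.rowsPerBlock    = 4;
    enc.columnsPerBlock = 16;
    enc.vectorDimension = 2;            /* 2 weights per index */
    enc.vectorStride    = 1;            /* placeholder; semantics not covered here */
    enc.indexBitwidth   = 4;            /* 2^4 = 16 codebook entries */
    return enc;
}

/* Bits occupied by one block's codebook:
 * 2^indexBitwidth tuples x vectorDimension weights x bitwidth bits each. */
static uint64_t example_codebook_bits(const Example_VectorEncoding *enc) {
    assert(enc->indexBitwidth < 32);
    uint64_t entries = UINT64_C(1) << enc->indexBitwidth;
    return entries * enc->vectorDimension * enc->bwAxisScaleOffset.bitwidth;
}
```

With the illustrative values above, one block's codebook would occupy 16 x 2 x 8 = 256 bits. Real code should populate the actual `Qnn_VectorEncoding_t` from QnnTypes.h rather than these stand-in types.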