18330990. ADAPTERS FOR QUANTIZATION simplified abstract (QUALCOMM Incorporated)
ADAPTERS FOR QUANTIZATION
Organization Name
QUALCOMM Incorporated
Inventor(s)
Markus Nagel of Amsterdam (NL)
Chirag Sureshbhai Patel of San Diego, CA (US)
ADAPTERS FOR QUANTIZATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 18330990, titled 'ADAPTERS FOR QUANTIZATION'.
Simplified Explanation
The abstract describes a processor-implemented method for adaptive quantization in an artificial neural network (ANN). The method incorporates a quantization module between a first and a second linear layer of the ANN to generate an adapted ANN model. The quantization module scales the first linear layer's weights and biases by a learnable parameter and scales the second linear layer's weights by the inverse of that parameter.
- The method is used to adaptively quantize an ANN model with multiple channels of target activations.
- A quantization module is inserted between the first and second linear layers of the ANN.
- The quantization module scales the weights and biases of the first linear layer based on a learnable parameter.
- The quantization module scales the weights of the second linear layer based on the inverse of the learnable parameter.
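The rescaling in the bullets above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the patented implementation: the layer shapes, the ReLU between the layers, and the per-channel scale vector `s` are assumptions. For positive scales, multiplying the first layer's weights and biases by `s` and the second layer's weights by `1/s` leaves the network's output unchanged, since ReLU(s·z) = s·ReLU(z) for s > 0 — presumably what allows the scale to be learned without altering the function the network computes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two consecutive linear layers (hypothetical shapes).
d_in, d_hid, d_out = 4, 8, 3
W1 = rng.standard_normal((d_hid, d_in))
b1 = rng.standard_normal(d_hid)
W2 = rng.standard_normal((d_out, d_hid))

def forward(x, W1, b1, W2):
    # ReLU between the layers; its positive homogeneity is what
    # makes the rescaling below function-preserving.
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h

# Learnable per-channel scale (one positive entry per hidden channel).
s = rng.uniform(0.5, 2.0, size=d_hid)

# Scale the first layer's weights and biases by s ...
W1_s = s[:, None] * W1
b1_s = s * b1
# ... and the second layer's weights by 1/s.
W2_s = W2 / s[None, :]

# The adapted model computes the same function as the original.
x = rng.standard_normal(d_in)
assert np.allclose(forward(x, W1, b1, W2), forward(x, W1_s, b1_s, W2_s))
```

Because the output is unchanged, the scale can be chosen (or learned) purely to make the scaled weights and activations easier to quantize.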
Potential Applications
- Artificial neural networks (ANNs) in various fields such as computer vision, natural language processing, and speech recognition.
- Optimization of ANN models for efficient computation and memory usage.
- Improving the performance and accuracy of ANNs in resource-constrained environments.
Problems Solved
- Adaptively quantizing an ANN model with multiple channels of target activations.
- Optimizing the weights and biases of linear layers in an ANN using a learnable quantization module parameter.
- Addressing the challenges of computation and memory usage in ANNs.
Benefits
- Improved efficiency in computation and memory usage of ANNs.
- Enhanced performance and accuracy of ANNs in resource-constrained environments.
- Flexibility in adapting the quantization of ANN models based on learnable parameters.
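A small NumPy experiment suggests where the efficiency benefit can come from. The abstract does not say how the scale is chosen; the closed-form rule below, s_i = sqrt(r2_i / r1_i), which equalizes per-channel weight ranges across the two layers, is borrowed from published cross-layer equalization work and is only an assumed stand-in for the patent's learnable parameter. When one output channel of the first layer is much larger than the rest, per-tensor quantization wastes precision on the small channels; rescaling evens out the ranges:

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize(w, n_bits=8):
    # Symmetric per-tensor quantization: a single step size for the tensor.
    step = np.abs(w).max() / (2 ** (n_bits - 1) - 1)
    return np.round(w / step) * step

def rowwise_rel_err(w):
    # Mean relative quantization error per output channel (row).
    wq = quantize(w)
    return np.mean(np.linalg.norm(wq - w, axis=1) / np.linalg.norm(w, axis=1))

# First linear layer with one dominant output channel (rows = channels).
W1 = rng.uniform(-0.1, 0.1, size=(4, 4))
W1[3] *= 100.0  # outlier channel forces a coarse per-tensor step
W2 = rng.uniform(-1.0, 1.0, size=(3, 4))

# Per-channel ranges: row i of W1 feeds column i of W2.
r1 = np.abs(W1).max(axis=1)
r2 = np.abs(W2).max(axis=0)

# Assumed equalization rule (cross-layer equalization, not the patent's):
s = np.sqrt(r2 / r1)
W1_eq = s[:, None] * W1   # first layer's weights scaled by s
W2_eq = W2 / s[None, :]   # second layer's weights scaled by 1/s
# (Per the abstract, the first layer's biases would be scaled by s too.)

# The small channels of W1 are now represented far more accurately;
# W2's step grows in exchange, and the sqrt rule balances the two layers.
assert rowwise_rel_err(W1_eq) < rowwise_rel_err(W1)
```

After the rescaling, each channel's range in the first layer exactly matches the corresponding column range in the second layer, so neither layer's quantization grid is dominated by a single outlier channel.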
Original Abstract Submitted
A processor-implemented method for adaptive quantization in an artificial neural network (ANN) includes receiving an ANN model. The ANN model has multiple channels of target activations. A quantization module is incorporated between a first linear layer of the ANN and a second linear layer of the ANN to generate an adapted ANN. The quantization module scales a first set of weights and biases of the first linear layer based on a learnable quantization module parameter and scales a second set of weights of the second linear layer based on an inverse of the learnable quantization module parameter.