18330990. ADAPTERS FOR QUANTIZATION simplified abstract (QUALCOMM Incorporated)


ADAPTERS FOR QUANTIZATION

Organization Name

QUALCOMM Incorporated

Inventor(s)

Minseop Park of Young-in (KR)

Jaeseong You of Seoul (KR)

Simyung Chang of Suwon (KR)

Markus Nagel of Amsterdam (NL)

Chirag Sureshbhai Patel of San Diego CA (US)

ADAPTERS FOR QUANTIZATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 18330990, titled 'ADAPTERS FOR QUANTIZATION'.

Simplified Explanation

The abstract describes a processor-implemented method for adaptive quantization in an artificial neural network (ANN). The method incorporates a quantization module between two linear layers of the ANN to generate an adapted ANN model. The quantization module scales the weights and biases of the first linear layer by a learnable parameter and scales the weights of the second linear layer by the inverse of that parameter (see the derivation after the list below and the code sketch at the end of this page).

  • The method is used to adaptively quantize an ANN model with multiple channels of target activations.
  • A quantization module is inserted between the first and second linear layers of the ANN.
  • The quantization module scales the weights and biases of the first linear layer based on a learnable parameter.
  • The quantization module scales the weights of the second linear layer based on the inverse of the learnable parameter.
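
The abstract does not spell out why the paired scalings are safe, but the standard algebra behind this kind of cross-layer scaling (an inference here, not a claim made in the patent) is that the two scalings cancel. With S = diag(s_1, ..., s_C) holding the learnable per-channel parameters,

  (W_2 S^{-1})(S W_1 x + S b_1) = W_2 (W_1 x + b_1),

so the adapted ANN computes the same function as the original while the learned s_i reshape the per-channel activation ranges seen by the quantizer. If a ReLU sits between the two layers, the identity still holds for s_i > 0, since ReLU(s z) = s ReLU(z) for positive s.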

Potential Applications

  • Quantized deployment of artificial neural networks (ANNs) in fields such as computer vision, natural language processing, and speech recognition.
  • Optimization of ANN models for efficient computation and memory usage.
  • Improving the performance and accuracy of ANNs in resource-constrained environments.

Problems Solved

  • Adaptively quantizing an ANN model with multiple channels of target activations.
  • Optimizing the weights and biases of linear layers in an ANN using a learnable quantization module parameter.
  • Addressing the challenges of computation and memory usage in ANNs.

Benefits

  • Improved efficiency in computation and memory usage of ANNs.
  • Enhanced performance and accuracy of ANNs in resource-constrained environments.
  • Flexibility in adapting the quantization of ANN models based on learnable parameters.


Original Abstract Submitted

A processor-implemented method for adaptive quantization in an artificial neural network (ANN) includes receiving an ANN model. The ANN model has multiple channels of target activations. A quantization module is incorporated between a first linear layer of the ANN and a second linear layer of the ANN to generate an adapted ANN. The quantization module scales a first set of weights and biases of the first linear layer based on a learnable quantization module parameter and scales a second set of weights of the second linear layer based on an inverse of the learnable quantization module parameter.
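
A minimal sketch of how such an adapter could be implemented, assuming PyTorch and assuming a ReLU between the two linear layers (the abstract does not name the activation; positive per-channel scales commute with ReLU). The class name QuantAdapterPair and the log-parameterized scale are illustrative choices, not details from the patent.

import torch
import torch.nn as nn
import torch.nn.functional as F

class QuantAdapterPair(nn.Module):
    """Hypothetical sketch: a learnable per-channel scale s is folded into
    the first linear layer's weights and bias, and its inverse into the
    second layer's weights, leaving the composed function unchanged."""

    def __init__(self, first: nn.Linear, second: nn.Linear):
        super().__init__()
        assert first.out_features == second.in_features
        self.first, self.second = first, second
        # The learnable quantization-module parameter: one scale per channel.
        self.log_s = nn.Parameter(torch.zeros(first.out_features))  # s = exp(log_s) > 0

    def forward(self, x):
        s = self.log_s.exp()
        # Scaling row i of W1 and element i of b1 by s_i scales output channel i by s_i.
        bias = self.first.bias * s if self.first.bias is not None else None
        h = F.linear(x, self.first.weight * s[:, None], bias)
        h = torch.relu(h)  # positive per-channel scales commute with ReLU
        # A real adapter would fake-quantize h here; omitted for brevity.
        # The second layer compensates: column i of W2 is divided by s_i.
        return F.linear(h, self.second.weight / s[None, :], self.second.bias)

A quick check that the adapted pair reproduces the original composition for arbitrary scales:

f1, f2 = nn.Linear(8, 16), nn.Linear(16, 4)
pair = QuantAdapterPair(f1, f2)
with torch.no_grad():
    pair.log_s.uniform_(-1.0, 1.0)  # arbitrary positive scales
x = torch.randn(3, 8)
assert torch.allclose(pair(x), f2(torch.relu(f1(x))), atol=1e-5)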