20240046086. QUANTIZATION METHOD AND QUANTIZATION APPARATUS FOR WEIGHT OF NEURAL NETWORK, AND STORAGE MEDIUM simplified abstract (TSINGHUA UNIVERSITY)

QUANTIZATION METHOD AND QUANTIZATION APPARATUS FOR WEIGHT OF NEURAL NETWORK, AND STORAGE MEDIUM

Organization Name

TSINGHUA UNIVERSITY

Inventor(s)

Huaqiang Wu of Beijing (CN)

Qingtian Zhang of Beijing (CN)

Lingjun Dai of Beijing (CN)

QUANTIZATION METHOD AND QUANTIZATION APPARATUS FOR WEIGHT OF NEURAL NETWORK, AND STORAGE MEDIUM - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240046086, titled 'QUANTIZATION METHOD AND QUANTIZATION APPARATUS FOR WEIGHT OF NEURAL NETWORK, AND STORAGE MEDIUM'.

Simplified Explanation

The disclosed patent application describes a quantization method and apparatus for the weights of a neural network implemented on a crossbar-enabled analog computing-in-memory (CACIM) system. The method acquires the distribution characteristic of the weights and, based on that characteristic, determines an initial quantization parameter that reduces the quantization error. This yields better model accuracy at the same mapping overhead, or a smaller mapping overhead at the same accuracy.

  • The method does not rely on a pre-defined quantization scheme; instead, it derives the quantization parameter from the distribution characteristic of the weights (a minimal sketch follows this list).
  • Choosing the parameter this way reduces the error incurred when quantizing the weights, preserving the accuracy of the neural network model.
  • As a result, the model achieves better accuracy at the same mapping overhead, or needs a smaller mapping overhead for the same accuracy.
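
To make this concrete, below is a minimal sketch in Python/NumPy of one plausible reading of the method. The patent does not disclose the quantizer form or the search procedure, so the uniform quantizer, the n_levels parameter (standing in for the number of distinguishable conductance states of a crossbar cell), and the helper names quantize and initial_quant_param are illustrative assumptions, not the patented algorithm: the sketch simply picks the clipping scale that minimizes quantization error for the observed weight distribution.

  import numpy as np

  def quantize(w, scale, n_levels=16):
      # Uniform quantizer: clip to [-scale, scale] and snap to n_levels
      # evenly spaced values. n_levels stands in for the number of
      # programmable conductance states of a crossbar cell (assumption).
      step = 2 * scale / (n_levels - 1)
      idx = np.round((np.clip(w, -scale, scale) + scale) / step)  # 0 .. n_levels-1
      return idx * step - scale

  def initial_quant_param(weights, n_levels=16, n_candidates=100):
      # Derive an initial quantization parameter from the weight
      # distribution: try candidate clipping scales and keep the one
      # with the lowest quantization mean-squared error.
      w = weights.ravel()
      candidates = np.linspace(0.1, 1.0, n_candidates) * np.abs(w).max()
      errors = [np.mean((w - quantize(w, s, n_levels)) ** 2) for s in candidates]
      return candidates[int(np.argmin(errors))]

  # Example: for heavy-tailed weights, the best scale sits well below max|w|.
  rng = np.random.default_rng(0)
  weights = rng.laplace(scale=0.05, size=10_000)  # Laplace-like weight distribution
  scale = initial_quant_param(weights)
  print(f"chosen scale: {scale:.4f}, max|w|: {np.abs(weights).max():.4f}")

Because the scale is chosen from the observed distribution rather than fixed in advance, heavy-tailed weight distributions (common in trained networks) end up with a tighter clipping range and a finer quantization step for the bulk of the weights.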

Potential applications of this technology:

  • Artificial intelligence and machine learning: The quantization method can be applied to neural networks used in various AI and ML applications, such as image recognition, natural language processing, and autonomous systems.
  • Edge computing: Implementing the quantization method on edge devices can enable efficient and low-power neural network processing, making it suitable for applications where real-time decision-making is required.

Problems solved by this technology:

  • Quantization errors: Reducing the precision of neural network weights introduces quantization errors; by tuning the quantization parameter to the weight distribution, the method keeps these errors small and preserves model accuracy.
  • Mapping overhead: The method reduces the overhead of mapping quantized weights onto the crossbar hardware, improving efficiency and lowering computational requirements for a given level of model accuracy.

Benefits of this technology:

  • Improved neural network performance: Lower quantization error translates into higher accuracy for the quantized model at the same hardware cost.
  • Reduced computational requirements: For the same model accuracy, the distribution-aware parameter choice allows a smaller mapping overhead, reducing the hardware resources needed for neural network processing.


Original Abstract Submitted

Disclosed are a quantization method and quantization apparatus for a weight of a neural network, and a storage medium. The neural network is implemented on the basis of a crossbar-enabled analog computing-in-memory (CACIM) system, and the quantization method includes: acquiring a distribution characteristic of a weight; and determining, according to the distribution characteristic of the weight, an initial quantization parameter for quantizing the weight to reduce a quantization error in quantizing the weight. The quantization method provided by the embodiments of the present disclosure does not pre-define the quantization method used, but determines the quantization parameter used for quantizing the weight according to the distribution characteristic of the weight to reduce the quantization error, so that the effect of the neural network model is better under the same mapping overhead, and the mapping overhead is smaller under the same effect of the neural network model.
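
As a rough illustration of the trade-off claimed in the abstract (better effect under the same mapping overhead), the sketch below compares a fixed full-range uniform quantizer against a distribution-adapted clipping scale at the same number of levels. The quantizer form and the 16-level count are assumptions for illustration, not details taken from the patent.

  import numpy as np

  def uniform_quant(w, scale, n_levels=16):
      # Same illustrative uniform quantizer as above:
      # n_levels values evenly spaced in [-scale, scale].
      step = 2 * scale / (n_levels - 1)
      idx = np.round((np.clip(w, -scale, scale) + scale) / step)
      return idx * step - scale

  rng = np.random.default_rng(0)
  w = rng.laplace(scale=0.05, size=100_000)  # long-tailed weights, as in many trained nets

  # Pre-defined quantizer: clipping range fixed at max|w|.
  mse_fixed = np.mean((w - uniform_quant(w, np.abs(w).max())) ** 2)

  # Distribution-adapted quantizer: search for the MSE-minimizing scale.
  scales = np.linspace(0.05, 1.0, 200) * np.abs(w).max()
  mse_adapted = min(np.mean((w - uniform_quant(w, s)) ** 2) for s in scales)

  print(f"fixed-range MSE:   {mse_fixed:.3e}")
  print(f"adapted-scale MSE: {mse_adapted:.3e}")  # lower at the same number of levels

Both quantizers use 16 levels, i.e. the same "mapping overhead" in this toy setting, yet the distribution-adapted scale yields a smaller quantization error, mirroring the abstract's claim.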