METHOD AND APPARATUS FOR NEURAL NETWORK QUANTIZATION

Organization Name

Inventor(s)

METHOD AND APPARATUS FOR NEURAL NETWORK QUANTIZATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240185029 titled 'METHOD AND APPARATUS FOR NEURAL NETWORK QUANTIZATION'.

Simplified Explanation

The abstract describes a method and apparatus for neural network quantization, where a quantized neural network is generated by analyzing weight differences, determining layers to be quantized with lower-bit precision, and generating a second neural network with the quantized layers.

A first neural network is trained, and for each layer the difference between the initial weight and the updated weight is obtained in each learning cycle.
A statistic of the weight differences is analyzed for each layer.
Based on the analyzed statistic, one or more layers are selected to be quantized with a lower-bit precision.
A second neural network is generated by quantizing the selected layers with the lower-bit precision.
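
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method claimed in the application: the abstract does not say which statistic is used or how layers are chosen from it, so the mean squared weight difference, the "smallest statistic first" selection rule, the symmetric uniform quantizer, and all function and layer names here are hypothetical.

```python
import statistics

def weight_differences(initial, updated):
    """Per-layer element-wise differences between updated and initial weights."""
    return {name: [u - i for i, u in zip(initial[name], updated[name])]
            for name in initial}

def layer_statistic(diffs):
    """Assumed statistic: mean of the squared weight differences of one layer."""
    return statistics.fmean([d * d for d in diffs])

def select_layers(stats, num_layers):
    """Assumed rule: choose the layers whose statistic is smallest."""
    return sorted(stats, key=stats.get)[:num_layers]

def quantize(weights, bits):
    """Symmetric uniform quantization; returns dequantized float values."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]

def build_second_network(updated, selected, bits=4):
    """Quantize only the selected layers; copy the rest unchanged."""
    return {name: quantize(w, bits) if name in selected else list(w)
            for name, w in updated.items()}

# Toy "first network": two layers, weights before and after one learning cycle.
initial = {"conv1": [0.10, -0.20, 0.30], "fc": [0.50, -0.40, 0.20]}
updated = {"conv1": [0.60, -0.90, 0.80], "fc": [0.51, -0.41, 0.19]}

diffs = weight_differences(initial, updated)
stats = {name: layer_statistic(d) for name, d in diffs.items()}
selected = select_layers(stats, num_layers=1)
second_network = build_second_network(updated, selected, bits=4)
```

In this toy data the "fc" layer barely changes during the cycle, so it is the one picked for 4-bit quantization, while "conv1" keeps its original weights. A real system would operate on tensors across many cycles and would likely store integer codes plus a scale rather than dequantized floats.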

Potential Applications

The technology can be applied in various fields such as:

Edge computing
Internet of Things (IoT) devices
Mobile applications

Problems Solved

This technology addresses the following issues:

Reducing memory and computational requirements of neural networks
Improving efficiency and speed of neural network operations

Benefits

The benefits of this technology include:

Optimizing neural network performance
Enhancing energy efficiency in neural network applications

Potential Commercial Applications

The technology can be commercially applied in:

Autonomous vehicles
Healthcare diagnostics
Speech recognition systems

Possible Prior Art

Possible prior art for this technology includes:

Research papers on neural network quantization techniques from academic institutions.

What are the potential limitations of this technology?

Potential limitations of this technology may include:

Loss of precision in quantized layers leading to reduced accuracy.
Complexity in determining the optimal layers for quantization.
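
The first limitation can be made concrete with a small experiment. The sketch below assumes a symmetric uniform quantizer and randomly generated weights, neither of which comes from the application, and measures how the average rounding error grows as the bit width shrinks.

```python
import random

def quantize(weights, bits):
    """Symmetric uniform quantization; returns dequantized float values."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]

def mean_abs_error(weights, bits):
    """Average absolute difference between original and quantized weights."""
    quantized = quantize(weights, bits)
    return sum(abs(a - b) for a, b in zip(weights, quantized)) / len(weights)

# Hypothetical layer: 1000 weights drawn uniformly from [-1, 1].
rng = random.Random(0)
weights = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

errors = {bits: mean_abs_error(weights, bits) for bits in (8, 4, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit mean absolute error: {err:.4f}")
```

The error rises sharply as the bit width drops, which is why choosing which layers can tolerate lower precision matters for the accuracy of the second network.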

How does this technology compare to existing neural network quantization methods?

Based on the abstract, this technology is distinguished by:

Utilizing statistical analysis of weight differences to determine layers for quantization.
Generating a second neural network with quantized layers based on the analyzed statistics.

Original Abstract Submitted

According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.

Samsung Electronics Co., Ltd. (20240185029). METHOD AND APPARATUS FOR NEURAL NETWORK QUANTIZATION simplified abstract
