18532795. INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION simplified abstract (Intel Corporation)
Contents
- 1 INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION
- 1.1 Organization Name
- 1.2 Inventor(s)
- 1.3 INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION - A simplified explanation of the abstract
- 1.4 Simplified Explanation
- 1.5 Potential Applications
- 1.6 Problems Solved
- 1.7 Benefits
- 1.8 Potential Commercial Applications
- 1.9 Possible Prior Art
- 1.10 Original Abstract Submitted
INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION
Organization Name
Intel Corporation
Inventor(s)
Abhisek Kundu of Bangalore (IN)
Naveen Mellempudi of Bangalore (IN)
Dheevatsa Mudigere of Bangalore (IN)
INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 18532795 titled 'INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION'.
Simplified Explanation
The abstract describes a patent application for a technology that determines per-layer scale factors for the tensor data in a neural network and converts that data to an 8-bit datatype for processing.
- The technology involves determining per-layer scale factors for tensor data in a neural network model.
- It converts the tensor data from floating point to an 8-bit datatype.
- The converted tensor data is used to generate an output tensor based on the scale factors.
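The steps above can be sketched in a few lines. This is a minimal illustration assuming symmetric max-abs quantization onto the signed 8-bit range; the application does not fix a specific scaling scheme, and `quantize_layer` and the sample values are hypothetical:

```python
import numpy as np

def quantize_layer(tensor: np.ndarray):
    """Quantize a float tensor to int8 with a per-layer scale factor.

    Illustrative symmetric max-abs scheme: the largest magnitude in the
    layer is mapped onto the int8 limit of 127.
    """
    scale = np.max(np.abs(tensor)) / 127.0
    if scale == 0.0:
        scale = 1.0  # avoid division by zero for an all-zero tensor
    q = np.clip(np.round(tensor / scale), -127, 127).astype(np.int8)
    return q, scale

# Quantize one layer's values, then reconstruct them via the scale factor.
weights = np.array([0.5, -1.27, 0.02, 1.0], dtype=np.float32)
q, scale = quantize_layer(weights)
approx = q.astype(np.float32) * scale  # dequantized approximation
```

The scale factor is stored per layer, so the int8 tensor plus one float is enough to approximately recover the original values.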
Potential Applications
This technology could be applied in various fields such as image recognition, natural language processing, and autonomous driving systems.
Problems Solved
This technology addresses the challenge of optimizing neural network processing by efficiently converting tensor data to a more compact datatype.
Benefits
The benefits of this technology include improved computational efficiency, reduced memory usage, and potentially faster processing speeds for neural networks.
Potential Commercial Applications
A potential commercial application of this technology could be in the development of edge computing devices for real-time AI applications.
Possible Prior Art
One possible prior art for this technology could be research on optimizing neural network processing through datatype conversion and scaling techniques.
What are the specific operations involved in determining the per-layer scale factor for tensor data?
The specific operations include analyzing the distribution of values within each layer, calculating a scale factor that maps the observed range onto the representable range of the target datatype (e.g., ±127 for a signed 8-bit type), and applying that factor when converting the tensor data.
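One hedged example of "analyzing the data distribution": rather than the raw maximum, a high percentile of the absolute values can set the scale so that a few outliers do not waste the 8-bit range. The `percentile_scale` helper and the 99.9% cutoff are illustrative choices, not something the application mandates:

```python
import numpy as np

def percentile_scale(tensor: np.ndarray, pct: float = 99.9) -> float:
    """Derive a per-layer scale from the value distribution.

    Clips at a high percentile of |x| so rare outliers do not dominate
    the int8 range; values beyond the threshold saturate instead.
    """
    threshold = float(np.percentile(np.abs(tensor), pct))
    return threshold / 127.0 if threshold > 0 else 1.0

# A roughly normal activation tensor with one large outlier:
acts = np.concatenate([np.random.default_rng(0).normal(0.0, 1.0, 10_000),
                       np.array([50.0])])
s_max = float(np.max(np.abs(acts))) / 127.0  # dominated by the outlier
s_pct = percentile_scale(acts)               # tracks the bulk of the data
```

With the percentile-based scale, the bulk of the distribution uses far more of the 8-bit resolution than with the max-based scale.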
How does converting tensor data to an 8-bit datatype impact the overall performance of the neural network model?
Converting tensor data to an 8-bit datatype cuts each tensor's memory footprint to a quarter of its 32-bit floating-point size and improves computational efficiency, since hardware can fetch and process more elements per operation. This can improve overall speed and resource utilization, typically at the cost of a small loss in numerical precision.
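The memory claim is easy to verify directly; the tensor shape here is arbitrary and only serves the comparison:

```python
import numpy as np

# Footprint of the same tensor shape in FP32 vs. int8.
t_fp32 = np.zeros((1024, 1024), dtype=np.float32)
t_int8 = np.zeros((1024, 1024), dtype=np.int8)

ratio = t_fp32.nbytes / t_int8.nbytes  # int8 needs a quarter of the bytes
```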
Original Abstract Submitted
One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
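The claimed flow (determine a per-layer scale, convert to 8-bit, generate the output using the converted data and the scale) might look roughly like the sketch below. This is not the patented implementation; `int8_layer_forward`, the int32 accumulation, and the sample values are assumptions chosen to make the flow concrete:

```python
import numpy as np

def _scale(t: np.ndarray) -> float:
    # Per-tensor scale mapping the largest magnitude onto +/-127.
    m = float(np.max(np.abs(t)))
    return m / 127.0 if m > 0 else 1.0

def int8_layer_forward(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One dense layer run in int8, rescaled back to float.

    Sketch only: activations and weights each get a scale factor, the
    matmul accumulates in int32, and the product of the two scales maps
    the integer result back to floating point.
    """
    sx, sw = _scale(x), _scale(w)
    qx = np.clip(np.round(x / sx), -127, 127).astype(np.int8)
    qw = np.clip(np.round(w / sw), -127, 127).astype(np.int8)
    acc = qx.astype(np.int32) @ qw.astype(np.int32)  # int32 accumulation
    return acc.astype(np.float32) * np.float32(sx * sw)  # apply scale factors

x = np.array([[1.0, -0.4]], dtype=np.float32)
w = np.array([[0.2], [0.5]], dtype=np.float32)
y = int8_layer_forward(x, w)  # approximates the fp32 result 1.0*0.2 + (-0.4)*0.5
```

Accumulating in a wider integer type before rescaling is the standard way to keep the int8 matmul from overflowing, which is why the scale factors are applied to the output rather than to each operand's product.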