Intel Corporation (20240160931). INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION simplified abstract
Contents
- 1 INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION
- 1.1 Organization Name
- 1.2 Inventor(s)
- 1.3 INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION - A simplified explanation of the abstract
- 1.4 Simplified Explanation
- 1.5 Potential Applications
- 1.6 Problems Solved
- 1.7 Benefits
- 1.8 Potential Commercial Applications
- 1.9 Possible Prior Art
- 1.10 Unanswered Questions
- 1.11 Original Abstract Submitted
INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION
Organization Name
Intel Corporation
Inventor(s)
Abhisek Kundu of Bangalore (IN)
Naveen Mellempudi of Bangalore (IN)
Dheevatsa Mudigere of Bangalore (IN)
INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240160931, titled 'INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION'.
Simplified Explanation
The patent application describes a method for optimizing neural network models by converting tensor data from a floating point datatype to a more compact 8-bit datatype and applying per-layer scale factors.
- The method involves determining a per-layer scale factor for tensor data associated with layers of a neural network model.
- The tensor data is then converted from a floating point datatype to an 8-bit datatype.
- The converted tensor data is used to generate an output tensor based on the per-layer scale factor.
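The steps above can be sketched in a minimal NumPy example. This is an illustrative assumption of how per-layer scaling and int8 conversion might work (symmetric quantization with a max-absolute-value scale); the patent itself does not specify these particular formulas, and the function names are hypothetical.

```python
import numpy as np

def quantize_per_layer(tensor):
    """Determine a per-layer scale factor and convert a float tensor to int8.

    Assumes symmetric quantization: one scale per layer, chosen so the
    largest-magnitude value maps to 127. Assumes the tensor is nonzero.
    """
    scale = np.abs(tensor).max() / 127.0          # per-layer scale factor
    q = np.clip(np.round(tensor / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 data and its scale."""
    return q.astype(np.float32) * scale

# Example: quantize one layer's weights, then reconstruct them.
rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_per_layer(weights)
recon = dequantize(q, scale)
```

With this scheme the reconstruction error per element is bounded by half the scale factor, which is why a per-layer (rather than global) scale tends to preserve accuracy while cutting memory use by 4x versus float32.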
Potential Applications
This technology could be applied in various fields such as image recognition, natural language processing, and autonomous vehicles.
Problems Solved
This technology addresses the problem of optimizing neural network models for efficient processing and memory usage.
Benefits
The benefits of this technology include faster inference times, reduced memory footprint, and improved energy efficiency in neural network applications.
Potential Commercial Applications
One potential commercial application of this technology is in edge computing devices where resources are limited but real-time processing is required.
Possible Prior Art
Prior art may include similar techniques for optimizing neural network models through quantization and scaling of tensor data.
Unanswered Questions
1. How does this method compare to other techniques for optimizing neural network models in terms of accuracy and efficiency?
2. Are there any limitations or drawbacks to converting tensor data to an 8-bit datatype that need to be considered in practical applications?
Original Abstract Submitted
One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.