17960947. SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS simplified abstract (Intel Corporation)
SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS
Organization Name
Intel Corporation
Inventor(s)
Naveen Mellempudi of Bangalore (IN)
A simplified explanation of the abstract
This abstract first appeared for US patent application 17960947, titled 'SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS'.
Simplified Explanation
The abstract describes a graphics processor with a single instruction, multiple thread (SIMT) architecture that includes hardware multithreading. The multiprocessor can execute parallel threads of instructions and includes a set of functional units, among them a mixed precision tensor processor, to perform tensor computations and generate loss data. The loss data is stored as a first floating-point data type and scaled by a scaling factor so that the data distribution of a gradient tensor derived from the loss can be represented by a second floating-point data type.
- The graphics processor has a single instruction, multiple thread (SIMT) architecture with hardware multithreading.
- It can execute parallel threads of instructions from a command stream.
- The processor includes functional units, including a mixed precision tensor processor.
- The tensor processor performs tensor computations and generates loss data.
- The loss data is stored as a first floating-point data type.
- The loss data is scaled by a scaling factor so that the data distribution of the gradient tensor generated from it can be represented by a second floating-point data type.
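The scaling step above addresses a well-known issue in half-precision training: gradient values smaller than the narrow format can represent underflow to zero. The sketch below illustrates the idea with NumPy; the specific gradient values and the scale of 2^16 are illustrative choices, not taken from the patent.

```python
import numpy as np

# Illustrative sketch of loss scaling (values and scale factor are
# assumptions for the demo, not from the patent). A gradient component
# below fp16's smallest subnormal (~6e-8) is lost in a straight cast,
# but survives if the loss -- and hence, by the chain rule, every
# gradient -- is multiplied by a scaling factor first.

grad = np.array([2e-8, 5e-5, 1e-3])  # float64 "true" gradients

# Without scaling: the smallest component underflows to 0 in fp16.
unscaled = grad.astype(np.float16)

# With scaling: shift values into fp16's representable range, then
# divide the scale back out after converting to a wider type.
scale = 2.0 ** 16
scaled = (grad * scale).astype(np.float16)
recovered = scaled.astype(np.float64) / scale

print(unscaled[0])   # underflowed
print(recovered[0])  # close to the true 2e-8
```

Dividing the scale back out in a wider type (here float64) is what lets the small value reappear; keeping everything in fp16 end to end would lose it again.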
Potential Applications
- Graphics processing for gaming and virtual reality applications.
- Machine learning and deep learning tasks that involve tensor computations.
- Scientific simulations and data analysis that require high-performance computing.
Problems Solved
- Efficient execution of parallel threads of instructions.
- Accelerated tensor computations for machine learning and deep learning tasks.
- Representing the data distribution of a gradient tensor in a narrower floating-point format without losing small values.
Benefits
- Faster and more efficient graphics processing.
- Improved performance in machine learning and deep learning tasks.
- Enhanced accuracy in representing data distribution in gradient tensors.
Original Abstract Submitted
A graphics processor is described that includes a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The multiprocessor can execute parallel threads of instructions associated with a command stream, where the multiprocessor includes a set of functional units to execute at least one of the parallel threads of the instructions. The set of functional units can include a mixed precision tensor processor to perform tensor computations to generate loss data. The loss data is stored as a first floating-point data type and scaled by a scaling factor to enable a data distribution of a gradient tensor generated based on the loss data to be represented by a second floating point data type.
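The abstract leaves open how the scaling factor is chosen to track the gradient tensor's data distribution. One conventional approach is dynamic loss scaling: grow the scale while gradients stay finite, and back off when an overflow appears. The class below is a hedged sketch of that idea in plain Python; the update rule, constants, and names are common conventions, not details from the patent.

```python
import math

# Hedged sketch of dynamic loss scaling (constants and update rule are
# illustrative conventions, not taken from the patent).

class LossScaler:
    def __init__(self, init_scale=2.0 ** 16, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0

    def scale_loss(self, loss):
        # Multiply the loss before backpropagation so the resulting
        # gradients land in the narrow type's representable range.
        return loss * self.scale

    def update(self, grads):
        # On overflow (inf/NaN in the gradients), back off the scale
        # and report that the optimizer step should be skipped;
        # otherwise grow the scale after enough clean steps.
        if any(math.isinf(g) or math.isnan(g) for g in grads):
            self.scale *= self.backoff_factor
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps >= self.growth_interval:
            self.scale *= self.growth_factor
            self._good_steps = 0
        return True

scaler = LossScaler()
ok = scaler.update([0.1, float("inf")])
print(ok, scaler.scale)  # overflow halves the scale
```

In practice this logic lives alongside the mixed precision tensor units the abstract describes: the loss is scaled in the first (wider) floating-point type, and the backoff/growth loop keeps the gradients representable in the second (narrower) one.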