17960947. SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS simplified abstract (Intel Corporation)

SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS

Organization Name

Intel Corporation

Inventor(s)

Naveen Mellempudi of Bangalore (IN)

Dipankar Das of Pune (IN)

SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 17960947 titled 'SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS'.

Simplified Explanation

The abstract describes a graphics processor with a single instruction, multiple thread (SIMT) architecture that includes hardware multithreading. The processor can execute parallel threads of instructions and includes a set of functional units, among them a mixed precision tensor processor, to perform tensor computations and generate loss data. The loss data is stored as a first floating-point data type and scaled by a scaling factor so that the data distribution of a gradient tensor derived from the loss data can be represented by a second floating-point data type.

  • The graphics processor has a single instruction, multiple thread (SIMT) architecture with hardware multithreading.
  • It can execute parallel threads of instructions from a command stream.
  • The processor includes a set of functional units, among them a mixed precision tensor processor.
  • The tensor processor performs tensor computations and generates loss data.
  • The loss data is stored as a first floating-point data type.
  • The loss data is scaled by a scaling factor so that the data distribution of the gradient tensor derived from it can be represented by a second floating-point data type (a sketch of this loss-scaling idea follows this list).
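
The mechanism summarized above corresponds to what is commonly called loss scaling in mixed-precision training. The following is a minimal NumPy sketch of that idea under stated assumptions, not the patented implementation: the loss is held in a wider type (float32 here, the "first" floating-point data type), multiplied by a scaling factor before backpropagation so that the resulting gradients fit the representable range of a narrower type (float16 here, the "second" type), and then unscaled before the weight update. The function names, the stand-in grad_fn, and the fixed scale value are illustrative assumptions.

```python
import numpy as np

def scaled_backward(loss_fp32, grad_fn, scale=np.float32(1024.0)):
    """Sketch of loss scaling: scale the fp32 loss so the gradients it
    produces remain representable when stored as float16."""
    scaled_loss = np.float32(loss_fp32) * scale
    # grad_fn stands in for backpropagation through the network; the
    # gradients of the scaled loss are stored in the narrower type.
    grads_fp16 = grad_fn(scaled_loss).astype(np.float16)
    # Unscale in float32 before the weight update so the update sees the
    # true gradient magnitudes.
    return grads_fp16.astype(np.float32) / scale

# Toy gradient function: its unscaled gradients (2e-8, 3e-6) would be
# flushed to zero or lose precision if cast directly to float16.
toy_grad_fn = lambda s: np.array([2e-8, 3e-6], dtype=np.float32) * s

print(scaled_backward(1.0, toy_grad_fn))           # scaled path recovers both values
print(np.array([2e-8, 3e-6]).astype(np.float16))   # direct cast flushes 2e-8 to zero
```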

Potential Applications

  • Graphics processing for gaming and virtual reality applications.
  • Machine learning and deep learning tasks that involve tensor computations.
  • Scientific simulations and data analysis that require high-performance computing.

Problems Solved

  • Efficient execution of parallel threads of instructions.
  • Accelerated tensor computations for machine learning and deep learning tasks.
  • Improved representation of data distribution in gradient tensors.

Benefits

  • Faster and more efficient graphics processing.
  • Improved performance in machine learning and deep learning tasks.
  • Enhanced accuracy in representing data distribution in gradient tensors.


Original Abstract Submitted

A graphics processor is described that includes a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The multiprocessor can execute parallel threads of instructions associated with a command stream, where the multiprocessor includes a set of functional units to execute at least one of the parallel threads of the instructions. The set of functional units can include a mixed precision tensor processor to perform tensor computations to generate loss data. The loss data is stored as a first floating-point data type and scaled by a scaling factor to enable a data distribution of a gradient tensor generated based on the loss data to be represented by a second floating point data type.
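
The key element in the abstract is the scaling factor that lets the gradient tensor's data distribution be represented by the second (narrower) floating-point data type. In practice such a factor is often adjusted dynamically rather than fixed: the scale is reduced when scaled gradients overflow float16 and slowly increased after a long run of finite gradients. The sketch below illustrates that generic dynamic loss-scaling policy; it is an illustration under assumptions, not the claimed hardware mechanism, and the class name, defaults, and thresholds are hypothetical.

```python
import numpy as np

class DynamicLossScaler:
    """Illustrative dynamic loss scaling (not the patented design): grow
    the scale while float16 gradients stay finite, back off on overflow."""

    def __init__(self, init_scale=2.0 ** 15, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._finite_steps = 0

    def update(self, grads_fp16):
        """Inspect a float16 gradient tensor; return True if the optimizer
        step should proceed, False if it should be skipped."""
        if not np.isfinite(grads_fp16.astype(np.float32)).all():
            # Overflow (inf/NaN) detected: shrink the scale and skip the step.
            self.scale *= self.backoff_factor
            self._finite_steps = 0
            return False
        self._finite_steps += 1
        if self._finite_steps >= self.growth_interval:
            # Long run of finite gradients: try a larger scale next time.
            self.scale *= self.growth_factor
            self._finite_steps = 0
        return True
```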