17670044. NEURAL NETWORK TRAINING WITH ACCELERATION simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

From WikiPatents

NEURAL NETWORK TRAINING WITH ACCELERATION

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Shiyu Li of Durham NC (US)

Krishna T. Malladi of San Jose CA (US)

Andrew Chang of Los Altos CA (US)

Yang Seok Ki of Palo Alto CA (US)

NEURAL NETWORK TRAINING WITH ACCELERATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 17670044, titled 'NEURAL NETWORK TRAINING WITH ACCELERATION'.

Simplified Explanation

The patent application describes a system and method for training a neural network using a combination of a graphics processing unit (GPU) cluster and a computational storage cluster. The GPU cluster consists of one or more GPUs, while the computational storage cluster consists of one or more computational storage devices. These clusters are connected by a cache-coherent system interconnect.

  • The system includes a GPU cluster and a computational storage cluster connected by a cache-coherent system interconnect.
  • The GPU cluster comprises one or more GPUs, while the computational storage cluster comprises one or more computational storage devices.
  • A computational storage device in the cluster is configured to store an embedding table, receive an index vector containing a first index and a second index, and calculate an embedded vector from the rows of the embedding table corresponding to those indices.
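The embedding lookup described above can be sketched in a few lines of Python. This is an illustrative model only, not the patent's implementation: the function and variable names are assumptions, and the pooling operation (element-wise sum of the selected rows) is one common way to combine rows into a single embedded vector.

```python
# Hedged sketch of the embedding lookup described in the abstract.
# Names and the sum-pooling choice are illustrative, not from the patent.

def embed(embedding_table, index_vector):
    """Gather the rows selected by index_vector and pool them
    (element-wise sum) into a single embedded vector."""
    rows = [embedding_table[i] for i in index_vector]
    return [sum(col) for col in zip(*rows)]

# Toy embedding table with three rows of dimension 3.
table = [
    [0.1, 0.2, 0.3],
    [1.0, 1.0, 1.0],
    [0.5, 0.0, 0.5],
]

# An index vector containing a first index (0) and a second index (2):
# the result is the element-wise sum of rows 0 and 2.
embedded = embed(table, [0, 2])
print(embedded)
```

In the patented system, this gather-and-pool step would run inside the computational storage device itself, so only the final embedded vector needs to cross the interconnect.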

Potential applications of this technology:

  • Deep learning: The system can be used to train neural networks for various deep learning applications, such as image recognition, natural language processing, and recommendation systems.
  • Data analytics: The system can be utilized for training neural networks in data analytics tasks, including pattern recognition, anomaly detection, and predictive modeling.
  • Autonomous systems: This technology can be applied in training neural networks for autonomous systems like self-driving cars, drones, and robots.

Problems solved by this technology:

  • Improved performance: The combination of GPU and computational storage clusters enables faster, more efficient neural network training.
  • Scalability: The system can handle large-scale neural network training by leveraging the parallel processing capabilities of GPUs and the storage capacity of computational storage devices.
  • Reduced data movement: By storing the embedding table in the computational storage devices, the system minimizes data movement between the GPU and storage clusters, improving overall efficiency.
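A back-of-envelope calculation illustrates the data-movement saving from pooling inside the storage device. The sizes below are assumed for illustration and do not come from the patent: when the selected rows are pooled near the data, only one vector crosses the interconnect per lookup instead of one row per index.

```python
# Illustrative data-movement comparison (assumed sizes, not from the patent).
embedding_dim = 64          # floats per embedding-table row
bytes_per_float = 4
indices_per_lookup = 32     # rows gathered per index vector

# Without near-storage pooling: every selected row crosses the interconnect.
naive_bytes = indices_per_lookup * embedding_dim * bytes_per_float

# With in-storage pooling: only the single pooled vector is transferred.
pooled_bytes = embedding_dim * bytes_per_float

print(naive_bytes, pooled_bytes, naive_bytes // pooled_bytes)
```

Under these assumptions, per-lookup traffic shrinks by a factor equal to the number of indices gathered, since the pooled result is one row-sized vector regardless of how many rows were read.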

Benefits of this technology:

  • Faster training: The system's architecture enables accelerated training of neural networks, leading to quicker model development and deployment.
  • Cost-effective: By utilizing computational storage devices, the system optimizes the use of resources, reducing the need for additional expensive hardware.
  • Scalable and flexible: The system can be easily scaled up or down by adding or removing GPUs or computational storage devices, providing flexibility to meet varying training requirements.


Original Abstract Submitted

A system and method for training a neural network. In some embodiments, the system includes: a graphics processing unit cluster; and a computational storage cluster connected to the graphics processing unit cluster by a cache-coherent system interconnect. The graphics processing unit cluster may include one or more graphics processing units. The computational storage cluster may include one or more computational storage devices. A first computational storage device of the one or more computational storage devices may be configured to (i) store an embedding table, (ii) receive an index vector including a first index and a second index; and (iii) calculate an embedded vector based on: a first row of the embedding table, corresponding to the first index, and a second row of the embedding table, corresponding to the second index.