17989675. EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR simplified abstract (Samsung Electronics Co., Ltd.)


EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Ardavan Pedram of Santa Clara CA (US)

Ali Shafiee Ardestani of San Jose CA (US)

Jong Hoon Shin of San Jose CA (US)

Joseph H. Hassoun of Los Gatos CA (US)

EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR - A simplified explanation of the abstract

This abstract first appeared for US patent application 17989675 titled 'EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR'.

Simplified Explanation

The neural network inference accelerator described in the abstract includes two neural processing units (NPUs) and a sparsity management unit. The first NPU processes tensors whose activation and weight data both have sparsity densities above a predetermined threshold, while the second NPU processes tensors in which at least one of those densities is at or below the threshold. The sparsity management unit routes each activation/weight tensor pair accordingly (a minimal routing sketch follows the list below).

  • First NPU processes tensors whose activation and weight sparsity densities are both above the predetermined threshold.
  • Second NPU processes tensors in which the activation or weight sparsity density is at or below the predetermined threshold.
  • Sparsity management unit controls data transfer between the NPUs based on these sparsity densities.
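
The routing rule can be made concrete with a short sketch. The following Python is an illustrative model only, not the patented hardware: the threshold value, the `sparsity_density` helper, and the NPU labels are all hypothetical, chosen to mirror the both-above / at-least-one-at-or-below rule from the abstract.

```python
import numpy as np

def sparsity_density(tensor: np.ndarray) -> float:
    """Fraction of nonzero elements: 1.0 is fully dense, 0.0 fully sparse."""
    return np.count_nonzero(tensor) / tensor.size

def route_to_npu(activation: np.ndarray, weight: np.ndarray,
                 threshold: float = 0.5) -> str:
    """Apply the abstract's rule: the first (dense) NPU only when BOTH
    densities exceed the threshold, otherwise the second (sparse) NPU."""
    if (sparsity_density(activation) > threshold
            and sparsity_density(weight) > threshold):
        return "first_npu"    # both tensors are dense enough
    return "second_npu"       # at least one tensor is at or below the threshold

# Example: a mostly-zero weight tensor forces the sparse path.
activation = np.random.rand(64, 64)      # fully dense
weight = np.zeros((64, 64))
weight[0, :8] = 1.0                      # density ~0.002
print(route_to_npu(activation, weight))  # -> "second_npu"
```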

Potential Applications

This technology can be applied in:

  • Edge computing devices
  • Autonomous vehicles
  • Robotics

Problems Solved

This technology helps by:

  • Increasing efficiency of neural network inference
  • Reducing computational resources required
  • Improving performance of AI applications

Benefits

The benefits of this technology include:

  • Faster inference processing
  • Lower power consumption
  • Enhanced accuracy in AI tasks

Potential Commercial Applications

Potential commercial uses include:

  • AI accelerators for data centers
  • Edge AI devices
  • IoT devices

Possible Prior Art

Possible prior art includes neural network accelerators that already use sparsity management techniques to improve efficiency and performance.

Unanswered Questions

How does this technology compare to existing neural network inference accelerators in terms of speed and efficiency?

This article does not provide a direct comparison with existing accelerators in terms of speed and efficiency. Further research or testing may be needed to determine the exact performance metrics.

What are the potential limitations or drawbacks of implementing this technology in real-world applications?

The article does not discuss any potential limitations or drawbacks of implementing this technology. Additional studies or practical implementations may reveal challenges that need to be addressed.


Original Abstract Submitted

A neural network inference accelerator includes first and second neural processing units (NPUs) and a sparsity management unit. The first NPU receives activation and weight tensors based on an activation sparsity density and a weight sparsity density both being greater than a predetermined sparsity density. The second NPU receives activation and weight tensors based on at least one of the activation sparsity density and the weight sparsity density being less than or equal to the predetermined sparsity density. The sparsity management unit controls transfer of the activation tensor and the weight tensor based on the activation sparsity density and the weight sparsity density with respect to the predetermined sparsity density.
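
The original abstract frames the same decision as a transfer controlled by the sparsity management unit. A minimal software model of that dispatch role, assuming a hypothetical SparsityManagementUnit class and a callable NPU interface (neither is specified in the patent), might look like this:

```python
from typing import Callable
import numpy as np

Tensor = np.ndarray
NPU = Callable[[Tensor, Tensor], Tensor]  # hypothetical NPU interface

class SparsityManagementUnit:
    """Illustrative model only: transfers each (activation, weight) pair to
    the first NPU when both densities exceed the threshold, otherwise to
    the second NPU, as the abstract describes."""

    def __init__(self, first_npu: NPU, second_npu: NPU, threshold: float):
        self.first_npu = first_npu
        self.second_npu = second_npu
        self.threshold = threshold

    def transfer(self, activation: Tensor, weight: Tensor) -> Tensor:
        # Both densities must exceed the threshold to take the dense path.
        both_dense = (
            np.count_nonzero(activation) / activation.size > self.threshold
            and np.count_nonzero(weight) / weight.size > self.threshold
        )
        npu = self.first_npu if both_dense else self.second_npu
        return npu(activation, weight)
```

With both NPUs stubbed as plain matrix multiplies, `SparsityManagementUnit(np.matmul, np.matmul, 0.5).transfer(a, w)` exercises the transfer logic without any hardware.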