Samsung Electronics Co., Ltd. (20240095519). EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR simplified abstract


EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Ardavan Pedram of Santa Clara, CA (US)

Ali Shafiee Ardestani of San Jose, CA (US)

Jong Hoon Shin of San Jose, CA (US)

Joseph H. Hassoun of Los Gatos, CA (US)

EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240095519, titled 'EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR'.

Simplified Explanation

The neural network inference accelerator described in the patent application includes two neural processing units (NPUs) and a sparsity management unit. The first NPU processes tensor pairs whose activation and weight data both have sparsity densities above a predetermined threshold, while the second NPU processes tensor pairs in which at least one of the activation or weight sparsity densities is at or below that threshold. In other words, the first NPU handles relatively dense workloads and the second handles highly sparse ones, with the sparsity management unit routing each tensor pair to the appropriate NPU by comparing its sparsity densities against the threshold.

  • The first NPU handles tensor pairs whose activation and weight sparsity densities are both above the predetermined threshold.
  • The second NPU handles tensor pairs in which at least one of the two sparsity densities is equal to or below the threshold.
  • The sparsity management unit routes activation and weight tensors between the NPUs by comparing their sparsity densities against the threshold, as illustrated in the sketch below.
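
The application does not disclose how the routing decision is implemented, so the following Python sketch only illustrates the dispatch rule stated in the abstract. The names (`sparsity_density`, `SparsityManagementUnit`, `dense_npu`, `sparse_npu`) and the nonzero-fraction density metric are assumptions made for this illustration, not elements of the patent.

```python
import numpy as np

def sparsity_density(tensor: np.ndarray) -> float:
    """Fraction of nonzero elements (assumed metric: higher = denser)."""
    return np.count_nonzero(tensor) / tensor.size

class SparsityManagementUnit:
    """Hypothetical dispatcher for the rule in the abstract: a tensor
    pair goes to the first (dense) NPU only when BOTH sparsity
    densities exceed the threshold; otherwise it goes to the second
    (extreme-sparse) NPU."""

    def __init__(self, dense_npu, sparse_npu, threshold: float):
        self.dense_npu = dense_npu
        self.sparse_npu = sparse_npu
        self.threshold = threshold

    def dispatch(self, activation: np.ndarray, weight: np.ndarray):
        a_density = sparsity_density(activation)
        w_density = sparsity_density(weight)
        if a_density > self.threshold and w_density > self.threshold:
            # Both tensors are relatively dense: first NPU.
            return self.dense_npu.run(activation, weight)
        # At least one tensor is at or below the threshold: second NPU.
        return self.sparse_npu.run(activation, weight)
```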

Potential Applications

This technology can be applied in:

  • Edge computing devices
  • Autonomous vehicles
  • Robotics

Problems Solved

This technology helps with:

  • Efficient processing of highly sparse activation and weight data
  • Acceleration of neural network inference tasks

Benefits

The benefits of this technology include:

  • Improved performance in neural network inference
  • Reduced energy consumption
  • Enhanced efficiency in processing sparse data

Potential Commercial Applications

  • AI accelerators for edge devices
  • Neural network processors for IoT devices

Possible Prior Art

Possible prior art for this technology includes:

  • Research papers on sparsity management in neural networks

Unanswered Questions

How does the sparsity management unit determine the transfer of tensors between the NPUs?

The patent application does not provide specific details on the exact mechanism used by the sparsity management unit to control the transfer of tensors based on sparsity densities.

What are the specific performance improvements achieved by this neural network inference accelerator compared to existing solutions?

The patent application does not elaborate on the quantitative performance gains or benchmarks achieved by this technology in neural network inference tasks.


Original Abstract Submitted

A neural network inference accelerator includes first and second neural processing units (NPUs) and a sparsity management unit. The first NPU receives activation and weight tensors based on an activation sparsity density and a weight sparsity density both being greater than a predetermined sparsity density. The second NPU receives activation and weight tensors based on at least one of the activation sparsity density and the weight sparsity density being less than or equal to the predetermined sparsity density. The sparsity management unit controls transfer of the activation tensor and the weight tensor based on the activation sparsity density and the weight sparsity density with respect to the predetermined sparsity density.
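
As a quick check of the hypothetical dispatcher sketched earlier (reusing the `SparsityManagementUnit` class from that example), a tensor pair in which the weight density falls at or below a 0.5 threshold is routed to the second NPU:

```python
import numpy as np

class StubNPU:
    """Placeholder standing in for a real NPU in this illustration."""
    def __init__(self, name: str):
        self.name = name
    def run(self, activation, weight):
        return self.name

smu = SparsityManagementUnit(StubNPU("first/dense"),
                             StubNPU("second/sparse"), threshold=0.5)

act = np.array([[1.0, 2.0], [3.0, 4.0]])  # density 1.0  (> 0.5)
wgt = np.array([[0.0, 0.0], [0.0, 5.0]])  # density 0.25 (<= 0.5)
print(smu.dispatch(act, wgt))             # -> "second/sparse"
```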