Samsung Electronics Co., Ltd. (20240095519). EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR simplified abstract


EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Ardavan Pedram of Santa Clara, CA (US)

Ali Shafiee Ardestani of San Jose, CA (US)

Jong Hoon Shin of San Jose, CA (US)

Joseph H. Hassoun of Los Gatos, CA (US)

EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240095519, titled 'EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR'.

Simplified Explanation

The neural network inference accelerator described in the patent application includes two neural processing units (NPUs) and a sparsity management unit. The first NPU processes tensor pairs whose activation and weight data both have sparsity densities above a predetermined threshold, while the second NPU processes tensor pairs in which at least one of the activation or weight sparsity densities is at or below that threshold. In other words, the first NPU handles relatively dense workloads and the second handles highly sparse ones, with the sparsity management unit routing each tensor pair to the appropriate NPU by comparing its sparsity densities against the threshold.

  • The first NPU handles tensor pairs whose activation and weight sparsity densities are both above the predetermined threshold.
  • The second NPU handles tensor pairs in which at least one of the two sparsity densities is equal to or below the threshold.
  • The sparsity management unit routes activation and weight tensors between the NPUs by comparing their sparsity densities against the threshold, as illustrated in the sketch below.
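
The application does not disclose how the routing decision is implemented, so the following Python sketch only illustrates the dispatch rule stated in the abstract. The names (`sparsity_density`, `SparsityManagementUnit`, `dense_npu`, `sparse_npu`) and the nonzero-fraction density metric are assumptions made for this illustration, not elements of the patent.

```python
import numpy as np

def sparsity_density(tensor: np.ndarray) -> float:
    """Fraction of nonzero elements (assumed metric: higher = denser)."""
    return np.count_nonzero(tensor) / tensor.size

class SparsityManagementUnit:
    """Hypothetical dispatcher for the rule in the abstract: a tensor
    pair goes to the first (dense) NPU only when BOTH sparsity
    densities exceed the threshold; otherwise it goes to the second
    (extreme-sparse) NPU."""

    def __init__(self, dense_npu, sparse_npu, threshold: float):
        self.dense_npu = dense_npu
        self.sparse_npu = sparse_npu
        self.threshold = threshold

    def dispatch(self, activation: np.ndarray, weight: np.ndarray):
        a_density = sparsity_density(activation)
        w_density = sparsity_density(weight)
        if a_density > self.threshold and w_density > self.threshold:
            # Both tensors are relatively dense: first NPU.
            return self.dense_npu.run(activation, weight)
        # At least one tensor is at or below the threshold: second NPU.
        return self.sparse_npu.run(activation, weight)
```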

Potential Applications

This technology can be applied in:

  • Edge computing devices
  • Autonomous vehicles
  • Robotics

Problems Solved

This technology helps with:

  • Efficient processing of highly sparse activation and weight data
  • Acceleration of neural network inference tasks

Benefits

The benefits of this technology include:

  • Improved performance in neural network inference
  • Reduced energy consumption
  • Enhanced efficiency in processing sparse data

Potential Commercial Applications

  • AI accelerators for edge devices
  • Neural network processors for IoT devices

Possible Prior Art

Possible prior art for this technology includes:

  • Research papers on sparsity management in neural networks

Unanswered Questions

How does the sparsity management unit determine the transfer of tensors between the NPUs?

The patent application does not provide specific details on the exact mechanism used by the sparsity management unit to control the transfer of tensors based on sparsity densities.

What are the specific performance improvements achieved by this neural network inference accelerator compared to existing solutions?

The patent application does not elaborate on the quantitative performance gains or benchmarks achieved by this technology in neural network inference tasks.


Original Abstract Submitted

A neural network inference accelerator includes first and second neural processing units (NPUs) and a sparsity management unit. The first NPU receives activation and weight tensors based on an activation sparsity density and a weight sparsity density both being greater than a predetermined sparsity density. The second NPU receives activation and weight tensors based on at least one of the activation sparsity density and the weight sparsity density being less than or equal to the predetermined sparsity density. The sparsity management unit controls transfer of the activation tensor and the weight tensor based on the activation sparsity density and the weight sparsity density with respect to the predetermined sparsity density.
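
As a quick check of the hypothetical dispatcher sketched earlier (reusing the `SparsityManagementUnit` class from that example), a tensor pair in which the weight density falls at or below a 0.5 threshold is routed to the second NPU:

```python
import numpy as np

class StubNPU:
    """Placeholder standing in for a real NPU in this illustration."""
    def __init__(self, name: str):
        self.name = name
    def run(self, activation, weight):
        return self.name

smu = SparsityManagementUnit(StubNPU("first/dense"),
                             StubNPU("second/sparse"), threshold=0.5)

act = np.array([[1.0, 2.0], [3.0, 4.0]])  # density 1.0  (> 0.5)
wgt = np.array([[0.0, 0.0], [0.0, 5.0]])  # density 0.25 (<= 0.5)
print(smu.dispatch(act, wgt))             # -> "second/sparse"
```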