Samsung Electronics Co., Ltd. (20240160483). DNNS ACCELERATION WITH BLOCK-WISE N:M STRUCTURED WEIGHT SPARSITY simplified abstract

From WikiPatents
Revision as of 02:51, 23 May 2024 by Wikipatents.

DNNS ACCELERATION WITH BLOCK-WISE N:M STRUCTURED WEIGHT SPARSITY

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Hamzah Abdelaziz of Santa Clara, CA (US)

Joseph Hassoun of Los Gatos, CA (US)

DNNS ACCELERATION WITH BLOCK-WISE N:M STRUCTURED WEIGHT SPARSITY - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240160483, titled 'DNNS ACCELERATION WITH BLOCK-WISE N:M STRUCTURED WEIGHT SPARSITY'.

Simplified Explanation

The accelerator core described in the patent application includes two buffers and groups of processing elements (PEs) for handling block-wise sparsified elements. The first buffer receives block-wise sparsified first elements, while the second buffer receives second elements. Each group of PEs receives rows of first elements from one block, together with the corresponding second elements, from the buffers.

  • The accelerator core includes first and second buffers for handling block-wise sparsified elements.
  • At least one group of processing elements is present in the accelerator core.
  • The first buffer receives block-wise sparsified first elements, while the second buffer receives second elements.
  • Each group of processing elements receives rows of first elements and corresponding second elements from the buffers.
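The buffer-to-PE dataflow summarized above can be sketched in a few lines of NumPy. This is a hypothetical model only: the function name, the array representation of the buffers, and the zero-skipping detail are assumptions for illustration, not taken from the application.

```python
import numpy as np

def pe_group_matvec(weight_block, activations):
    """Model one group of processing elements: each PE takes one row of a
    block-wise sparsified weight block (first elements, from the first
    buffer) and the matching activations (second elements, from the second
    buffer), multiplying only the surviving non-zero weights.
    Hypothetical sketch of the dataflow, not the patented circuit."""
    k, c = weight_block.shape
    out = np.zeros(k)
    for pe in range(k):                      # one PE per block row
        row = weight_block[pe]
        nz = np.nonzero(row)[0]              # skip pruned (zero) positions
        out[pe] = row[nz] @ activations[nz]  # fetch only matching activations
    return out
```

For example, a 2x4 block with two non-zeros per row needs only four multiplies instead of eight, which is where the efficiency claim comes from.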

Potential Applications

This technology could be applied in image processing, signal processing, and machine learning tasks that involve handling sparse data efficiently.

Problems Solved

1. Efficient processing of block-wise sparsified elements.
2. Handling large amounts of data in a structured manner.

Benefits

1. Improved performance in processing sparse data.
2. Enhanced efficiency in handling block-wise sparsified elements.

Potential Commercial Applications

Optimizing neural networks, accelerating image recognition algorithms, and improving data processing in IoT devices could be potential commercial applications of this technology.

Possible Prior Art

Prior art in the field of parallel processing architectures and sparse data handling techniques may exist, but specific examples are not provided in the patent application.

Unanswered Questions

How does this technology compare to existing solutions for processing sparse data efficiently?

The patent application does not provide a direct comparison with existing solutions in the field.

What are the specific technical specifications and requirements for integrating this accelerator core into existing systems?

The patent application does not detail the specific technical specifications or integration requirements for implementing this technology.


Original Abstract Submitted

An accelerator core includes first and second buffers and at least one group of k processing elements. The first buffer receives at least one group of block-wise sparsified first elements. A block size (k, c) of each group of block-wise sparsified first elements includes k rows and c columns in which k is greater than or equal to 2, k times p equals K, and c times q equals C, in which K is an output channel dimension of a tensor of first elements, C is a number of input channels of the tensor of first elements, p is an integer and q is an integer. The second buffer receives second elements. Each respective group of processing elements receives k rows of first elements from a block of first elements corresponding to the group of PEs, and receives second elements that correspond to first elements received from the first buffer.
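The block geometry in the abstract (a (K, C) weight tensor partitioned into p-by-q blocks of size (k, c), so that k times p equals K and c times q equals C) can be illustrated with an assumed N:M pruning rule applied inside each block. The selection rule below (keep the N largest-magnitude weights in every M consecutive weights) is a common convention for N:M sparsity, not something the abstract specifies, and all names are hypothetical.

```python
import numpy as np

def blockwise_nm_prune(W, k, c, N, M):
    """Prune a (K, C) weight matrix with N:M sparsity applied within
    (k, c) blocks: in every group of M consecutive weights along the
    input-channel axis of a block, keep only the N largest-magnitude
    weights. Illustrative sketch; the application does not fix this rule."""
    K, C = W.shape
    assert K % k == 0 and C % c == 0, "K = k*p and C = c*q must hold"
    assert c % M == 0, "block width must divide evenly into N:M groups"
    mask = np.zeros_like(W, dtype=bool)
    for bi in range(0, K, k):          # iterate over the p * q blocks
        for bj in range(0, C, c):
            block = W[bi:bi + k, bj:bj + c]
            groups = np.abs(block).reshape(k, c // M, M)
            # indices of the N largest-magnitude weights in each M-group
            keep = np.argsort(groups, axis=-1)[..., -N:]
            gmask = np.zeros_like(groups, dtype=bool)
            np.put_along_axis(gmask, keep, True, axis=-1)
            mask[bi:bi + k, bj:bj + c] = gmask.reshape(k, c)
    return W * mask, mask
```

For instance, with K = 4, C = 8, k = 2, c = 4 (so p = 2, q = 2) and a 2:4 rule, every row of every block retains exactly two of each four consecutive weights, which is the structured pattern the accelerator's first buffer is designed to consume.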