17960052. MODEL COMPRESSION VIA QUANTIZED SPARSE PRINCIPAL COMPONENT ANALYSIS simplified abstract (QUALCOMM Incorporated)
Organization Name
QUALCOMM Incorporated
Inventor(s)
Andrey Kuzmin of Hilversum (NL)
Marinus Willem Van Baalen of Amsterdam (NL)
Markus Nagel of Amsterdam (NL)
Arash Behboodi of Amsterdam (NL)
MODEL COMPRESSION VIA QUANTIZED SPARSE PRINCIPAL COMPONENT ANALYSIS - A simplified explanation of the abstract
This abstract first appeared for US patent application 17960052, titled 'MODEL COMPRESSION VIA QUANTIZED SPARSE PRINCIPAL COMPONENT ANALYSIS'.
Simplified Explanation
The abstract describes a processor-implemented method for an artificial neural network (ANN). For a given layer of the ANN, the method retrieves two matrices: a dense quantized matrix representing a codebook and a sparse quantized matrix representing linear coefficients. Both matrices are associated with the layer's weight tensor. The method determines the weight tensor from the product of the dense and sparse quantized matrices, and then processes an input at the layer using that weight tensor.
- The method retrieves two quantized matrices, one dense and one sparse, for a layer of an artificial neural network.
- The dense matrix represents a codebook and the sparse matrix represents linear coefficients associated with the layer's weight tensor.
- The weight tensor is determined as the product of the dense and sparse quantized matrices.
- The layer then processes its input using the reconstructed weight tensor.
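The steps above can be illustrated with a minimal NumPy sketch. The shapes, bit-widths, scales, and sparsity level below are assumptions chosen for illustration; the patent abstract only specifies that a dense quantized codebook and a sparse quantized coefficient matrix are multiplied to recover the weight tensor, which is then used to process an input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes for one fully connected layer:
# the weight matrix W is (out_features, in_features), factored
# through a codebook of k components (k chosen arbitrarily here).
out_features, in_features, k = 256, 512, 32

# Dense quantized matrix: the codebook, stored as int8 with a scale.
codebook_q = rng.integers(-128, 128, size=(out_features, k), dtype=np.int8)
codebook_scale = 0.01

# Sparse quantized matrix: the linear coefficients, mostly zero.
coeffs_q = rng.integers(-8, 8, size=(k, in_features), dtype=np.int8)
coeffs_q[rng.random(coeffs_q.shape) < 0.9] = 0  # ~90% sparsity (assumed)
coeffs_scale = 0.1

# Determine the weight tensor as the product of the two matrices
# (dequantizing each factor with its scale first).
W = (codebook_q.astype(np.float32) * codebook_scale) @ (
    coeffs_q.astype(np.float32) * coeffs_scale
)

# Process an input at the layer using the reconstructed weight tensor.
x = rng.standard_normal(in_features).astype(np.float32)
y = W @ x
print(W.shape, y.shape)
```

In practice the two small quantized factors, not the full weight matrix, would be what is stored and shipped with the model; the product is formed on the fly per layer.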
Potential Applications
- Artificial intelligence and machine learning systems
- Deep learning algorithms
- Image and speech recognition systems
- Natural language processing
Problems Solved
- Efficient representation and manipulation of weight tensors in artificial neural networks
- Reducing memory and computational requirements for neural network operations
- Improving the performance and speed of deep learning algorithms
Benefits
- Improved efficiency in artificial neural network operations
- Reduced memory and computational requirements
- Faster processing and improved performance in deep learning tasks
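The memory-reduction benefit can be made concrete with back-of-the-envelope arithmetic. All numbers below (layer shape, bit-widths, sparsity, index overhead) are illustrative assumptions, not figures from the patent:

```python
# Storage for one layer: fp32 baseline vs. quantized sparse factorization.
out_f, in_f, k = 256, 512, 32
sparsity = 0.9           # fraction of zero coefficients (assumed)
bits_dense = 8           # bit-width of the quantized codebook (assumed)
bits_sparse = 4          # bit-width of the quantized coefficients (assumed)
bits_index = 16          # crude per-nonzero index cost (assumed)

baseline_bits = out_f * in_f * 32               # dense fp32 weights
codebook_bits = out_f * k * bits_dense          # dense quantized codebook
nnz = int(k * in_f * (1 - sparsity))            # nonzero coefficients
coeff_bits = nnz * (bits_sparse + bits_index)   # values plus indices

compressed_bits = codebook_bits + coeff_bits
ratio = baseline_bits / compressed_bits
print(f"compression ratio ~{ratio:.1f}x")
```

Under these assumed numbers the factored, quantized representation is dozens of times smaller than the fp32 baseline, which is the kind of memory saving the claimed method targets.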
Original Abstract Submitted
A processor-implemented method includes retrieving, for a layer of a set of layers of an artificial neural network (ANN), a dense quantized matrix representing a codebook and a sparse quantized matrix representing linear coefficients. The dense quantized matrix and the sparse quantized matrix may be associated with a weight tensor of the layer. The processor-implemented method also includes determining, for the layer of the set of layers, the weight tensor based on a product of the dense quantized matrix and the sparse quantized matrix. The processor-implemented method further includes processing, at the layer, an input based on the weight tensor.