US Patent Application 17732361. Instruction Set Architecture for Neural Network Quantization and Packing simplified abstract


Instruction Set Architecture for Neural Network Quantization and Packing

Organization Name

QUALCOMM Incorporated


Inventor(s)

Srijesh Sudarsanan of Waltham MA (US)


Deepak Mathew of Acton MA (US)


Marc Hoffman of Mansfield MA (US)


Sundar Rajan Balasubramanian of Groton MA (US)


Mansi Jain of Littleton MA (US)


James Lee of Northborough MA (US)


Gerald Sweeney of Chelmsford MA (US)


Instruction Set Architecture for Neural Network Quantization and Packing - A simplified explanation of the abstract

  • This is a simplified explanation of the abstract of US patent application number 17732361, titled 'Instruction Set Architecture for Neural Network Quantization and Packing'

Simplified Explanation

This application describes a method for performing computational operations on a neural network using a single instruction. The method involves an electronic device receiving an instruction to apply a neural network operation to a set of M-bit elements stored in input vector registers. The device then implements the operation by obtaining the M-bit elements, quantizing them from M bits to P bits (where P is smaller than M), and packing the resulting P-bit elements into an output vector register. The neural network operation may involve multiplying the elements by a quantization factor and adding a zero point.
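The quantize-and-pack sequence described above can be sketched in Python. This is an illustrative model only, not the patent's implementation: the function name, the use of an integer as a stand-in for the output vector register, and the specific scale and zero-point values are all assumptions for demonstration.

```python
# Hypothetical sketch of the single-instruction behavior described in the
# abstract: each M-bit element is multiplied by a quantization factor,
# offset by a zero point, saturated to P bits (P < M), and the resulting
# P-bit elements are packed into one output "register" word.
# All names and parameter values here are illustrative, not from the patent.

def quantize_and_pack(elements, scale, zero_point, p_bits=8):
    """Quantize integer elements to p_bits and pack them little-endian
    into a single integer acting as the output vector register."""
    qmin, qmax = 0, (1 << p_bits) - 1           # unsigned P-bit range
    packed = 0
    for i, x in enumerate(elements):
        q = int(round(x * scale)) + zero_point  # multiply by factor, add zero point
        q = max(qmin, min(qmax, q))             # saturate to P bits
        packed |= q << (i * p_bits)             # pack into the output register
    return packed

# Example: four wider input values quantized to 8 bits each.
reg = quantize_and_pack([100, 200, 300, 4000], scale=0.05, zero_point=3)
```

Packing all P-bit results into one register in the same step avoids a separate shuffle or store pass, which is the efficiency point of fusing quantization and packing behind a single instruction.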


Original Abstract Submitted

This application is directed to using a single instruction to initiate a sequence of computational operations related to a neural network. An electronic device receives a single instruction to apply a neural network operation to a set of M-bit elements stored in one or more input vector registers. In response to the single instruction, the electronic device implements the neural network operation on the set of M-bit elements to generate a set of P-bit elements by obtaining the set of M-bit elements from the one or more input vector registers, quantizing each of the set of M-bit elements from M bits to P bits, and packing the set of P-bit elements into an output vector register. P is smaller than M. In some embodiments, the neural network operation is a quantization operation including at least a multiplication with a quantization factor and an addition with a zero point.