COMPRESSION OF NEURAL NETWORKS WITH ORTHOGONAL MATRICES

Organization Name

Inventor(s)

Marcelo Gennari Do Nascimento of Cambridge GB

This abstract first appeared for US patent application 18536154 titled 'COMPRESSION OF NEURAL NETWORKS WITH ORTHOGONAL MATRICES'.

Original Abstract Submitted

Embodiments herein relate to a neural network compression technique in which a weight matrix within the neural network is transformed via matrix multiplication with an orthogonal matrix. The orthogonal matrix is derived from a calibration dataset (which is generally chosen to be broadly representative of expected runtime input data), and the transformation is such that the resulting modified weight matrix has its components ordered by relative significance. The modified weight matrix is incorporated in a compressed neural network with fewer weights. By removing one or more components of lower significance, the size of the neural network (and, therefore, its storage and execution overhead) is reduced, whilst still maintaining an acceptable level of performance.
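
The abstract does not say how the orthogonal matrix is computed, so the following is only a minimal NumPy sketch of one plausible reading. It assumes the matrix is taken from the eigenvectors of the calibration inputs' second-moment matrix (a PCA-style choice), applied to a single linear layer y = W x. The function names `orthogonal_from_calibration` and `compress_linear`, the layer sizes, and the synthetic low-rank calibration data are all hypothetical and chosen for illustration.

```python
import numpy as np

def orthogonal_from_calibration(X):
    """Assumed construction: eigenvectors of the calibration second-moment
    matrix, sorted so the most significant direction comes first."""
    cov = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending significance
    return eigvecs[:, order]                 # columns are orthonormal

def compress_linear(W, Q, k):
    """Rotate W by Q, then keep only the k most significant components."""
    W_rot = W @ Q                            # columns now ordered by significance
    return W_rot[:, :k], Q[:, :k]            # truncated weights and projection

rng = np.random.default_rng(0)
d_in, d_out, rank, n = 16, 8, 4, 256

# Synthetic calibration inputs that lie in a 4-dimensional subspace (illustrative).
basis = rng.standard_normal((rank, d_in))
X_calib = rng.standard_normal((n, rank)) @ basis
W = rng.standard_normal((d_out, d_in))

Q = orthogonal_from_calibration(X_calib)
W_k, Q_k = compress_linear(W, Q, k=rank)

# A new input drawn from the same subspace as the calibration data.
x = rng.standard_normal(rank) @ basis
y_full = W @ x
y_small = W_k @ (Q_k.T @ x)                  # project the input, then apply fewer weights

print("original weights:", W.size, "| truncated weights:", W_k.size)
print("max abs output difference:", np.max(np.abs(y_full - y_small)))
```

Because the synthetic inputs occupy exactly four dimensions, dropping the remaining twelve components here leaves the output unchanged up to floating-point error. With real calibration data the discarded components would typically carry some signal, so the choice of `k` becomes a trade-off between size and accuracy. In this sketch the projection `Q_k` is stored separately from `W_k`; how a full network would absorb it is not covered by the abstract.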