INCREMENTAL PRECISION NETWORKS USING RESIDUAL INFERENCE AND FINE-GRAIN QUANTIZATION

Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating-point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
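
The operations named in the abstract (determine a per-layer scale factor, convert floating-point tensor data to an 8-bit datatype, generate an output from the converted data and the scale factor) can be illustrated with a minimal sketch. The abstract does not say how the scale factor is computed, so the symmetric max-absolute-value rule below is an assumption, and the helper names `per_layer_scale`, `quantize_int8`, and `quantized_dot` are hypothetical and not taken from the application.

```python
from typing import List


def per_layer_scale(values: List[float], qmax: int = 127) -> float:
    """Hypothetical helper: pick one scale for a whole layer.

    Assumption: symmetric scaling, so the largest magnitude maps to qmax.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_int8(values: List[float], scale: float) -> List[int]:
    """Convert floats to integers clamped to the signed 8-bit range."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def quantized_dot(q_w: List[int], q_x: List[int],
                  w_scale: float, x_scale: float) -> float:
    """Accumulate in integers, then rescale once with the scale factors."""
    acc = sum(a * b for a, b in zip(q_w, q_x))
    return acc * w_scale * x_scale


# Toy "layer": three weights and three input activations.
layer_weights = [0.5, -1.0, 0.25]
layer_inputs = [2.0, 1.0, -4.0]

w_scale = per_layer_scale(layer_weights)
x_scale = per_layer_scale(layer_inputs)
q_w = quantize_int8(layer_weights, w_scale)
q_x = quantize_int8(layer_inputs, x_scale)

# Output generated from the converted data and the scale factors.
output = quantized_dot(q_w, q_x, w_scale, x_scale)
reference = sum(w * x for w, x in zip(layer_weights, layer_inputs))
```

In this toy case `output` should land close to the floating-point `reference`, with the gap coming from rounding to 8-bit steps. A real implementation would operate on full tensors and might choose scale factors differently.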

Inventor(s): Abhisek Kundu, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das

CPC Classification: G06N3/08 (Learning methods)

Patent Application Number: 20250173567