US Patent Application 18315625. COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS simplified abstract


COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS

Organization Name

Intel Corporation


Inventor(s)

Kevin Nealis of San Jose CA (US)

Anbang Yao of Beijing (CN)

Xiaoming Chen of Shanghai (CN)

Elmoustapha Ould-ahmed-vall of Chandler AZ (US)

Sara S. Baghsorkhi of San Jose CA (US)

Eriko Nurvitadhi of Hillsboro OR (US)

Balaji Vembu of Folsom CA (US)

Nicolas C. Galoppo Von Borries of Portland OR (US)

Rajkishore Barik of Santa Clara CA (US)

Tsung-Han Lin of Campbell CA (US)

Kamal Sinha of Cordova CA (US)

COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18315625 titled 'COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS'.

Simplified Explanation

The abstract describes a compute apparatus that executes a single neural network instruction using a one-bit weight, with a fused XNOR and population count operation followed by accumulation.

  • The compute apparatus has a decode unit that decodes a single instruction into a decoded instruction specifying multiple operands, including a multi-bit input value and a one-bit weight associated with a neural network.
  • The arithmetic logic unit in the compute apparatus includes a multiplier, an adder, and an accumulator register.
  • To execute the decoded instruction, the multiplier performs a fused operation that combines an exclusive not OR (XNOR) operation and a population count operation.
  • The adder adds the intermediate product from the multiplier to a value stored in the accumulator register and updates the value stored in the accumulator register.
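
The fused XNOR and population count step followed by accumulation can be illustrated with a small software sketch. This is not the patented hardware and not taken from the application: it assumes the formulation commonly used in binarized neural networks, where each bit encodes +1 (bit set) or -1 (bit clear) and both operands are packed into integers. How the claimed apparatus treats the multi-bit input value is not stated in the abstract, and the names `xnor_popcount` and `fused_mac` are hypothetical.

```python
def xnor_popcount(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two n-element +1/-1 vectors packed as bits (1 -> +1, 0 -> -1).

    Assumption: both operands are packed one-bit vectors; this is an
    illustrative encoding, not one specified by the abstract.
    """
    mask = (1 << n) - 1
    # XNOR marks the positions where the two bits agree (product is +1).
    agree = ~(a_bits ^ w_bits) & mask
    # Population count: number of agreeing positions.
    matches = bin(agree).count("1")
    # Agreeing positions contribute +1, disagreeing positions contribute -1.
    return 2 * matches - n


def fused_mac(acc: int, a_bits: int, w_bits: int, n: int) -> int:
    """Add the intermediate product to the accumulator and return the updated value."""
    return acc + xnor_popcount(a_bits, w_bits, n)


# Usage: accumulate two 4-element products.
acc = 0
acc = fused_mac(acc, 0b1011, 0b1101, 4)  # two agreements, two disagreements
acc = fused_mac(acc, 0b1111, 0b1111, 4)  # all four positions agree
print(acc)
```

The point of the sketch is that no multiplication is needed: with one-bit values, the product sum reduces to counting bit agreements, which is why an XNOR followed by a population count can stand in for the multiplier's usual work.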


Original Abstract Submitted

One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a one-bit weight associated with a neural network, as well as an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a fused operation including an exclusive not OR (XNOR) operation and a population count operation. The adder is configured to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.