18368801. EFFICIENT SCALING OF PARTITIONED NEURAL NETWORK INFERENCE (Cisco Technology, Inc.)

From WikiPatents

EFFICIENT SCALING OF PARTITIONED NEURAL NETWORK INFERENCE

Organization Name

Cisco Technology, Inc.

Inventor(s)

Myungjin Lee of Bellevue WA US

Jayanth Srinivasa of San Jose CA US

Ali Payani of Santa Clara CA US

Ramana Rao V.R. Kompella of Foster CA US

This abstract first appeared for US patent application 18368801, titled 'EFFICIENT SCALING OF PARTITIONED NEURAL NETWORK INFERENCE'.

Original Abstract Submitted

In one implementation, a controller determines performance of a partitioned neural network. The controller identifies, based on the performance, a particular partition of the partitioned neural network as a bottleneck. The controller configures a first device to execute a replica of the particular partition. The controller configures a multiplexer that provides an output of the particular partition or the replica of the particular partition as input to a downstream partition of the partitioned neural network.
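
The four steps in the abstract (measure performance, identify the bottleneck partition, configure a replica, configure a multiplexer) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the application: the `Partition`, `Multiplexer`, `find_bottleneck`, and `scale_out` names are hypothetical, per-request latency stands in for "performance", the replica runs in the same process rather than on a separate device, and requests alternate between the two copies in round-robin order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Partition:
    name: str                     # hypothetical label for the partition
    fn: Callable[[float], float]  # stand-in for the partition's layers
    latency_ms: float             # assumed performance metric (per-request latency)


def find_bottleneck(pipeline: List[Partition]) -> Partition:
    """Pick the slowest partition; the selection rule is an assumption."""
    return max(pipeline, key=lambda p: p.latency_ms)


class Multiplexer:
    """Feeds the output of the partition or of its replica to the downstream partition."""

    def __init__(self, sources: List[Partition], downstream: Partition):
        self.sources = sources
        self.downstream = downstream
        self.served_by: List[str] = []  # records which copy handled each request
        self._turn = 0

    def forward(self, x: float) -> float:
        # Round-robin choice between the copies (an assumed policy).
        source = self.sources[self._turn % len(self.sources)]
        self._turn += 1
        self.served_by.append(source.name)
        return self.downstream.fn(source.fn(x))


def scale_out(pipeline: List[Partition], bottleneck: Partition) -> Multiplexer:
    """Create a replica of the bottleneck and wire both copies to the next partition."""
    idx = next(i for i, p in enumerate(pipeline) if p is bottleneck)
    if idx + 1 >= len(pipeline):
        raise ValueError("sketch assumes the bottleneck has a downstream partition")
    replica = Partition(bottleneck.name + "-replica", bottleneck.fn, bottleneck.latency_ms)
    return Multiplexer([bottleneck, replica], pipeline[idx + 1])


# Toy three-partition pipeline with made-up latencies.
pipeline = [
    Partition("p0", lambda x: x + 1, latency_ms=5.0),
    Partition("p1", lambda x: x * 2, latency_ms=20.0),
    Partition("p2", lambda x: x - 3, latency_ms=4.0),
]
bottleneck = find_bottleneck(pipeline)
mux = scale_out(pipeline, bottleneck)
results = [mux.forward(pipeline[0].fn(x)) for x in (1.0, 2.0)]
```

In this toy setup the downstream partition `p2` receives the same values regardless of which copy of `p1` produced them, which is the property the multiplexer is meant to preserve; `mux.served_by` shows that the two requests were handled by different copies.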