18368801. EFFICIENT SCALING OF PARTITIONED NEURAL NETWORK INFERENCE (Cisco Technology, Inc.)
Organization Name
Cisco Technology, Inc.
Inventor(s)
Myungjin Lee of Bellevue WA US
Jayanth Srinivasa of San Jose CA US
Ali Payani of Santa Clara CA US
Ramana Rao V.R. Kompella of Foster CA US
This abstract first appeared for US patent application 18368801, titled 'EFFICIENT SCALING OF PARTITIONED NEURAL NETWORK INFERENCE'.
Original Abstract Submitted
In one implementation, a controller determines performance of a partitioned neural network. The controller identifies, based on the performance, a particular partition of the partitioned neural network as a bottleneck. The controller configures a first device to execute a replica of the particular partition. The controller configures a multiplexer that provides an output of the particular partition or the replica of the particular partition as input to a downstream partition of the partitioned neural network.
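The control loop described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names Partition, Multiplexer, Controller, and scale_out are hypothetical, per-partition latency is assumed to be the performance measure, the bottleneck is assumed to be the slowest partition, and the multiplexer is assumed to alternate round-robin between the original partition and its replica. None of these details are stated in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Partition:
    """One stage of a partitioned network (hypothetical model)."""
    name: str
    fn: Callable[[float], float]   # stand-in for the partition's computation
    latency_ms: float              # assumed performance metric


class Multiplexer:
    """Provides the output of the partition or of its replica downstream.

    Round-robin selection is an assumption made for this sketch.
    """

    def __init__(self, sources: List[Partition]) -> None:
        self.sources = list(sources)
        self._next = 0

    def route(self, x: float) -> Tuple[str, float]:
        source = self.sources[self._next]
        self._next = (self._next + 1) % len(self.sources)
        return source.name, source.fn(x)


class Controller:
    def __init__(self, partitions: List[Partition]) -> None:
        self.partitions = partitions
        self.muxes: Dict[int, Multiplexer] = {}

    def find_bottleneck(self) -> int:
        # Assumption: the partition with the highest latency is the bottleneck.
        return max(range(len(self.partitions)),
                   key=lambda i: self.partitions[i].latency_ms)

    def scale_out(self, device: str) -> int:
        """Replicate the bottleneck on `device` and place a multiplexer after it."""
        idx = self.find_bottleneck()
        original = self.partitions[idx]
        replica = Partition(f"{original.name}@{device}",
                            original.fn, original.latency_ms)
        self.muxes[idx] = Multiplexer([original, replica])
        return idx

    def infer(self, x: float) -> Tuple[float, List[str]]:
        """Run one request through the pipeline, recording which unit ran each stage."""
        trace: List[str] = []
        for i, part in enumerate(self.partitions):
            if i in self.muxes:
                name, x = self.muxes[i].route(x)
            else:
                name, x = part.name, part.fn(x)
            trace.append(name)
        return x, trace


# Toy three-stage pipeline; the middle stage is the slowest.
controller = Controller([
    Partition("p0", lambda v: v + 1, 5.0),
    Partition("p1", lambda v: v * 2, 20.0),
    Partition("p2", lambda v: v - 3, 4.0),
])
idx = controller.scale_out("device-b")
y1, t1 = controller.infer(1.0)
y2, t2 = controller.infer(1.0)
```

In this toy run, consecutive requests produce the same result while the middle stage alternates between the original partition and the replica on the hypothetical "device-b", and the downstream partition p2 receives its input from whichever one the multiplexer selected.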