18257284. SPLIT NEURAL NETWORK ACCELERATION ARCHITECTURE SCHEDULING AND DYNAMIC INFERENCE ROUTING simplified abstract (QUALCOMM Incorporated)

From WikiPatents

SPLIT NEURAL NETWORK ACCELERATION ARCHITECTURE SCHEDULING AND DYNAMIC INFERENCE ROUTING

Organization Name

QUALCOMM Incorporated

Inventor(s)

Vijaya Kumar Kilari of Bangalore (IN)

Fan Wu of Broomfield CO (US)

Geoffrey Carlton Berry of Durham NC (US)

Hemanth Puranik of Bangalore (IN)

SPLIT NEURAL NETWORK ACCELERATION ARCHITECTURE SCHEDULING AND DYNAMIC INFERENCE ROUTING - A simplified explanation of the abstract

This abstract first appeared for US patent application 18257284, titled 'SPLIT NEURAL NETWORK ACCELERATION ARCHITECTURE SCHEDULING AND DYNAMIC INFERENCE ROUTING'.

Simplified Explanation

The abstract describes a method for accelerating machine learning on a computing device by splitting a neural network into sub-neural networks that are then hosted, scheduled, and executed in multiple inference accelerators.

  • Accessing a neural network
  • Splitting the neural network into N sub-neural networks
  • Hosting the N sub-neural networks in M inference accelerators
  • Scheduling the N sub-neural networks in the M inference accelerators
  • Executing the N sub-neural networks in the M inference accelerators
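The five steps above can be sketched in code. This is a minimal illustration, not the patent's implementation: the layer-wise contiguous split and the round-robin placement of sub-networks onto accelerators are assumptions made for the example, and all names (`Accelerator`, `split`, `host`, `run`) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Accelerator:
    """Stand-in for one of the M inference accelerators."""
    name: str
    hosted: list = field(default_factory=list)  # sub-networks placed here

    def execute(self, sub_net, x):
        # Stand-in for running the sub-network's layers on this device.
        for layer in sub_net:
            x = layer(x)
        return x

def split(network, n):
    """Split a list of layers into N contiguous sub-neural networks
    (contiguous layer-wise partitioning is an assumption)."""
    size = -(-len(network) // n)  # ceiling division
    return [network[i:i + size] for i in range(0, len(network), size)]

def host(sub_nets, accelerators):
    """Place each sub-network on an accelerator; round-robin placement
    is an illustrative policy, not one named in the abstract."""
    schedule = []
    for i, sub in enumerate(sub_nets):
        acc = accelerators[i % len(accelerators)]
        acc.hosted.append(sub)
        schedule.append((sub, acc))
    return schedule

def run(schedule, x):
    """Execute the sub-networks in order, chaining intermediate outputs."""
    for sub, acc in schedule:
        x = acc.execute(sub, x)
    return x

# Toy "network" of 5 layers, each a simple function of its input.
network = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3,
           lambda v: v * v, lambda v: v + 10]
accs = [Accelerator("acc0"), Accelerator("acc1")]
subs = split(network, 3)   # N = 3 sub-neural networks
plan = host(subs, accs)    # hosted on M = 2 accelerators
result = run(plan, 1)      # scheduled and executed in order
```

Note that N and M are independent: here three sub-networks share two accelerators, so one accelerator hosts two of them.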

Potential Applications

This technology could be applied in various fields such as image recognition, natural language processing, and autonomous driving systems.

Problems Solved

This technology addresses slow execution of machine learning workloads on a single processor, allowing faster and more efficient execution of neural networks by spreading the work across multiple inference accelerators.

Benefits

The method described in the patent application can significantly improve the execution performance of machine learning workloads, leading to quicker inference results and better utilization of the available accelerator hardware.

Potential Commercial Applications

One potential commercial application of this technology could be in the development of advanced AI systems for industries such as healthcare, finance, and cybersecurity.

Possible Prior Art

Prior art in this field may include research papers or patents related to optimizing neural network execution on computing devices.

What is the impact of this technology on existing machine learning models?

This technology does not alter a model's architecture or weights; rather, it can speed up the execution of existing machine learning models by distributing their sub-networks across multiple inference accelerators.

How does this method compare to other approaches for accelerating machine learning tasks?

Unlike approaches that execute an entire network on a single accelerator, this method partitions the neural network into sub-networks hosted across multiple inference accelerators, enabling scheduled, distributed execution and dynamic routing of inference work, which can yield significant speed improvements.


Original Abstract Submitted

A method for accelerating machine learning on a computing device is described. The method includes accessing a neural network. The method also includes splitting the neural network into N sub-neural networks. The method further includes hosting the N sub-neural networks in M inference accelerators. The method also includes scheduling the N sub-neural networks in the M inference accelerators. The method further includes executing the N sub-neural networks in the M inference accelerators.