18244171. MIXTURE OF EXPERTS NEURAL NETWORKS simplified abstract (GOOGLE LLC)

MIXTURE OF EXPERTS NEURAL NETWORKS

Organization Name

GOOGLE LLC

Inventor(s)

Noam M. Shazeer of Palo Alto CA (US)

Azalia Mirhoseini of Mountain View CA (US)

Krzysztof Stanislaw Maziarz of Jaslo (PL)

MIXTURE OF EXPERTS NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18244171, titled 'MIXTURE OF EXPERTS NEURAL NETWORKS'.

Simplified Explanation

The patent application describes a system containing a neural network with a Mixture of Experts (MoE) subnetwork placed between a first and a second neural network layer. The MoE subnetwork consists of multiple expert neural networks that process the output of the first layer, as summarized below and illustrated in the sketch that follows the list.

  • The MoE subnetwork includes expert neural networks that process the first layer output to generate expert outputs.
  • A gating subsystem selects one or more expert neural networks based on the first layer output and assigns a weight to each selected expert neural network.
  • The first layer output is provided as input to each selected expert neural network.
  • The expert outputs generated by the selected expert neural networks are combined according to their weights to generate an MoE output.
  • The MoE output is then used as input to the second neural network layer.
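Below is a minimal sketch of this forward pass, assuming NumPy, a top-2 gating rule, and single linear maps standing in for the expert networks; all sizes, parameter values, and the top-k choice are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_OUT = 16, 16     # assumed width of the first/second layer interface
NUM_EXPERTS = 4          # assumed number of expert neural networks
TOP_K = 2                # assumed number of experts selected per input

# Each "expert" here is a single linear map for brevity; in the described
# system each expert would be its own neural network.
expert_weights = [rng.normal(size=(D_IN, D_OUT)) for _ in range(NUM_EXPERTS)]

# Gating parameters: map the first layer output to one score per expert.
gate_weights = rng.normal(size=(D_IN, NUM_EXPERTS))

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(first_layer_output):
    """Select experts based on the first layer output, run only the selected
    experts on that output, and combine their outputs by the gating weights."""
    scores = first_layer_output @ gate_weights      # one score per expert
    selected = np.argsort(scores)[-TOP_K:]          # indices of the chosen experts
    weights = softmax(scores[selected])             # normalized weights

    # The first layer output is provided as input to each selected expert only.
    expert_outputs = [first_layer_output @ expert_weights[i] for i in selected]

    # Weighted combination of the selected experts' outputs forms the MoE output,
    # which is then used as input to the second neural network layer.
    return sum(w * out for w, out in zip(weights, expert_outputs))

first_layer_output = rng.normal(size=D_IN)
print(moe_forward(first_layer_output).shape)        # (16,)
```

Because only the selected experts run for a given input, the cost of a forward pass depends on TOP_K rather than on the total number of experts in the subnetwork.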

Potential Applications

  • This technology can be applied in various fields where complex data processing is required, such as image recognition, natural language processing, and speech recognition.
  • It can be used in autonomous vehicles for tasks like object detection and classification.
  • It can be utilized in recommendation systems to provide personalized recommendations based on user preferences.

Problems Solved

  • The Mixture of Experts (MoE) subnetwork enables more efficient and accurate processing of complex data: because the gating subsystem routes each input only to the selected experts, the network can contain many experts without running every expert on every input.
  • It addresses the challenge of handling diverse and complex data by combining the outputs of different expert neural networks, each of which can specialize.
  • The gating subsystem selects the most relevant expert neural networks for each input, improving the overall performance of the system.

Benefits

  • The system improves the accuracy and efficiency of data processing by leveraging the strengths of multiple expert neural networks.
  • It allows for better handling of complex and diverse data by combining the outputs of different expert neural networks.
  • The gating subsystem enhances the adaptability of the system by dynamically selecting the most suitable expert neural networks based on the input.


Original Abstract Submitted

A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.