18400767. COMPOUND MODEL SCALING FOR NEURAL NETWORKS simplified abstract (Google LLC)

COMPOUND MODEL SCALING FOR NEURAL NETWORKS

Organization Name

Google LLC

Inventor(s)

Mingxing Tan of Newark, CA (US)

Quoc V. Le of Sunnyvale, CA (US)

COMPOUND MODEL SCALING FOR NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18400767, titled 'COMPOUND MODEL SCALING FOR NEURAL NETWORKS'.

Simplified Explanation:

The abstract describes a method for determining the final architecture of a neural network for a specific machine learning task. A baseline architecture is scaled up using a budget of extra computational resources, which is distributed across the network's width, depth, and resolution dimensions according to a set of coefficients.

  • The method receives a baseline architecture that has network width, depth, and resolution dimensions.
  • Data defining a compound coefficient is received; this coefficient controls how much extra computational resource is used for scaling.
  • A search is performed to determine baseline width, depth, and resolution coefficients that specify how to assign the extra resources to each dimension.
  • Final width, depth, and resolution coefficients are computed from the baseline coefficients and the compound coefficient.
  • The final architecture is generated by scaling the baseline architecture's dimensions according to these coefficients, as sketched in the example below.
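
The scaling step can be illustrated with a short sketch. The abstract does not specify the exact scaling rule, so the exponential form below (each baseline coefficient raised to the power of the compound coefficient) and all function and parameter names are illustrative assumptions rather than the claimed method.

```python
import math

def scale_architecture(base_width, base_depth, base_resolution,
                       alpha, beta, gamma, phi):
    """Scale a baseline architecture's width, depth, and resolution.

    alpha, beta, gamma: baseline depth, width, and resolution coefficients
        (hypothetically found by a search over the baseline architecture).
    phi: compound coefficient controlling how much extra compute to spend.
    """
    # Assumed rule: raise each baseline coefficient to the power of the
    # compound coefficient, then scale the corresponding dimension.
    depth = math.ceil(base_depth * alpha ** phi)            # more layers
    width = math.ceil(base_width * beta ** phi)             # more channels
    resolution = math.ceil(base_resolution * gamma ** phi)  # larger inputs
    return width, depth, resolution

# Example: hypothetical baseline coefficients scaled with phi = 2.
print(scale_architecture(base_width=64, base_depth=18, base_resolution=224,
                         alpha=1.2, beta=1.1, gamma=1.15, phi=2))
```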

Key Features and Innovation:

  • A method for determining the final architecture of a neural network for a specific machine learning task.
  • Scales a baseline architecture using a controlled budget of extra computational resources.
  • Uses searched coefficients to assign those resources across the network width, depth, and resolution dimensions.
  • Generates the final architecture from the determined coefficients.

Potential Applications:

This technology can be applied in various fields such as:

  • Image recognition
  • Natural language processing
  • Autonomous vehicles
  • Healthcare diagnostics
  • Financial forecasting

Problems Solved:

  • Optimizing neural network architecture for specific machine learning tasks.
  • Efficient utilization of computational resources.
  • Enhancing performance and accuracy of machine learning models.

Benefits:

  • Improved accuracy and performance of machine learning models.
  • Efficient allocation of computational resources.
  • Scalability for different machine learning tasks.

Commercial Applications:

This technology has potential commercial applications in:

  • AI software development companies
  • Research institutions
  • Data analytics firms
  • Autonomous vehicle technology companies
  • Healthcare AI solutions providers

Questions about Neural Network Architecture Optimization for Machine Learning Tasks:

1. How does this method improve the efficiency of neural network architectures for machine learning tasks?
2. What are the potential implications of using this technology in various industries?

Frequently Updated Research:

Stay updated on the latest advancements in neural network architecture optimization for machine learning tasks to ensure the most effective implementation of this technology.


Original Abstract Submitted

A method for determining a final architecture for a neural network to perform a particular machine learning task is described. The method includes receiving a baseline architecture for the neural network, wherein the baseline architecture has a network width dimension, a network depth dimension, and a resolution dimension; receiving data defining a compound coefficient that controls extra computational resources used for scaling the baseline architecture; performing a search to determine a baseline width, depth and resolution coefficient that specify how to assign the extra computational resources to the network width, depth and resolution dimensions of the baseline architecture, respectively; determining a width, depth and resolution coefficient based on the baseline width, depth, and resolution coefficient and the compound coefficient; and generating the final architecture that scales the network width, network depth, and resolution dimensions of the baseline architecture based on the corresponding width, depth, and resolution coefficients.
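
To make the full procedure concrete, here is a minimal sketch of the search step described above. The abstract does not say how the search is carried out or what resource constraint is used; the grid search, the FLOPs-style cost model, the budget of roughly one extra doubling of compute, and all names below are assumptions for illustration only. In the method as described, each candidate would instead be scored by training and evaluating a model scaled with those coefficients on the target task.

```python
import itertools

def relative_cost(alpha, beta, gamma):
    # Assumed cost model: depth scales compute roughly linearly, while width
    # and resolution scale it roughly quadratically (not stated in the abstract).
    return alpha * beta ** 2 * gamma ** 2

def evaluate_candidate(alpha, beta, gamma):
    # Placeholder score: a real search would train and validate a network
    # scaled with these coefficients and return its accuracy.
    return -abs(relative_cost(alpha, beta, gamma) - 2.0)

def search_baseline_coefficients(budget=2.0, slack=0.05):
    """Grid-search baseline width/depth/resolution coefficients whose combined
    cost stays near the budget, keeping the best-scoring candidate."""
    grid = [round(1.0 + 0.05 * i, 2) for i in range(21)]  # 1.00 .. 2.00
    best, best_score = None, float("-inf")
    for alpha, beta, gamma in itertools.product(grid, repeat=3):
        if abs(relative_cost(alpha, beta, gamma) - budget) > slack:
            continue  # candidate falls outside the compute budget
        score = evaluate_candidate(alpha, beta, gamma)
        if score > best_score:
            best, best_score = (alpha, beta, gamma), score
    return best

# The resulting baseline coefficients would then be combined with the compound
# coefficient to scale the baseline architecture, as sketched earlier.
print("baseline coefficients:", search_baseline_coefficients())
```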