18199886. DETERMINING HYPERPARAMETERS USING SEQUENCE GENERATION NEURAL NETWORKS simplified abstract (Google LLC)

From WikiPatents

DETERMINING HYPERPARAMETERS USING SEQUENCE GENERATION NEURAL NETWORKS

Organization Name

Google LLC

Inventor(s)

Yutian Chen of Cambridge (GB)

Xingyou Song of Jersey City NJ (US)

Chansoo Lee of Pittsburgh PA (US)

Zi Wang of Cambridge MA (US)

Qiuyi Zhang of Pittsburgh PA (US)

David Martin Dohan of San Francisco CA (US)

Sagi Perel of Pittsburgh PA (US)

Joao Ferdinando Gomes De Freitas of London (GB)

DETERMINING HYPERPARAMETERS USING SEQUENCE GENERATION NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18199886, titled 'DETERMINING HYPERPARAMETERS USING SEQUENCE GENERATION NEURAL NETWORKS'.

Simplified Explanation

The abstract describes a method for tuning the hyperparameters of a machine learning model using a sequence generation neural network. Here is a simplified explanation of the abstract:

  • The method receives metadata for the training and generates a metadata sequence that represents it.
  • At each of a series of iterations, the method generates one or more trials, each specifying a value for every hyperparameter in a set.
  • For each trial, an input sequence is built that includes the metadata sequence and, for each earlier trial, the hyperparameter values it specified together with a measure of that trial's performance.
  • The input sequence is processed by the sequence generation neural network to produce an output sequence representing the next hyperparameter values to try.
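The iterative loop above can be sketched in Python. This is a minimal illustration, not the patented implementation: the sequence generation neural network is stubbed out with random sampling, and all names (`serialize_trial`, `build_input_sequence`, `tune`, and so on) are hypothetical.

```python
import random

def serialize_trial(params, score):
    # Flatten one earlier trial into tokens: hyperparameter name/value
    # pairs followed by the trial's measured performance.
    tokens = []
    for name, value in sorted(params.items()):
        tokens += [name, str(value)]
    tokens += ["score", f"{score:.3f}"]
    return tokens

def build_input_sequence(metadata_seq, history):
    # The input sequence comprises (i) the metadata sequence and
    # (ii) one sub-sequence per earlier trial.
    seq = list(metadata_seq)
    for params, score in history:
        seq += serialize_trial(params, score)
    return seq

def propose_trial(input_seq, search_space, rng):
    # Stand-in for the sequence generation neural network: here it just
    # samples each hyperparameter uniformly from its range. A real model
    # would decode an output sequence conditioned on input_seq.
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in search_space.items()}

def tune(metadata_seq, search_space, objective, n_iters, seed=0):
    # Run the iterative loop: build the input sequence from the history,
    # propose a trial, evaluate it, and record it for later iterations.
    rng = random.Random(seed)
    history = []
    for _ in range(n_iters):
        input_seq = build_input_sequence(metadata_seq, history)
        params = propose_trial(input_seq, search_space, rng)
        history.append((params, objective(params)))
    return max(history, key=lambda t: t[1])
```

For example, `tune(["dataset", "cifar10"], {"lr": (1e-4, 1e-1)}, objective, 20)` returns the best-scoring trial found in 20 iterations, where each proposal is conditioned on the serialized history of earlier trials.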

Potential applications of this technology:

  • Training machine learning models: This method can be used to train machine learning models by automatically generating and optimizing hyperparameters.
  • Data analysis: The method can be applied to analyze large datasets and find the best hyperparameter values for different models.
  • Optimization algorithms: The approach can be used in various optimization algorithms that require tuning hyperparameters.

Problems solved by this technology:

  • Manual hyperparameter tuning: This method eliminates the need for manual tuning of hyperparameters, which can be time-consuming and error-prone.
  • Trial and error approach: The method automates the process of trying different hyperparameter values and finding the best combination, reducing the need for trial and error.

Benefits of this technology:

  • Efficiency: The method automates the hyperparameter tuning process, saving time and effort for researchers and developers.
  • Improved performance: By optimizing hyperparameters, the method can lead to improved performance of machine learning models.
  • Scalability: The approach can be applied to large datasets and complex models, making it scalable for various applications.


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving metadata for the training, generating a metadata sequence that represents the metadata, at each of a plurality of iterations: generating one or more trials that each specify a respective value for each of a set of hyperparameters, comprising, for each trial: generating an input sequence for the iteration that comprises (i) the metadata sequence and (ii) for any earlier trials, a respective sequence that represents the respective values for the hyperparameters specified by the earlier trial and a measure of performance for the trial, and processing an input sequence for the trial that comprises the input sequence for the iteration using a sequence generation neural network to generate an output sequence that represents respective values for the hyperparameters.
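The abstract ends with an output sequence that "represents respective values for the hyperparameters". One plausible decoding, assuming (hypothetically; the patent does not specify a token format) that the model emits flat name/value token pairs, could look like this:

```python
def decode_output_sequence(tokens, hyperparameter_types):
    # Parse a flat [name, value, name, value, ...] output sequence into
    # typed hyperparameter values, validating each name against the
    # declared search space.
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of name/value tokens")
    values = {}
    for name, raw in zip(tokens[::2], tokens[1::2]):
        if name not in hyperparameter_types:
            raise ValueError(f"unknown hyperparameter: {name}")
        values[name] = hyperparameter_types[name](raw)
    return values
```

For instance, decoding `["lr", "0.01", "batch_size", "64"]` with types `{"lr": float, "batch_size": int}` yields a dictionary of typed values ready to configure the next training trial.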