18506540. ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS simplified abstract (Google LLC)

ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS

Organization Name

Google LLC

Inventor(s)

Georg Heigold of Mountain View CA (US)

Erik Mcdermott of San Francisco CA (US)

Vincent O. Vanhoucke of San Francisco CA (US)

Andrew W. Senior of New York NY (US)

Michiel A. U. Bacchiani of Summit NJ (US)

ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18506540, titled 'ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS'.

Simplified Explanation

The abstract describes a method for training neural-network speech models in which two or more sequence-training models each obtain a batch of training frames representing speech features of training utterances and optimize the network parameters independently of one another.

  • Each sequence-training speech model obtains its own batch of training frames representing speech features of training utterances.
  • Each model obtains the current neural network parameters and determines optimized parameters based on (i) its batch of training frames and (ii) those parameters.
  • Multiple sequence-training speech models repeat these steps concurrently, without waiting on one another, to improve speech recognition training; a minimal sketch follows this list.
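To make the asynchronous flavor concrete, here is a minimal Python sketch assuming a lock-free, Hogwild-style setup: two worker "models" each read the shared parameters, compute a gradient on their own batch, and write updated parameters back without coordinating. The toy squared-error objective and all names here are illustrative assumptions; the abstract does not specify the loss, architecture, or update rule.

```python
import threading
import numpy as np

DIM = 8
params = np.zeros(DIM)  # shared neural-network parameters (toy linear model)


def make_batch(rng, n=32):
    """Synthetic stand-in for a batch of training frames of speech features."""
    x = rng.normal(size=(n, DIM))
    y = x @ np.ones(DIM) + rng.normal(scale=0.1, size=n)
    return x, y


def worker(seed, steps=200, lr=0.01):
    """One sequence-training 'model': repeatedly obtain a batch and the current
    parameters, then write back optimized parameters without locking."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        x, y = make_batch(rng)                      # obtain a batch of training frames
        w = params.copy()                           # obtain current parameters
        grad = 2.0 * x.T @ (x @ w - y) / len(y)     # squared-error gradient (illustrative)
        params[:] = w - lr * grad                   # asynchronous, uncoordinated update


# Two workers mirror the abstract's "first" and "second" models.
threads = [threading.Thread(target=worker, args=(s,)) for s in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("learned parameters:", np.round(params, 2))
```

Because each worker reads and writes the shared vector independently, an update may be computed from slightly stale parameters; that staleness is the trade-off asynchronous optimization accepts in exchange for throughput.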

Potential Applications

This technology can be applied in various fields such as:

  • Speech recognition software development
  • Voice-controlled devices
  • Language translation services

Problems Solved

This technology addresses the following issues:

  • Improving accuracy of speech recognition systems
  • Enhancing the performance of neural networks in speech processing
  • Streamlining the training process for speech models

Benefits

The benefits of this technology include:

  • Increased accuracy and efficiency in speech recognition
  • Enhanced performance of voice-controlled devices
  • Improved user experience in language translation services

Potential Commercial Applications

The potential commercial applications of this technology include:

  • Developing advanced speech recognition software for businesses
  • Integrating voice-controlled features in consumer electronics
  • Providing language translation services for global markets

Possible Prior Art

One possible prior art in this field is the use of deep learning techniques for speech recognition, which has been an active area of research and development in recent years.

Unanswered Questions

How does this method compare to traditional speech recognition training techniques?

This article does not provide a direct comparison between this method and traditional speech recognition training techniques. It would be interesting to know the specific advantages or disadvantages of this approach compared to conventional methods.

What are the potential limitations of this technology in real-world applications?

The article does not address the potential limitations of implementing this technology in real-world scenarios. Understanding the challenges or constraints of using this method in practical settings would be valuable information for further analysis.


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
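Read step by step, the abstract enumerates the same three operations for each model: obtain a batch of training frames, obtain the current neural network parameters, and determine optimized parameters from (i) the batch and (ii) the parameters. The following hypothetical Python sketch mirrors that per-model step; the squared-error gradient is an illustrative stand-in, since the abstract does not name the actual sequence-training criterion.

```python
import numpy as np


def sequence_training_step(params, frames, targets, lr=0.01):
    """One per-model step as enumerated in the abstract:
    a batch of training frames has already been obtained (step 1),
    the current parameters are obtained (step 2), and optimized
    parameters are determined from the batch and parameters (step 3)."""
    w = params.copy()                                          # step 2
    grad = 2.0 * frames.T @ (frames @ w - targets) / len(targets)
    return w - lr * grad                                       # step 3


rng = np.random.default_rng(0)
shared = np.zeros(4)

# First sequence-training speech model: first batch, first parameters.
x1, y1 = rng.normal(size=(16, 4)), rng.normal(size=16)
optimized_first = sequence_training_step(shared, x1, y1)

# Second sequence-training speech model: second batch, second parameters.
x2, y2 = rng.normal(size=(16, 4)), rng.normal(size=16)
optimized_second = sequence_training_step(shared, x2, y2)
```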