18590918. Self-Training With Oracle And Top-Ranked Hypotheses simplified abstract (GOOGLE LLC)

From WikiPatents

Self-Training With Oracle And Top-Ranked Hypotheses

Organization Name

GOOGLE LLC

Inventor(s)

Andrew M. Rosenberg of Brooklyn NY (US)

Murali Karthick Baskar of Mountain View CA (US)

Bhuvana Ramabhadran of Mt. Kisco NY (US)

Self-Training With Oracle And Top-Ranked Hypotheses - A simplified explanation of the abstract

This abstract first appeared for US patent application 18590918, titled 'Self-Training With Oracle And Top-Ranked Hypotheses'.

The method described in the patent application uses an RNN-T (recurrent neural network transducer) model to process the sequence of acoustic frames of each speech recognition training sample, producing an n-best list of hypotheses. For each hypothesis in the list, it counts the word errors relative to the ground-truth transcription.

  • The method calculates a first loss for the top-ranked hypothesis in the n-best list based on the ground-truth transcription.
  • An oracle hypothesis is identified as the one with the fewest word errors in the n-best list, and a second loss is determined for this hypothesis based on the ground-truth transcription.
  • A self-training combined loss is calculated based on the first and second losses, and the model is trained using this combined loss.
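The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how each loss is computed or how the two losses are combined, so the per-hypothesis `loss` values are taken as given inputs and the combination is assumed to be a weighted sum. The names `Hypothesis`, `word_errors`, `self_training_combined_loss`, and `weight` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def word_errors(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Word-level edit distance: substitutions + insertions + deletions."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,         # extra hypothesis word (insertion error)
                curr[j - 1] + 1,     # missing reference word (deletion error)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


@dataclass
class Hypothesis:
    text: str
    loss: float  # assumption: a loss already computed against the ground truth


def self_training_combined_loss(
    nbest: List[Hypothesis], reference: str, weight: float = 0.5
) -> Tuple[float, int, List[int]]:
    """Return (combined loss, oracle index, word errors per hypothesis).

    `nbest` is assumed to be ordered by model score, best first.
    """
    ref_words = reference.split()
    errors = [word_errors(h.text.split(), ref_words) for h in nbest]

    top_ranked = nbest[0]
    # Oracle: fewest word errors; ties go to the higher-ranked hypothesis.
    oracle_index = min(range(len(nbest)), key=errors.__getitem__)
    oracle = nbest[oracle_index]

    # Assumption: weighted sum. The abstract only says the combined loss
    # is "based on" the first and second losses.
    combined = weight * top_ranked.loss + (1.0 - weight) * oracle.loss
    return combined, oracle_index, errors


if __name__ == "__main__":
    nbest = [
        Hypothesis("the cat sat on a mat", loss=2.0),
        Hypothesis("the cat sat on the mat", loss=1.0),
        Hypothesis("a cat sat on mat", loss=3.0),
    ]
    print(self_training_combined_loss(nbest, "the cat sat on the mat"))
```

In this sketch the top-ranked hypothesis and the oracle can be the same entry, in which case the combined loss reduces to that single hypothesis's loss.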

Key Features and Innovation:

  • Utilizes an RNN-T model for processing acoustic frames in speech recognition training.
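  • Selects an oracle hypothesis, the n-best entry with the fewest word errors relative to the ground-truth transcription, for use in training.
  • Combines the losses for the top-ranked and oracle hypotheses into a single self-training combined loss that is used to train the model.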

Potential Applications:

  • Speech recognition systems
  • Language translation applications
  • Voice-controlled devices

Problems Solved:

  • Improving speech recognition accuracy
  • Enhancing training efficiency for RNN-T models

Benefits:

  • Higher accuracy in speech recognition
  • More efficient model training process
  • Enhanced performance in language translation tasks

Commercial Applications:

  • Optimizing speech recognition software for improved user experience
  • Developing advanced language translation tools for commercial use

Prior Art: Prior research in RNN-T models for speech recognition and language processing.

Frequently Updated Research: Ongoing studies on improving speech recognition systems using neural network models.

Questions about the Technology:

  1. How does the use of an oracle hypothesis improve model training in speech recognition?
  2. What are the potential implications of the self-training combined loss on speech recognition accuracy?


Original Abstract Submitted

A method includes, for each training sample of a plurality of training samples, processing, using an RNN-T model, a corresponding sequence of acoustic frames to obtain an n-best list of speech recognition hypotheses, and, for each speech recognition hypothesis of the n-best list, determining a corresponding number of word errors relative to a corresponding ground-truth transcription. For a top-ranked hypothesis from the n-best list, the method includes determining a first loss based on the corresponding ground-truth transcription. The method includes identifying, as an oracle hypothesis, the speech recognition hypothesis from the n-best list having the smallest corresponding number of word errors relative to the corresponding ground-truth transcription, and determining a second loss for the oracle hypothesis based on the corresponding ground-truth transcription. The method includes determining a corresponding self-training combined loss based on the first and second losses, and training the model based on the corresponding self-training combined loss.
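The abstract can be restated in notation. The symbols below are introduced here for illustration and do not appear in the application. For one training sample, let $H = (h_1, \dots, h_n)$ be the n-best list in ranked order, $y$ the ground-truth transcription, and $e(h, y)$ the number of word errors of $h$ relative to $y$. The abstract states only that the combined loss is based on the two losses, so the combining function $f$ is left unspecified; the weighted sum with $\lambda \in [0, 1]$ is one possible choice, shown purely as an assumption.

```latex
\begin{align*}
  h^{*} &= \operatorname*{arg\,min}_{h \in H} \; e(h, y)
    && \text{(oracle hypothesis)} \\
  \mathcal{L}_{1} &= \mathcal{L}(h_{1}, y)
    && \text{(first loss, top-ranked hypothesis)} \\
  \mathcal{L}_{2} &= \mathcal{L}(h^{*}, y)
    && \text{(second loss, oracle hypothesis)} \\
  \mathcal{L}_{\text{comb}} &= f(\mathcal{L}_{1}, \mathcal{L}_{2}),
    \quad \text{e.g. } \lambda \mathcal{L}_{1} + (1 - \lambda)\, \mathcal{L}_{2}
    && \text{(self-training combined loss)}
\end{align*}
```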