20240013782. History-Based ASR Mistake Corrections simplified abstract (Google LLC)

From WikiPatents

History-Based ASR Mistake Corrections

Organization Name

Google LLC

Inventor(s)

Patrick Siegler of Zurich (CH)

Aurélien Boffy of Basel (CH)

Ágoston Weisz of Zurich (CH)

History-Based ASR Mistake Corrections - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240013782 titled 'History-Based ASR Mistake Corrections'.

Simplified Explanation

The patent application describes a method for processing follow-on audio data captured by a digital assistant-enabled device. This follow-on audio data corresponds to a query spoken by the user after submitting a previous query. The method involves using a speech recognizer to generate multiple candidate hypotheses for the follow-on query. Each candidate hypothesis represents a possible transcription for the query. The method also determines a similarity metric between the previous query and each candidate hypothesis, and based on these metrics, determines a transcription of the follow-on query spoken by the user.

  • The method involves receiving follow-on audio data from a user of a digital assistant-enabled device.
  • The follow-on audio data corresponds to a query spoken by the user after submitting a previous query.
  • The method uses a speech recognizer to process the follow-on audio data and generate multiple candidate hypotheses.
  • Each candidate hypothesis represents a possible transcription for the follow-on query.
  • The method determines a similarity metric between the previous query and each candidate hypothesis.
  • Based on the similarity metrics, the method determines a transcription of the follow-on query spoken by the user.
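The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the abstract does not say which similarity metric is used or how the metrics are turned into a final transcription. Here the metric is a term-level `difflib.SequenceMatcher` ratio, and the hypothetical `choose_transcription` function blends it with a hypothetical recognizer confidence (`asr_score`) through an assumed `weight` parameter.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Hypothesis:
    terms: list[str]   # sequence of hypothesized terms for the follow-on query
    asr_score: float   # assumed recognizer confidence in [0, 1] (not in the abstract)


def similarity(previous_query: list[str], hypothesis_terms: list[str]) -> float:
    """Term-level similarity in [0, 1]; a stand-in for the unspecified metric."""
    return SequenceMatcher(None, previous_query, hypothesis_terms).ratio()


def choose_transcription(previous_query: list[str],
                         hypotheses: list[Hypothesis],
                         weight: float = 0.5) -> list[str]:
    """Return the terms of the hypothesis with the best blended score.

    The linear blend of recognizer confidence and similarity is an assumption
    made for this sketch; weight=0 ignores the previous query entirely.
    """
    def blended(h: Hypothesis) -> float:
        return (1 - weight) * h.asr_score + weight * similarity(previous_query, h.terms)

    return max(hypotheses, key=blended).terms


# Made-up example data: the recognizer slightly prefers "beetles" on its own.
previous = ["play", "songs", "by", "the", "beatles"]
candidates = [
    Hypothesis(["play", "more", "by", "the", "beetles"], asr_score=0.52),
    Hypothesis(["play", "more", "by", "the", "beatles"], asr_score=0.48),
]
print(" ".join(choose_transcription(previous, candidates)))
```

In this made-up example the recognizer alone would pick the "beetles" hypothesis, while the candidate sharing more terms with the previous query wins once similarity is taken into account. A real system would likely use a different metric and combination rule.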

Potential applications of this technology:

  • Improved voice recognition and transcription capabilities for digital assistants.
  • Enhanced user experience by accurately transcribing follow-on queries.
  • Streamlined interaction with digital assistants, allowing users to seamlessly continue conversations.

Problems solved by this technology:

  • Inaccurate transcription of follow-on queries spoken by users.
  • Difficulty in understanding and processing follow-on queries in the context of previous queries.
  • Recognition errors caused by noisy environments or varying speech patterns.

Benefits of this technology:

  • Improved accuracy and efficiency in transcribing follow-on queries.
  • Enhanced user satisfaction and productivity with digital assistants.
  • Better understanding and context-awareness of user queries for personalized assistance.


Original Abstract Submitted

A method includes receiving follow-on audio data captured by an assistant-enabled device, the follow-on audio data corresponding to a follow-on query spoken by a user of the assistant-enabled device to a digital assistant subsequent to the user submitting a previous query to the digital assistant. The method also includes processing, using a speech recognizer, the follow-on audio data to generate multiple candidate hypotheses, each candidate hypothesis corresponding to a candidate transcription for the follow-on query and represented by a respective sequence of hypothesized terms. For each corresponding candidate hypothesis among the multiple candidate hypotheses, the method also includes determining a corresponding similarity metric between the previous query and the corresponding candidate hypothesis and determining a transcription of the follow-on query spoken by the user based on the similarity metrics determined for the multiple candidate hypotheses.