17958887. IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER simplified abstract (GOOGLE LLC)

From WikiPatents

IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER

Organization Name

GOOGLE LLC

Inventor(s)

Rajiv Mathews of Sunnyvale CA (US)

Rohit Prabhavalkar of Santa Clara CA (US)

Giovanni Motta of San Jose CA (US)

Mingqing Chen of Saratoga CA (US)

Lillian Zhou of Mountain View CA (US)

Dhruv Guliani of San Francisco CA (US)

Harry Zhang of Sunnyvale CA (US)

Trevor Strohman of Sunnyvale CA (US)

Françoise Beaufays of Mountain View CA (US)

IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER - A simplified explanation of the abstract

This abstract first appeared for US patent application 17958887, titled 'IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER'.

Simplified Explanation

The patent application describes a system for identifying and correcting misrecognitions in automatic speech recognition (ASR) systems. Here are the key points of the innovation:

  • On-device processors generate a predicted textual segment for a user's spoken utterance and receive further input that modifies it to an alternate textual segment.
  • The predicted and alternate segments are stored on the device as a candidate correction pair and transmitted to a remote system.
  • Remote processors determine whether the candidate correction pair is an actual correction pair and, if so, cause client devices to generate updates for a global ASR model.
  • The global ASR model is then distributed to those client devices and/or additional client devices.
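The steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the class names (`CandidateCorrectionPair`, `OnDeviceStore`, `RemoteValidator`) are hypothetical, and the validation rule (a pair counts as "actual" once enough distinct devices report it) is an assumption, since the abstract does not say how the remote system makes that determination.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateCorrectionPair:
    predicted: str  # text the on-device ASR model produced
    alternate: str  # text the user changed it to


class OnDeviceStore:
    """Hypothetical on-device storage for candidate correction pairs."""

    def __init__(self):
        self._pairs = []

    def record_edit(self, predicted, alternate):
        # Ignore edits that only change case or surrounding whitespace.
        if predicted.strip().lower() != alternate.strip().lower():
            self._pairs.append(CandidateCorrectionPair(predicted, alternate))

    def pending_pairs(self):
        # Return a copy so the caller cannot mutate local storage.
        return list(self._pairs)


class RemoteValidator:
    """Assumed rule: a pair is 'actual' once min_devices distinct devices report it."""

    def __init__(self, min_devices=2):
        self.min_devices = min_devices
        self._reporters = defaultdict(set)

    def receive(self, device_id, pairs):
        for pair in pairs:
            self._reporters[pair].add(device_id)

    def actual_pairs(self):
        return {
            pair
            for pair, devices in self._reporters.items()
            if len(devices) >= self.min_devices
        }
```

Each device would call `record_edit` whenever the user modifies a transcription, then send `pending_pairs()` to the remote system. Keying the validator on distinct device IDs, rather than raw report counts, keeps a single device from promoting its own idiosyncratic edits.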

Potential Applications

This technology can be applied in various fields such as customer service, virtual assistants, transcription services, and language learning platforms.

Problems Solved

1. Improves the accuracy of ASR systems by identifying and correcting misrecognitions.
2. Enhances user experience by providing more accurate transcriptions and responses.

Benefits

1. Increased efficiency in speech recognition tasks.
2. Improved user satisfaction with ASR technology.
3. Enhanced performance of ASR systems over time.

Potential Commercial Applications

One commercial application is optimizing ASR technology for customer service centers to improve call transcription accuracy and response quality.

Possible Prior Art

Possible prior art includes the general use of machine learning algorithms to improve ASR accuracy; the specific approach of identifying and correcting misrecognitions in a decentralized manner may be novel.

Unanswered Questions

How does the system handle different languages and accents in speech recognition?

The abstract does not provide details on how the system adapts to various languages and accents during the correction process.

What security measures are in place to protect user data during the correction process?

The abstract does not address the security protocols used to safeguard user data while candidate correction pairs are stored and transmitted.


Original Abstract Submitted

Implementations described herein identify and correct automatic speech recognition (ASR) misrecognitions. For example, on-device processor(s) of a client device may generate a predicted textual segment that is predicted to correspond to spoken utterance of a user of the client device, and may receive further input that modifies the predicted textual segment to an alternate textual segment. Further, the on-device processor(s) may store these textual segments in on-device storage as a candidate correction pair, and transmit the candidate correction pair to a remote system. Moreover, remote processor(s) of the remote system may determine that the candidate correction pair is an actual correction pair, and may cause client devices to generate updates for a global ASR model for the candidate correction pair. Additionally, the remote processor(s) may distribute the global ASR model to the client devices and/or additional client devices.
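The abstract says that client devices generate updates for a global ASR model, but it does not say how those updates are combined. The sketch below assumes the simplest option, an element-wise average of per-client update vectors applied to the global weights, in the spirit of federated averaging. The function names, the flat list-of-floats weight representation, and the `learning_rate` parameter are all hypothetical.

```python
def average_updates(client_updates):
    """Element-wise mean of equally sized update vectors (assumed aggregation rule)."""
    if not client_updates:
        raise ValueError("need at least one client update")
    n = len(client_updates)
    return [sum(values) / n for values in zip(*client_updates)]


def apply_update(global_weights, update, learning_rate=1.0):
    """Return new global weights; the input lists are left unmodified."""
    return [w + learning_rate * u for w, u in zip(global_weights, update)]
```

Under this assumption, only the update vectors leave each device; the global model produced by `apply_update` is what would then be distributed back to client devices.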