IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER

Organization Name

Google LLC

Inventor(s)

Rajiv Mathews of Sunnyvale CA (US)

Rohit Prabhavalkar of Santa Clara CA (US)

Giovanni Motta of San Jose CA (US)

Mingqing Chen of Saratoga CA (US)

Lillian Zhou of Mountain View CA (US)

Dhruv Guliani of San Francisco CA (US)

Harry Zhang of Sunnyvale CA (US)

Trevor Strohman of Sunnyvale CA (US)

Françoise Beaufays of Mountain View CA (US)

IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240112673, titled 'IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER'.

Simplified Explanation

The patent application describes a system for identifying and correcting misrecognitions in automatic speech recognition (ASR) systems. Here is a simplified explanation of the abstract:

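On-device processors of a client device generate a predicted textual segment for a user's spoken utterance. When further user input changes that segment to an alternate textual segment, the two segments are stored on the device as a candidate correction pair.
The client device transmits the candidate correction pair to a remote system, which determines whether it is an actual correction pair and, if so, causes client devices to generate updates for a global ASR model.
The remote system distributes the global ASR model to those client devices and/or additional client devices.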
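
The flow above can be sketched in a few lines of Python. This is an illustrative sketch only, not the method claimed in the application: the names CandidateCorrectionPair, OnDeviceStore and actual_correction_pairs are hypothetical, and the rule used to promote a candidate pair to an actual pair (at least min_devices distinct devices reported it) is an assumption, since the abstract does not say how that determination is made.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateCorrectionPair:
    predicted: str  # textual segment the on-device ASR model produced
    alternate: str  # textual segment after the user's further input


class OnDeviceStore:
    """Hypothetical stand-in for on-device storage of candidate pairs."""

    def __init__(self):
        self._pairs = []

    def record(self, predicted, alternate):
        # Only an actual modification yields a candidate correction pair.
        if predicted != alternate:
            self._pairs.append(CandidateCorrectionPair(predicted, alternate))

    def pending(self):
        # Pairs the device would transmit to the remote system.
        return list(self._pairs)


def actual_correction_pairs(pairs_by_device, min_devices=2):
    """Assumed rule: a pair is 'actual' once enough distinct devices report it."""
    devices_per_pair = Counter()
    for pairs in pairs_by_device.values():
        for pair in set(pairs):  # count each device at most once per pair
            devices_per_pair[pair] += 1
    return {pair for pair, n in devices_per_pair.items() if n >= min_devices}


phone = OnDeviceStore()
phone.record("text jon", "text John")
phone.record("set a timer", "set a timer")  # unchanged, so nothing is stored

tablet = OnDeviceStore()
tablet.record("text jon", "text John")
tablet.record("play jazz", "play chess")

confirmed = actual_correction_pairs(
    {"phone": phone.pending(), "tablet": tablet.pending()}
)
```

In this sketch only the pair reported by both devices is confirmed. A pair seen on a single device stays a candidate, which is one simple way to separate systematic misrecognitions from one-off edits.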

Potential Applications

This technology could be applied in:

Improving the accuracy of voice assistants
Enhancing transcription services

Problems Solved

Correcting misrecognitions in ASR systems
Updating ASR models for better performance

Benefits

Enhanced user experience with accurate speech recognition
Increased efficiency in transcribing spoken language

Potential Commercial Applications

Voice-controlled devices
Transcription software services

Possible Prior Art

Possible prior art includes earlier uses of machine learning algorithms to improve ASR accuracy.

Unanswered Questions

How does this technology handle different languages or accents in speech recognition?

The abstract does not address how the system adapts to different languages or accents.

What measures are in place to ensure data privacy and security when transmitting textual segments to remote systems?

The abstract does not describe any security protocols for transmitting textual segments, which may be sensitive, to the remote system.

Original Abstract Submitted

Implementations described herein identify and correct automatic speech recognition (ASR) misrecognitions. For example, on-device processor(s) of a client device may generate a predicted textual segment that is predicted to correspond to spoken utterance of a user of the client device, and may receive further input that modifies the predicted textual segment to an alternate textual segment. Further, the on-device processor(s) may store these textual segments in on-device storage as a candidate correction pair, and transmit the candidate correction pair to a remote system. Moreover, remote processor(s) of the remote system may determine that the candidate correction pair is an actual correction pair, and may cause client devices to generate updates for a global ASR model for the candidate correction pair. Additionally, the remote processor(s) may distribute the global ASR model to the client devices and/or additional client devices.
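
The abstract says that client devices generate updates for the global ASR model but does not say how those updates are combined. One common way to do this in decentralized training is a weighted average of per-device updates, and the sketch below illustrates that idea under that assumption. The function names local_update and aggregate, the flat list of weights, and the learning rate are all hypothetical and are not taken from the application.

```python
def local_update(local_gradient, learning_rate=0.1):
    """Delta a device might compute from its own correction pairs.

    The gradient is supplied directly here; a real device would derive it
    from on-device data, which never leaves the device in this sketch.
    """
    return [-learning_rate * g for g in local_gradient]


def aggregate(global_weights, client_updates, client_weights):
    """Apply the weighted average of client deltas to the global weights."""
    total = sum(client_weights)
    if total <= 0:
        raise ValueError("need at least one client with positive weight")
    new_weights = list(global_weights)
    for update, weight in zip(client_updates, client_weights):
        for i, delta in enumerate(update):
            new_weights[i] += (weight / total) * delta
    return new_weights


global_weights = [0.0, 1.0]
updates = [local_update([1.0, -2.0]), local_update([3.0, 0.0])]
# Weight each device, for example by how many correction examples it used.
new_global = aggregate(global_weights, updates, client_weights=[1, 3])
```

Only the deltas are sent to the aggregation step in this sketch. The resulting new_global weights are what the remote system would then distribute to client devices.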

Google LLC (20240112673). IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER simplified abstract

