Google LLC (20240379095). CONTEXTUAL BIASING FOR SPEECH RECOGNITION simplified abstract
CONTEXTUAL BIASING FOR SPEECH RECOGNITION
Organization Name
Google LLC
Inventor(s)
Rohit Prakash Prabhavalkar of Santa Clara CA (US)
Golan Pundak of New York NY (US)
Tara N. Sainath of Jersey City NJ (US)
CONTEXTUAL BIASING FOR SPEECH RECOGNITION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240379095, titled 'CONTEXTUAL BIASING FOR SPEECH RECOGNITION'.
The method receives audio data that encodes an utterance, together with a set of bias phrases relevant to the context of that utterance. A speech recognition model then processes acoustic features derived from the audio, and its output is used to produce a transcript of the utterance.
- The method receives audio data encoding an utterance and obtains a set of bias phrases corresponding to the context of the utterance.
- Each bias phrase consists of one or more words.
- Acoustic features derived from the audio are processed using a speech recognition model.
- The speech recognition model includes a first encoder for the acoustic features, a bias encoder for the obtained bias phrases, and a decoder that determines likelihoods of speech element sequences from the output of a first attention module and the output of a bias attention module.
- The transcript for the utterance is determined based on the likelihoods of speech element sequences.
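The data flow in the steps above can be sketched for a single decoder step. This is only an illustrative toy under stated assumptions, not the model from the application: the abstract does not specify the attention mechanism or how the two attention outputs are combined, so dot-product attention and simple concatenation are assumed here, and every name and number (`attend`, `decoder_step`, `output_proj`, the 2-dimensional vectors) is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys):
    """Dot-product attention (an assumption): returns weights and the weighted-sum context."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    context = [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
    return weights, context

def decoder_step(state, audio_enc, bias_enc, output_proj):
    """One decoder step: attend to audio and to bias phrases, then score speech elements."""
    _, audio_ctx = attend(state, audio_enc)      # stands in for the first attention module
    bias_w, bias_ctx = attend(state, bias_enc)   # stands in for the bias attention module
    features = audio_ctx + bias_ctx              # list concatenation (assumed combination)
    logits = [dot(row, features) for row in output_proj]
    return softmax(logits), bias_w

# Hypothetical toy values standing in for learned encoder outputs.
audio_enc = [[1.0, 0.0], [0.0, 1.0]]   # first encoder output, one vector per audio frame
bias_enc = [[2.0, 0.0], [0.0, 2.0]]    # bias encoder output, one vector per bias phrase
state = [1.0, 0.0]                     # current decoder state
output_proj = [[1.0, 0.0, 1.0, 0.0],   # one row of weights per speech element
               [0.0, 1.0, 0.0, 1.0]]

probs, bias_weights = decoder_step(state, audio_enc, bias_enc, output_proj)
```

In this toy setup the decoder state aligns with the first bias phrase, so that phrase receives the larger attention weight and the speech element associated with it receives the higher likelihood.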
Potential Applications:
- Speech recognition technology
- Context-aware audio processing
- Biasing recognition toward phrases relevant to the current context
Problems Solved:
- Improving accuracy in speech recognition
- Making use of contextual information when processing audio data
- Recognizing context-specific words and phrases that might otherwise be transcribed incorrectly
Benefits:
- Increased accuracy in transcribing audio data
- Improved use of context when interpreting utterances
- Bias phrases that directly inform the likelihoods the decoder assigns to speech elements
Commercial Applications:
Speech recognition with contextual biasing could be used in industries such as:
- Customer service, for accurate transcription of customer calls
- Legal transcription services, for precise documentation
- Educational institutions, for improved language processing in learning tools
Questions about the technology:
1. How does the output of the bias attention module affect the likelihoods the decoder assigns to speech elements?
2. What are the potential implications of using bias phrases in speech recognition models?
Frequently Updated Research: Follow advancements in speech recognition and contextual biasing that aim to improve the efficiency and accuracy of audio transcription.
Original Abstract Submitted
A method includes receiving audio data encoding an utterance and obtaining a set of bias phrases corresponding to a context of the utterance. Each bias phrase includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio to generate an output from the speech recognition model. The speech recognition model includes a first encoder configured to receive the acoustic features, a bias encoder configured to receive data indicating the obtained set of bias phrases, a bias encoder, and a decoder configured to determine likelihoods of sequences of speech elements based on output of the first attention module and output of the bias attention module. The method also includes determining a transcript for the utterance based on the likelihoods of sequences of speech elements.
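The last step of the abstract, determining a transcript from the likelihoods of sequences of speech elements, can be illustrated with a small sketch. The abstract does not say how the transcript is selected, so choosing the candidate with the highest total log-likelihood is an assumption made here; the candidate texts, the probabilities, and the helper names `sequence_log_likelihood` and `pick_transcript` are hypothetical.

```python
import math

def sequence_log_likelihood(step_probs):
    """Sum of log-probabilities of the speech element chosen at each step."""
    return sum(math.log(p) for p in step_probs)

def pick_transcript(hypotheses):
    """Return the text of the hypothesis with the highest sequence log-likelihood."""
    return max(hypotheses, key=lambda h: sequence_log_likelihood(h[1]))[0]

# Hypothetical candidates: (text, per-element probabilities from the decoder).
# The second element of "call john" scores higher, as it might if "john"
# appeared in the set of bias phrases.
hypotheses = [
    ("call john", [0.9, 0.5]),
    ("call joan", [0.9, 0.3]),
]

best = pick_transcript(hypotheses)
```

Working in log space avoids numerical underflow when many per-element probabilities are multiplied together for longer sequences.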