Google LLC (20240161732). MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION simplified abstract
MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION
Organization Name
Google LLC
Inventor(s)
Zhifeng Chen of Sunnyvale CA (US)
Bo Li of Santa Clara CA (US)
Eugene Weinstein of New York NY (US)
Yonghui Wu of Fremont CA (US)
Pedro J. Moreno Mengibar of Jersey City NJ (US)
Ron J. Weiss of New York NY (US)
Khe Chai Sim of Dublin CA (US)
Tara N. Sainath of Jersey City NJ (US)
Patrick An Phu Nguyen of Palo Alto CA (US)
MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240161732, titled 'MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION'.
Simplified Explanation
The patent application describes methods, systems, and apparatus for speech recognition using multi-dialect and multilingual models. In simple terms, a trained model recognizes speech across multiple languages and dialects. The process works as follows:
- Audio data of an utterance is received.
- Input features based on the audio data are provided to a speech recognition model trained to recognize multiple languages or dialects.
- The speech recognition model outputs scores indicating the likelihood of linguistic units for each language or dialect.
- The model may have been trained using cluster adaptive training.
- A transcription of the utterance is generated based on the output of the speech recognition model.
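The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation described in the patent application: the unit inventory `UNITS`, the toy feature extractor, the `ToyMultiDialectModel` class, and the greedy decoder are all hypothetical stand-ins. The abstract does not say how a final transcription is chosen among dialects, so the sketch simply returns one candidate per dialect.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical unit inventory: "_" is a blank symbol, the rest are graphemes.
UNITS: List[str] = ["_", "h", "i"]


def extract_features(audio: Sequence[float], frame_size: int = 4) -> List[List[float]]:
    """Toy input features: mean and energy of each non-overlapping frame."""
    features = []
    for start in range(0, len(audio) - frame_size + 1, frame_size):
        frame = audio[start:start + frame_size]
        mean = sum(frame) / frame_size
        energy = sum(x * x for x in frame) / frame_size
        features.append([mean, energy])
    return features


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMultiDialectModel:
    """Stand-in for a trained model: one linear layer per dialect (assumption)."""

    def __init__(self, weights: Dict[str, List[List[float]]]) -> None:
        # weights[dialect] has one row per entry in UNITS.
        self.weights = weights

    def score(self, features: List[List[float]]) -> Dict[str, List[List[float]]]:
        """For each dialect, return per-frame likelihoods over UNITS."""
        scores = {}
        for dialect, matrix in self.weights.items():
            scores[dialect] = [
                softmax([sum(w * f for w, f in zip(row, frame)) for row in matrix])
                for frame in features
            ]
        return scores


def greedy_decode(frame_probs: List[List[float]]) -> str:
    """Pick the best unit per frame, collapse repeats, and drop blanks."""
    output = []
    previous = None
    for probs in frame_probs:
        best = max(range(len(probs)), key=probs.__getitem__)
        if best != previous and UNITS[best] != "_":
            output.append(UNITS[best])
        previous = best
    return "".join(output)


def transcribe(audio: Sequence[float], model: ToyMultiDialectModel) -> Dict[str, str]:
    """Receive audio, compute features, score them, decode one text per dialect."""
    features = extract_features(audio)
    scores = model.score(features)
    return {dialect: greedy_decode(probs) for dialect, probs in scores.items()}
```

A production system would replace the toy features with standard acoustic features and the linear layers with a trained neural network; only the flow of data (audio, features, per-dialect scores, transcription) mirrors the abstract.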
Potential Applications
This technology can be applied in various fields such as language translation, voice-controlled devices, and transcription services.
Problems Solved
1. Overcoming language barriers in speech recognition.
2. Improving accuracy and efficiency in transcribing multilingual content.
Benefits
1. Enhanced communication across different languages.
2. Increased accessibility for non-native speakers.
3. Improved transcription accuracy for multilingual content.
Potential Commercial Applications
Optimizing customer service chatbots for multilingual support.
Possible Prior Art
One possible example of prior art is the use of language models in speech recognition systems to improve accuracy and performance.
Unanswered Questions
How does the technology handle accents and regional dialects?
The abstract states that the model outputs scores for multiple languages or dialects, but it does not explain how accents and regional variations within a dialect are handled.
What is the computational complexity of the multi-dialect and multilingual models?
The abstract does not provide information on the computational resources required to implement the technology.
Original Abstract Submitted
Methods, systems, and apparatus, including computer programs encoded on a computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output score indicating the likelihood of linguistic units for each of multiple different language or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
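The abstract mentions cluster adaptive training without describing it. In the general technique, a model's parameters for a given condition (here, a language or dialect) are expressed as a weighted combination of a small set of shared cluster bases. The sketch below illustrates only that general idea; the base matrices, the dialect names, and the interpolation weights are made-up values, and the patent application may formulate the training differently.

```python
from typing import Dict, List

Matrix = List[List[float]]


def interpolate_bases(bases: List[Matrix], weights: List[float]) -> Matrix:
    """Combine shared cluster bases into one matrix: W = sum_k weights[k] * bases[k]."""
    if len(bases) != len(weights):
        raise ValueError("need exactly one interpolation weight per cluster base")
    rows, cols = len(bases[0]), len(bases[0][0])
    return [
        [sum(w * base[r][c] for w, base in zip(weights, bases)) for c in range(cols)]
        for r in range(rows)
    ]


def apply_layer(matrix: Matrix, inputs: List[float]) -> List[float]:
    """Multiply the dialect-specific matrix by an input feature vector."""
    return [sum(m * x for m, x in zip(row, inputs)) for row in matrix]


# Hypothetical cluster bases shared by every dialect.
BASES: List[Matrix] = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

# Hypothetical per-dialect interpolation weights (learned during training in practice).
DIALECT_WEIGHTS: Dict[str, List[float]] = {
    "en-US": [1.0, 0.0],
    "en-GB": [0.5, 0.5],
}


def dialect_layer_output(dialect: str, inputs: List[float]) -> List[float]:
    """Build the layer for one dialect from the shared bases and apply it."""
    matrix = interpolate_bases(BASES, DIALECT_WEIGHTS[dialect])
    return apply_layer(matrix, inputs)
```

Because only the short weight vector differs per dialect, most parameters are shared, which is the usual motivation for this family of techniques.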
- Google llc
- Zhifeng Chen of Sunnyvale CA (US)
- Bo Li of Santa Clara CA (US)
- Eugene Weinstein of New York NY (US)
- Yonghui Wu of Fremont CA (US)
- Pedro J. Moreno Mengibar of Jersey City NJ (US)
- Ron J. Weiss of New York NY (US)
- Khe Chai Sim of Dublin CA (US)
- Tara N. Sainath of Jersey City NJ (US)
- Patrick An Phu Nguyen of Palo Alto CA (US)
- G10L15/00
- G10L15/07
- G10L15/16