Google LLC (20240420687). TWO-PASS END TO END SPEECH RECOGNITION
TWO-PASS END TO END SPEECH RECOGNITION
Organization Name
Google LLC
Inventor(s)
Tara N. Sainath of Jersey City NJ (US)
Yanzhang He of Palo Alto CA (US)
Arun Narayanan of Milpitas CA (US)
Ruoming Pang of New York NY (US)
Antoine Jean Bruguier of Milpitas CA (US)
Shuo-yiin Chang of Mountain View CA (US)
TWO-PASS END TO END SPEECH RECOGNITION
This abstract first appeared for US patent application 20240420687 titled 'TWO-PASS END TO END SPEECH RECOGNITION'
Original Abstract Submitted
Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transformer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
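The control flow described in the abstract can be illustrated with a small Python sketch. This is not the patented implementation: the function `two_pass_recognize`, the `Hypothesis` type, and the three `toy_*` stand-ins are hypothetical names introduced here, and the stand-ins are trivial placeholders rather than real RNN-T or LAS networks. The sketch also assumes that "revising" the candidates means rescoring them and picking the best one, which is only one possible reading of the abstract. What it shows is the shape of the pipeline: one shared encoder runs per frame, the first pass emits streaming candidates from the encoder output so far, and the second pass reuses the same encoder output to choose the final text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple

Frame = Sequence[float]
Encoding = List[float]


@dataclass
class Hypothesis:
    text: str
    score: float  # higher is better


def two_pass_recognize(
    frames: Iterable[Frame],
    encoder: Callable[[Frame], Encoding],
    first_pass: Callable[[List[Encoding]], List[Hypothesis]],
    second_pass: Callable[[List[Encoding], Hypothesis], float],
) -> Tuple[List[str], str]:
    """Return (streaming partial results, final text)."""
    encoded: List[Encoding] = []
    partials: List[str] = []
    candidates: List[Hypothesis] = []
    for frame in frames:
        # Shared encoder: each frame is encoded once and reused by both passes.
        encoded.append(encoder(frame))
        # First pass: streaming candidates from the encoder output so far.
        candidates = first_pass(encoded)
        if candidates:
            partials.append(max(candidates, key=lambda h: h.score).text)
    if not candidates:
        return partials, ""
    # Second pass (assumed here to be rescoring): re-rank the first-pass
    # candidates using the full encoder output.
    final = max(candidates, key=lambda h: second_pass(encoded, h))
    return partials, final.text


# --- Toy stand-ins (hypothetical, for illustration only) ---

def toy_encoder(frame: Frame) -> Encoding:
    # Placeholder "encoding": the mean of the frame's samples.
    return [sum(frame) / len(frame)]


def toy_first_pass(encoded: List[Encoding]) -> List[Hypothesis]:
    # Placeholder decoder: reveals one chunk of each candidate per frame
    # and slightly prefers the wrong candidate.
    n = len(encoded)
    return [
        Hypothesis(" ".join(["recognize", "speech"][:n]), -1.0),
        Hypothesis(" ".join(["wreck a nice", "beach"][:n]), -0.9),
    ]


def toy_second_pass(encoded: List[Encoding], hyp: Hypothesis) -> float:
    # Arbitrary placeholder score: prefer one word per encoded frame.
    return -abs(len(hyp.text.split()) - len(encoded))


if __name__ == "__main__":
    frames = [[0.1, 0.2], [0.3, 0.4]]
    partials, final = two_pass_recognize(
        frames, toy_encoder, toy_first_pass, toy_second_pass
    )
    print(partials)
    print(final)
```

In this toy run the streaming partial results follow the first pass's preferred candidate, while the final text comes from the candidate the second pass ranks highest, so the two can differ. Passing the encoder and both passes as callables keeps the sketch independent of any particular model or library.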
- Google llc
- Tara N. Sainath of Jersey City NJ (US)
- Yanzhang He of Palo Alto CA (US)
- Bo Li of Fremont CA (US)
- Arun Narayanan of Milpitas CA (US)
- Ruoming Pang of New York NY (US)
- Antoine Jean Bruguier of Milpitas CA (US)
- Shuo-yiin Chang of Mountain View CA (US)
- Wei Li of Fremont CA (US)
- G10L15/16
- G06N3/08
- G10L15/05
- G10L15/06
- G10L15/22
- CPC G10L15/16