18815537. TWO-PASS END TO END SPEECH RECOGNITION (GOOGLE LLC)

From WikiPatents

Organization Name

GOOGLE LLC

Inventor(s)

Tara N. Sainath of Jersey City NJ (US)

Yanzhang He of Palo Alto CA (US)

Bo Li of Fremont CA (US)

Arun Narayanan of Milpitas CA (US)

Ruoming Pang of New York NY (US)

Antoine Jean Bruguier of Milpitas CA (US)

Shuo-yiin Chang of Mountain View CA (US)

Wei Li of Fremont CA (US)

This abstract first appeared for US patent application 18815537, titled 'TWO-PASS END TO END SPEECH RECOGNITION'.

Original Abstract Submitted

Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transformer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
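
The data flow described in the abstract can be sketched roughly as follows. This is an illustrative toy only, not the implementation claimed in the application: the class name TwoPassRecognizer, the Hypothesis type, and the three toy_* functions are hypothetical stand-ins for the shared encoder, the first-pass (RNN-T) decoder, and the second-pass (LAS) decoder. The sketch also assumes that the second pass "revises" by rescoring the first pass's candidate list, which is one plausible reading that the abstract does not spell out. (RNN-T is commonly expanded as "recurrent neural network transducer".)

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Sequence

Frame = Sequence[float]  # one audio feature frame (hypothetical representation)
Encoding = float         # toy encoder output for one frame


@dataclass(frozen=True)
class Hypothesis:
    text: str
    score: float  # higher is better


class TwoPassRecognizer:
    """Toy two-pass pipeline: one shared encoder feeding two decoders."""

    def __init__(
        self,
        encoder: Callable[[Frame], Encoding],
        first_pass: Callable[[List[Encoding]], List[Hypothesis]],
        second_pass: Callable[[List[Encoding], Hypothesis], float],
    ) -> None:
        self.encoder = encoder
        self.first_pass = first_pass
        self.second_pass = second_pass
        self._encoded: List[Encoding] = []
        self._candidates: List[Hypothesis] = []

    def stream(self, frames: Iterable[Frame]) -> Iterator[List[Hypothesis]]:
        """First pass: emit streaming candidates as each frame arrives."""
        for frame in frames:
            # Each frame is encoded once; the cached output is reused later.
            self._encoded.append(self.encoder(frame))
            self._candidates = self.first_pass(self._encoded)
            yield self._candidates

    def finalize(self) -> str:
        """Second pass: rescore the candidates using the same encoder output."""
        if not self._candidates:
            return ""
        rescored = [
            Hypothesis(h.text, self.second_pass(self._encoded, h))
            for h in self._candidates
        ]
        return max(rescored, key=lambda h: h.score).text


# Toy stand-ins so the sketch runs end to end (assumptions, not real models).
def toy_encoder(frame: Frame) -> Encoding:
    return sum(frame) / len(frame)


def toy_first_pass(encoded: List[Encoding]) -> List[Hypothesis]:
    # Pretend one word becomes available per encoded frame.
    n = len(encoded)
    options = [("wreck a nice beach", -1.0), ("recognize speech", -1.5)]
    return [Hypothesis(" ".join(t.split()[:n]), s) for t, s in options]


def toy_second_pass(encoded: List[Encoding], hyp: Hypothesis) -> float:
    # Pretend the full-context pass prefers the candidate ending in "speech".
    return hyp.score + (1.0 if hyp.text.endswith("speech") else 0.0)
```

In this toy, calling stream() over four frames yields growing partial candidates, and finalize() then returns the candidate that the second pass scores highest. The point of the design is that the encoder output is computed once and reused by both passes, which is what a shared encoder makes possible.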