GOOGLE LLC (20240420686). SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS

SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS

Organization Name

GOOGLE LLC

Inventor(s)

Rohit Prakash Prabhavalkar of Santa Clara CA (US)

Zhifeng Chen of Sunnyvale CA (US)

Bo Li of Fremont CA (US)

Chung-Cheng Chiu of Sunnyvale CA (US)

Kanury Kanishka Rao of Santa Clara CA (US)

Yonghui Wu of Fremont CA (US)

Ron J. Weiss of New York NY (US)

Navdeep Jaitly of Mountain View CA (US)

Michiel A. U. Bacchiani of Summit NJ (US)

Tara N. Sainath of Jersey City NJ (US)

Jan Kazimierz Chorowski of POLAND (PL)

Anjuli Patricia Kannan of Berkeley CA (US)

Ekaterina Gonina of Sunnyvale CA (US)

Patrick An Phu Nguyen of Palo Alto CA (US)

This abstract first appeared for US patent application 20240420686, titled 'SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS'.



Original Abstract Submitted

A method for performing speech recognition using sequence-to-sequence models includes receiving audio data for an utterance and providing features indicative of acoustic characteristics of the utterance as input to an encoder. The method also includes processing an output of the encoder using an attender to generate a context vector, generating speech recognition scores using the context vector and a decoder trained using a training process, and generating a transcription for the utterance using word elements selected based on the speech recognition scores. The transcription is provided as an output of the ASR system.
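
The data flow in the abstract (features, encoder, attender, context vector, decoder scores, selected word elements) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the application: the toy vocabulary, the dot-product form of attention, the single-layer encoder and decoder, the greedy selection rule, and all function names. The weights are random rather than trained, so the returned string is arbitrary; only the structure of the pipeline is meant to be informative.

```python
import math
import random

# Hypothetical word elements (for example graphemes or word pieces).
VOCAB = ["<eos>", "a", "b", "c"]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rng, rows, cols):
    """Random stand-in for trained weights (assumption: no training here)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def encode(features, w_enc):
    """Encoder: map each acoustic feature frame to a hidden vector."""
    return [[math.tanh(x) for x in matvec(w_enc, frame)] for frame in features]

def attend(enc_out, state):
    """Attender: dot-product weights over frames, then a weighted sum."""
    weights = softmax([sum(h * s for h, s in zip(frame, state)) for frame in enc_out])
    dim = len(enc_out[0])
    context = [sum(w * frame[i] for w, frame in zip(weights, enc_out)) for i in range(dim)]
    return context, weights

def decode_step(context, state, w_state, w_out):
    """Decoder: update the state from [context; state], then score word elements."""
    new_state = [math.tanh(x) for x in matvec(w_state, context + state)]
    scores = softmax(matvec(w_out, new_state))
    return scores, new_state

def transcribe(features, hidden=4, max_len=10, seed=0):
    """Greedy decoding: pick the highest-scoring word element at each step."""
    rng = random.Random(seed)
    w_enc = rand_matrix(rng, hidden, len(features[0]))
    w_state = rand_matrix(rng, hidden, 2 * hidden)
    w_out = rand_matrix(rng, len(VOCAB), hidden)
    enc_out = encode(features, w_enc)
    state = [0.0] * hidden
    tokens = []
    for _ in range(max_len):
        context, _ = attend(enc_out, state)
        scores, state = decode_step(context, state, w_state, w_out)
        best = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[best] == "<eos>":
            break
        tokens.append(VOCAB[best])
    return "".join(tokens)

if __name__ == "__main__":
    # Two made-up feature frames standing in for acoustic features of an utterance.
    print(transcribe([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]))
```

Greedy selection is used only to keep the sketch short; a practical system would more likely search over several candidate sequences (for example with beam search) before choosing the transcription.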