20250218433. Automatic Speech Recognition with Target Word Spotting (NVIDIA)
AUTOMATIC SPEECH RECOGNITION WITH TARGET WORD SPOTTING
Abstract: Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing automatic speech recognition (ASR) facilitated with a search for target words. The techniques include applying an ASR model to audio data to generate an ASR output representative of a likelihood that the audio data comprises one or more spoken speech units (SUs); generating, using the ASR output, a first score characterizing a likelihood that the audio data comprises a first word, wherein the first word comprises a dictionary word; generating, using the ASR output, a second score characterizing a likelihood that the audio data comprises a second word, wherein the second word comprises a word of a plurality of target words, wherein the plurality of target words is identified based at least on a context of the audio data; and predicting, using the first score and the second score, a spoken word associated with the audio data.
Inventor(s): Andrei Andrusenko, Aleksandr Laptev, Vladimir Bataev, Vitaly Lavrukhin, Boris Ginsburg
CPC Classification: G10L15/183 (SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING)
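Illustrative sketch (not taken from the application): the two-score flow in the abstract can be pictured with a small Python example. Everything specific here is an assumption made for illustration: speech units are single characters, the ASR output is a per-frame matrix of log-probabilities with a blank symbol at index 0, word likelihoods are computed with a CTC-style forward pass, and target words receive an additive log-domain bonus (`target_boost`). The function names `ctc_word_score` and `predict_word` are hypothetical.

```python
import math

NEG_INF = float("-inf")

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-domain values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def ctc_word_score(log_probs, units, blank=0):
    """Log-likelihood that the frames spell `units` (CTC forward pass, assumed scoring rule)."""
    ext = [blank]                      # interleave blanks: _ u1 _ u2 _ ...
    for u in units:
        ext += [u, blank]
    size = len(ext)
    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]
    for frame in log_probs[1:]:
        new = [NEG_INF] * size
        for s in range(size):
            cands = [alpha[s]]
            if s >= 1:
                cands.append(alpha[s - 1])
            # Skipping a blank is allowed only between two different units.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])
            new[s] = logsumexp(cands) + frame[ext[s]]
        alpha = new
    return logsumexp(alpha[-2:]) if size > 1 else alpha[-1]

def predict_word(log_probs, dictionary, targets, unit_ids, target_boost=0.0):
    """Pick a word using a dictionary score and a (boosted) target-word score."""
    def spell(word):
        return [unit_ids[ch] for ch in word]
    # First score: best dictionary word.
    first = max((ctc_word_score(log_probs, spell(w)), w) for w in dictionary)
    # Second score: best context-specific target word, plus the assumed bonus.
    second = max((ctc_word_score(log_probs, spell(w)) + target_boost, w) for w in targets)
    return second[1] if second[0] > first[0] else first[1]

# Toy ASR output: two frames over the units {blank, a, b, c}.
unit_ids = {"a": 1, "b": 2, "c": 3}
frames = [
    [math.log(p) for p in (0.1, 0.7, 0.1, 0.1)],
    [math.log(p) for p in (0.1, 0.1, 0.6, 0.2)],
]
no_boost = predict_word(frames, ["ab"], ["ac"], unit_ids, target_boost=0.0)
with_boost = predict_word(frames, ["ab"], ["ac"], unit_ids, target_boost=math.log(4))
```

In this toy input the dictionary word "ab" wins when no bonus is applied, while a sufficiently large bonus lets the target word "ac" win. How the application actually combines the two scores is not stated in the abstract.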