20250218433. Automatic Speech Recognition (NVIDIA)

From WikiPatents

AUTOMATIC SPEECH RECOGNITION WITH TARGET WORD SPOTTING

Abstract: Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing automatic speech recognition (ASR) facilitated with a search for target words. The techniques include applying an ASR model to audio data to generate an ASR output representative of a likelihood that the audio data comprises one or more spoken speech units (SUs); generating, using the ASR output, a first score characterizing a likelihood that the audio data comprises a first word, wherein the first word comprises a dictionary word; generating, using the ASR output, a second score characterizing a likelihood that the audio data comprises a second word, wherein the second word comprises a word of a plurality of target words, wherein the plurality of target words is identified based at least on a context of the audio data; and predicting, using the first score and the second score, a spoken word associated with the audio data.
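
The scoring flow in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation, and everything specific in it is an assumption made for illustration: the ASR output is taken to be per-frame log-probabilities over single-character speech units, a word is scored by its best monotonic alignment to the frames (a simple Viterbi pass without a blank unit), and the names `word_score`, `predict_word`, and `target_bonus` are hypothetical. The optional additive bonus for target words is likewise an assumption; the abstract says only that both scores are used to predict the spoken word.

```python
import math

NEG_INF = float("-inf")


def word_score(log_probs, units, word):
    """Best monotonic-alignment log-score of `word` against the frames.

    log_probs: list of frames, each a list of log-probabilities per speech unit.
    units:     list of speech units (here: single characters), same order as frames.
    Assumption: every frame is assigned to exactly one unit of the word, in order.
    """
    idx = [units.index(ch) for ch in word]
    n = len(idx)
    if n == 0 or n > len(log_probs):
        return NEG_INF  # word cannot be aligned to this many frames
    # dp[i]: best score so far with the current frame aligned to the i-th unit
    dp = [NEG_INF] * n
    dp[0] = log_probs[0][idx[0]]
    for frame in log_probs[1:]:
        new = [NEG_INF] * n
        for i in range(n):
            stay = dp[i]
            advance = dp[i - 1] if i > 0 else NEG_INF
            best = max(stay, advance)
            if best > NEG_INF:
                new[i] = best + frame[idx[i]]
        dp = new
    return dp[-1]


def predict_word(log_probs, units, dictionary_words, target_words, target_bonus=0.0):
    """Score dictionary words and context-specific target words, return the best.

    target_bonus is a hypothetical additive boost applied to target-word scores.
    """
    scores = {}
    for w in dictionary_words:  # "first score": dictionary word
        scores[w] = word_score(log_probs, units, w)
    for w in target_words:  # "second score": target word
        s = word_score(log_probs, units, w) + target_bonus
        scores[w] = max(scores.get(w, NEG_INF), s)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    units = ["c", "a", "t", "r"]
    # Toy ASR output: three frames, the last one ambiguous between "t" and "r".
    probs = [
        [0.90, 0.04, 0.03, 0.03],
        [0.05, 0.90, 0.03, 0.02],
        [0.02, 0.03, 0.55, 0.40],
    ]
    log_probs = [[math.log(p) for p in frame] for frame in probs]
    print(predict_word(log_probs, units, ["cat"], ["car"], target_bonus=0.0)[0])
    print(predict_word(log_probs, units, ["cat"], ["car"], target_bonus=0.5)[0])
```

In this toy setup the target list stands in for words chosen from the context of the audio, such as names from a contact list. Raising `target_bonus` shifts ambiguous audio toward those words, while a bonus of zero reduces the prediction to a plain comparison of the two scores.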

Inventor(s): Andrei Andrusenko, Aleksandr Laptev, Vladimir Bataev, Vitaly Lavrukhin, Boris Ginsburg

CPC Classification: G10L15/183 (SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING)


