18334752. SPEECH PROCESSING USING MACHINE LEARNING FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS (NVIDIA Corporation)

Revision as of 07:27, 19 December 2024 by Wikipatents (talk | contribs) (Creating a new page)

SPEECH PROCESSING USING MACHINE LEARNING FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

Organization Name

NVIDIA Corporation

Inventor(s)

Xianchao Wu of Tokyo (JP)

Peiying Ruan of Kanazawa (JP)

Yi Dong of Lexington MA (US)

This abstract first appeared for US patent application 18334752, titled 'SPEECH PROCESSING USING MACHINE LEARNING FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS'.



Original Abstract Submitted

In various examples, techniques for accelerating inference in text and speech processing for conversational AI systems and applications are described herein. Systems and methods are disclosed that use one or more techniques, such as token merging, to reduce the number of tokens processed by one or more machine learning models. For instance, the machine learning model(s) may process text and, based at least on the processing, generate scores (e.g., attention scores) indicating relationships between tokens associated with the text. The machine learning model(s) may then use the scores to merge at least one pair of the tokens. As described herein, the merging may reduce the overall number of tokens associated with the text while still maintaining the same semantic meaning as the original text. Next, the machine learning model(s) may process the reduced number of tokens in order to determine an output associated with the text.
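
The token-merging idea in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application; it only shows one way that scores between tokens could drive a merge. The function names (`attention_scores`, `best_pair`, `merge_tokens`) are hypothetical. The sketch assumes that scores are a softmax over scaled dot products of the raw token vectors (no learned projections), that the pair with the highest combined score is merged, and that merging means averaging the two vectors.

```python
import math
from typing import List, Tuple

Vector = List[float]


def attention_scores(tokens: List[Vector]) -> List[List[float]]:
    """Row-wise softmax of scaled dot products between token vectors.

    Assumption: queries and keys are the raw vectors themselves.
    """
    scale = math.sqrt(len(tokens[0]))
    scores = []
    for q in tokens:
        logits = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        scores.append([e / total for e in exps])
    return scores


def best_pair(scores: List[List[float]]) -> Tuple[int, int]:
    """Return (i, j), i < j, maximizing scores[i][j] + scores[j][i]."""
    best, best_score = (0, 1), float("-inf")
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            combined = scores[i][j] + scores[j][i]
            if combined > best_score:
                best, best_score = (i, j), combined
    return best


def merge_tokens(tokens: List[Vector], target_count: int) -> List[Vector]:
    """Merge the highest-scoring pair repeatedly until target_count remain.

    Assumption: a merge is an unweighted average of the two vectors; the
    merged vector takes the position of the earlier token.
    """
    if target_count < 1:
        raise ValueError("target_count must be at least 1")
    merged = [list(t) for t in tokens]  # copy so the input is not mutated
    while len(merged) > target_count:
        i, j = best_pair(attention_scores(merged))
        merged[i] = [(a + b) / 2.0 for a, b in zip(merged[i], merged[j])]
        del merged[j]
    return merged
```

For example, given three 2-dimensional vectors where the first two are nearly parallel, `merge_tokens([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], target_count=2)` merges the first two into their average and leaves the third unchanged, so a downstream model would process two tokens instead of three.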