18486145. SCALABLE DYNAMIC CLASS LANGUAGE MODELING simplified abstract (GOOGLE LLC)

From WikiPatents

SCALABLE DYNAMIC CLASS LANGUAGE MODELING

Organization Name

GOOGLE LLC

Inventor(s)

Justin Max Scheiner of New York NY (US)

Petar Aleksic of Jersey City NJ (US)

SCALABLE DYNAMIC CLASS LANGUAGE MODELING - A simplified explanation of the abstract

This abstract first appeared for US patent application 18486145, titled 'SCALABLE DYNAMIC CLASS LANGUAGE MODELING'.

Simplified Explanation

- Speech recognition is dynamically adapted to individual voice queries of a user by means of class-based language models.
- A voice query is received from the user. It includes audio data corresponding to the user's utterance and context data associated with the user.
- One or more class models are generated. Together they identify a set of terms determined from the context data and, for each term, the class to which that term is assigned.
- A language model that includes a residual unigram is accessed and, for each class, processed to insert that class's symbol at each instance of the residual unigram in the language model.
- A transcription of the user's utterance is generated using the modified language model.
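As a rough illustration of the class-model step, the sketch below assumes the context data arrives as a mapping from a category name to a list of strings (for example, a contact list). The `build_class_models` function, the `$CONTACT`-style class symbols, and the sample data are hypothetical and are not taken from the application.

```python
from collections import defaultdict


def build_class_models(context):
    """Group context terms under a class symbol.

    `context` is assumed to map a category name (e.g. "contact") to a
    list of strings. The "$NAME" symbol convention is an assumption made
    for this sketch only.
    """
    class_models = defaultdict(set)
    for category, terms in context.items():
        symbol = "$" + category.upper()
        for term in terms:
            # Normalize case so lookups do not depend on how the
            # context data happened to be capitalized.
            class_models[symbol].add(term.lower())
    return dict(class_models)


# Hypothetical context data accompanying a single voice query.
context = {"contact": ["Alice Chen", "Bob"], "app": ["Maps"]}
models = build_class_models(context)
```

Each entry of `models` pairs a class symbol with the terms assigned to that class, which matches the "term plus assigned class" information the summary describes.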

Potential Applications

- Improved speech recognition systems for personalized user experiences.
- Enhanced voice-controlled devices and virtual assistants.
- Better accuracy in transcribing user voice queries.

Problems Solved

- Addressing the challenge of accurately recognizing individual voice queries in various contexts.
- Improving the performance of speech recognition systems for diverse users.
- Enhancing the user experience by adapting to specific user preferences and contexts.

Benefits

- Increased accuracy in speech recognition for personalized user interactions.
- Enhanced user satisfaction with voice-controlled devices and services.
- Improved efficiency in transcribing user voice queries.


Original Abstract Submitted

This document generally describes systems and methods for dynamically adapting speech recognition for individual voice queries of a user using class-based language models. The method may include receiving a voice query from a user that includes audio data corresponding to an utterance of the user, and context data associated with the user. One or more class models are then generated that collectively identify a first set of terms determined based on the context data, and a respective class to which the respective term is assigned for each respective term in the first set of terms. A language model that includes a residual unigram may then be accessed and processed for each respective class to insert a respective class symbol at each instance of the residual unigram that occurs within the language model. A transcription of the utterance of the user is then generated using the modified language model.
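The class-symbol insertion described in the abstract can be sketched on a toy n-gram table. This is only an illustration under several assumptions: the language model is represented as a dictionary from n-gram tuples to log probabilities, the residual unigram is spelled `#residual`, and each expanded n-gram simply inherits the original score without renormalization. The abstract does not specify the model representation, the residual token's spelling, or how probabilities are handled, so none of these details should be read as the application's method.

```python
RESIDUAL = "#residual"  # hypothetical spelling of the residual unigram


def insert_class_symbols(ngrams, class_symbols, residual=RESIDUAL):
    """Return a copy of `ngrams` with class symbols at each residual slot.

    `ngrams` maps token tuples to log probabilities. For every n-gram that
    contains the residual token, one new n-gram is emitted per class
    symbol. Scores are copied unchanged (an assumption; a real system
    would need to redistribute probability mass).
    """
    modified = {}
    for ngram, logp in ngrams.items():
        if residual not in ngram:
            modified[ngram] = logp
            continue
        for symbol in class_symbols:
            expanded = tuple(symbol if tok == residual else tok for tok in ngram)
            modified[expanded] = logp
    return modified


# Toy language model: "call <residual>" is the slot for out-of-vocabulary terms.
lm = {("call", RESIDUAL): -1.0, ("call", "home"): -2.0}
modified_lm = insert_class_symbols(lm, ["$CONTACT", "$APP"])
```

After this step the toy model contains `("call", "$CONTACT")` and `("call", "$APP")` in place of the residual entry, so a decoder could follow a class symbol into the matching class model when transcribing the utterance.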