20240054998. SCALABLE DYNAMIC CLASS LANGUAGE MODELING simplified abstract (GOOGLE LLC)

From WikiPatents

SCALABLE DYNAMIC CLASS LANGUAGE MODELING

Organization Name

GOOGLE LLC

Inventor(s)

Justin Max Scheiner of New York NY (US)

Petar Aleksic of Jersey City NJ (US)

SCALABLE DYNAMIC CLASS LANGUAGE MODELING - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240054998, titled 'SCALABLE DYNAMIC CLASS LANGUAGE MODELING'.

Simplified Explanation

The patent application describes a system and method for adapting speech recognition for individual voice queries using class-based language models. Here are the key points:

  • The system receives a voice query from a user, along with context data associated with the user.
  • One or more class models are generated from the context data. Together, they identify a set of terms and the class to which each term is assigned.
  • A language model that includes a residual unigram is accessed and processed once for each class, inserting that class's symbol at each instance of the residual unigram in the language model.
  • The modified language model is then used to generate a transcription of the user's utterance.
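
The steps above can be illustrated with a minimal Python sketch. It is not the implementation claimed in the application, only one plausible reading of it. Several details are assumptions made for illustration: the language model is a plain dictionary from n-gram tuples to log scores, the residual unigram is spelled `<unk>`, context data arrives as a dictionary of category names to term lists, "inserting" a class symbol means adding a copy of the n-gram with the symbol in the residual unigram's position, and the copied scores are not renormalized. The names `ClassModel`, `build_class_models`, `insert_class_symbols`, and `score_bigram` are hypothetical.

```python
from dataclasses import dataclass

RESIDUAL = "<unk>"  # assumed spelling of the residual unigram token


@dataclass(frozen=True)
class ClassModel:
    symbol: str       # class symbol, e.g. "$CONTACT"
    terms: frozenset  # lower-cased terms assigned to this class


def build_class_models(context):
    """Build one class model per non-empty context category (assumed layout)."""
    return [
        ClassModel(f"${name.upper()}", frozenset(t.lower() for t in terms))
        for name, terms in context.items()
        if terms
    ]


def insert_class_symbols(lm, class_models):
    """Return a copy of `lm` with class symbols added at each residual unigram.

    For every n-gram that contains the residual unigram, one variant per class
    is added with the class symbol in that position. The original n-gram is
    kept, and the variant reuses its score (a simplification).
    """
    modified = dict(lm)
    for ngram, score in lm.items():
        for i, token in enumerate(ngram):
            if token != RESIDUAL:
                continue
            for model in class_models:
                variant = ngram[:i] + (model.symbol,) + ngram[i + 1:]
                modified[variant] = score
    return modified


def score_bigram(prev, word, lm, class_models):
    """Score (prev, word), falling back to any class whose terms include `word`."""
    if (prev, word) in lm:
        return lm[(prev, word)]
    candidates = [
        lm[(prev, m.symbol)]
        for m in class_models
        if word in m.terms and (prev, m.symbol) in lm
    ]
    return max(candidates, default=float("-inf"))


# Hypothetical context data sent along with a voice query.
context = {"contact": ["Alice", "Bob"], "app": ["Maps"]}
class_models = build_class_models(context)

# Toy base language model: bigrams mapped to log scores.
base_lm = {
    ("call", RESIDUAL): -2.0,
    ("open", RESIDUAL): -2.5,
    ("call", "mom"): -1.0,
}
modified_lm = insert_class_symbols(base_lm, class_models)

# "alice" is not in the base model, but it belongs to the $CONTACT class.
print(score_bigram("call", "alice", modified_lm, class_models))
```

In this sketch the base language model stays fixed, and only the small class models change from query to query, which is one way to read the "scalable" and "dynamic" parts of the title. A real recognizer would combine such scores with acoustic scores over many candidate transcriptions.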

Potential Applications:

  • Voice assistants: The technology can be applied to improve the accuracy and adaptability of voice assistants like Siri or Alexa, making them better at understanding individual users' voice queries.
  • Call centers: The system can be used in call centers to enhance speech recognition during customer interactions, leading to improved customer service and efficiency.

Problems Solved:

  • Contextual understanding: By incorporating context data and generating class models, the system addresses the challenge of understanding voice queries in specific contexts, improving the accuracy of speech recognition.
  • Individual adaptation: The technology solves the problem of adapting speech recognition systems to individual users, allowing for personalized voice interactions.

Benefits:

  • Improved accuracy: By dynamically adapting the language model based on context and class-based models, the system can provide more accurate transcriptions of user utterances.
  • Personalization: The technology enables personalized voice recognition, making voice assistants or other speech recognition systems better suited to individual users' preferences and speech patterns.
  • Enhanced user experience: With improved accuracy and personalization, users can have a more seamless and efficient interaction with voice-based systems, leading to a better overall user experience.


Original Abstract Submitted

This document generally describes systems and methods for dynamically adapting speech recognition for individual voice queries of a user using class-based language models. The method may include receiving a voice query from a user that includes audio data corresponding to an utterance of the user, and context data associated with the user. One or more class models are then generated that collectively identify a first set of terms determined based on the context data, and a respective class to which the respective term is assigned for each respective term in the first set of terms. A language model that includes a residual unigram may then be accessed and processed for each respective class to insert a respective class symbol at each instance of the residual unigram that occurs within the language model. A transcription of the utterance of the user is then generated using the modified language model.