20240054997. AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE RECEIVED VIA AN AUTOMATED ASSISTANT INTERFACE simplified abstract (GOOGLE LLC)

From WikiPatents

Organization Name

GOOGLE LLC

Inventor(s)

Pu-sen Chao of Los Altos CA (US)

Diego Melendo Casado of Mountain View CA (US)

Ignacio Lopez Moreno of New York NY (US)

AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE RECEIVED VIA AN AUTOMATED ASSISTANT INTERFACE - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240054997, titled 'AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE RECEIVED VIA AN AUTOMATED ASSISTANT INTERFACE'.

Simplified Explanation

The patent application describes how to determine the language to use for speech recognition of a spoken utterance received through an automated assistant interface. The goal is to enable multilingual interaction with the automated assistant without requiring the user to explicitly specify a language for each interaction.

  • The system determines a user profile based on audio data capturing the spoken utterance.
  • The user profile includes language(s) assigned to the user, along with corresponding probabilities.
  • The system uses the language(s) assigned to the user profile to determine the language for speech recognition of the spoken utterance.
  • Some implementations select a subset of languages assigned to the user profile for speech recognition.
  • Other implementations perform speech recognition in each of the assigned languages and select the most appropriate one for generating responsive content.
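The two implementation strategies in the list above can be sketched in Python. This is a minimal, hypothetical illustration, not the patent's actual method: the `UserProfile` structure, the probability threshold, and the confidence-based selection criterion are all assumptions, and `recognize` is a placeholder where a real speech recognizer would be invoked.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """Hypothetical user profile: languages assigned to the user,
    each with a probability reflecting how often the user speaks it."""
    user_id: str
    language_probs: dict = field(default_factory=dict)

def select_candidate_languages(profile, threshold=0.2, max_langs=2):
    """Strategy 1: select only a subset of the profile's languages,
    keeping those whose probability meets a threshold (at most max_langs)."""
    ranked = sorted(profile.language_probs.items(),
                    key=lambda kv: kv[1], reverse=True)
    subset = [lang for lang, p in ranked if p >= threshold][:max_langs]
    # Fall back to the single most probable language if none pass.
    return subset or [ranked[0][0]]

def recognize(audio, language):
    """Placeholder ASR call: would return (transcript, confidence)
    from a per-language speech recognition model."""
    raise NotImplementedError

def transcribe_utterance(audio, profile, asr=recognize):
    """Strategy 2: run speech recognition in each candidate language
    and keep the result with the highest confidence score."""
    candidates = select_candidate_languages(profile)
    results = {lang: asr(audio, lang) for lang in candidates}
    best_lang = max(results, key=lambda lang: results[lang][1])
    transcript, _ = results[best_lang]
    return best_lang, transcript
```

In this sketch the selection criterion is simply the recognizer's confidence score; the application leaves the exact criteria open, so a real system might also weigh the profile's language probabilities or downstream interpretation quality.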

Potential applications of this technology:

  • Multilingual automated assistants: This technology can be used to develop automated assistants that can understand and respond to spoken utterances in multiple languages, enhancing their usability for users who speak different languages.
  • Language learning tools: The system can be utilized in language learning applications to provide speech recognition and feedback in multiple languages, helping learners practice their pronunciation and fluency.
  • Translation services: By accurately determining the language of a spoken utterance, the technology can be integrated into translation services to provide more accurate translations.

Problems solved by this technology:

  • Language identification: The system solves the problem of automatically identifying the language of a spoken utterance without relying on explicit user input, making the interaction with automated assistants more seamless and user-friendly.
  • Multilingual interaction: This technology solves the challenge of enabling multilingual interaction with automated assistants, allowing users to communicate in their preferred language without the need for language switching or explicit language selection.

Benefits of this technology:

  • Improved user experience: Users can interact with automated assistants in their preferred language without the need for explicit language designation, enhancing the overall user experience.
  • Efficient multilingual support: The system can efficiently handle multilingual interactions by automatically determining the language for speech recognition, reducing the burden on users to switch languages or specify language preferences.
  • Enhanced accessibility: By supporting multiple languages, the technology improves accessibility for users who are more comfortable communicating in languages other than the default language of the automated assistant.


Original Abstract Submitted

Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user. Some implementations perform speech recognition in each of multiple languages assigned to the user profile, and utilize criteria to select only one of the speech recognitions as appropriate for generating and providing content that is responsive to the spoken utterance.