Amazon Technologies, Inc. (20240321264). AUTOMATIC SPEECH RECOGNITION simplified abstract

From WikiPatents

AUTOMATIC SPEECH RECOGNITION

Organization Name

Amazon Technologies, Inc.

Inventor(s)

Jing Liu of Pittsburgh PA (US)

Feng-Ju Chang of Pittsburgh PA (US)

Athanasios Mouchtaris of Pittsburgh PA (US)

Martin Radfar of North York (CA)

Maurizio Omologo of Altopiano della Vigolana (IT)

Siegfried Kunzmann of Heidelberg (DE)

AUTOMATIC SPEECH RECOGNITION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240321264, titled 'AUTOMATIC SPEECH RECOGNITION'.

Simplified Explanation: The patent application describes techniques for automatic speech recognition (ASR) in which contextual information from user profile data is integrated into the audio encoding data to predict the token(s) corresponding to a spoken input.

  • The ASR component uses personalized words from user profiles, such as contact names and device names, to determine word embedding data.
  • By processing the audio encoding data together with the word embedding data, the ASR component applies attention to the audio frames that are relevant to the personalized words.
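
The two steps above can be illustrated with a minimal sketch. It is not the method claimed in the application: the abstract does not say how embeddings are computed or how attention is scored, so the sketch assumes scaled dot-product attention with one personalized word as the query and the audio frames as the keys. `toy_embedding`, `frame_attention`, the example words, and the frame values are all hypothetical.

```python
import math
import random

DIM = 8  # toy embedding size (assumption, not from the application)

def toy_embedding(word, dim=DIM):
    # Hypothetical stand-in for a learned word embedding: seeding the generator
    # with the word makes the same word always map to the same vector.
    rng = random.Random(word)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def frame_attention(word_vec, frames):
    # Scaled dot-product scores of one word (query) against every frame (key),
    # normalized into attention weights over the frames.
    scale = math.sqrt(len(word_vec))
    return softmax([dot(word_vec, f) / scale for f in frames])

# Personalized words as they might appear in user profile data (made-up values).
personalized_words = ["Anna Kowalski", "kitchen lamp"]
word_vecs = {w: toy_embedding(w) for w in personalized_words}

# Toy audio encoding: five low-energy frames, with frame 2 constructed to
# resemble "kitchen lamp" so the example has a known answer.
noise = random.Random(0)
frames = [[noise.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(5)]
frames[2] = [3.0 * x for x in word_vecs["kitchen lamp"]]

weights = frame_attention(word_vecs["kitchen lamp"], frames)
best_frame = max(range(len(weights)), key=weights.__getitem__)
print(best_frame, [round(w, 3) for w in weights])
```

In this toy setup the largest weight lands on the frame built to resemble the device name. That is the sense in which attention picks out the audio frames relevant to a personalized word.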

Key Features and Innovation:

  • Integration of user profile data into ASR for predicting spoken input.
  • Use of personalized words to determine word embedding data.
  • Application of attention to relevant audio frames based on personalized words.

Potential Applications: This technology can be used in various applications such as virtual assistants, transcription services, and voice-controlled devices.

Problems Solved:

  • Enhances accuracy of ASR by incorporating user-specific information.
  • Improves user experience by predicting spoken input more effectively.

Benefits:

  • Increased accuracy in speech recognition.
  • Personalized user experience.
  • Enhanced efficiency in processing spoken input.

Commercial Applications: Potential commercial uses include speech-to-text services, smart home devices, and customer service automation.

Questions about Automatic Speech Recognition:

  1. How does the integration of user profile data improve the accuracy of ASR?
  2. What are the potential privacy concerns associated with using personalized words in ASR technology?

Frequently Updated Research: Stay updated on advancements in ASR technology, including improvements in user profile integration and attention mechanisms for audio processing.


Original Abstract Submitted

Techniques for performing automatic speech recognition (ASR) are described. In some embodiments, an ASR component integrates contextual information from user profile data into audio encoding data to predict a token(s) corresponding to a spoken input. The user profile data may include personalized words, such as, contact names, device names, etc. The ASR component determines word embedding data using the personalized words. The ASR component is configured to apply attention to audio frames that are relevant to the personalized words based on processing the audio encoding data and the word embedding data.
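
One way to picture how contextual information could be integrated into audio encoding data is sketched below. This is an assumption for illustration, not the mechanism claimed in the application. A single audio frame acts as the attention query over the personalized-word embeddings, and the attention-weighted sum of those embeddings is added back onto the frame. The function `bias_frame`, the four-dimensional vectors, and the additive fusion are all hypothetical; token prediction would happen in a downstream decoder that is not shown.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bias_frame(frame, word_vecs):
    # Attend from one audio frame (query) over the word embeddings (keys and
    # values), then add the weighted sum of embeddings onto the frame.
    scale = math.sqrt(len(frame))
    weights = softmax([dot(frame, w) / scale for w in word_vecs])
    context = [
        sum(wt * w[i] for wt, w in zip(weights, word_vecs))
        for i in range(len(frame))
    ]
    fused = [f + c for f, c in zip(frame, context)]
    return fused, weights

# Hand-written toy embeddings for two personalized words (made-up values).
words = ["anna", "lamp"]
word_vecs = [
    [1.0, 0.0, 0.0, 0.0],  # "anna"
    [0.0, 1.0, 0.0, 0.0],  # "lamp"
]

# A toy audio frame whose encoding points mostly along the "lamp" direction.
frame = [0.2, 2.0, 0.0, 0.1]

fused, weights = bias_frame(frame, word_vecs)
for word, weight in zip(words, weights):
    print(word, round(weight, 3))
print([round(x, 3) for x in fused])
```

Because the frame is closer to the "lamp" embedding, that word receives the larger weight and the fused frame shifts further toward it. A decoder reading the fused encoding would then be nudged toward the personalized word.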