20240054999. CONTEXT-AWARE FALSE TRIGGER MITIGATION FOR AUTOMATIC SPEECH RECOGNITION (ASR) SYSTEMS OR OTHER SYSTEMS simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

From WikiPatents

CONTEXT-AWARE FALSE TRIGGER MITIGATION FOR AUTOMATIC SPEECH RECOGNITION (ASR) SYSTEMS OR OTHER SYSTEMS

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Cindy Sushen Tseng of Santa Clara CA (US)

Srinivasa Rao Ponakala of Sunnyvale CA (US)

Myungjong Kim of Milpitas CA (US)

Taeyeon Ki of Milpitas CA (US)

Vijendra Raj Apsingekar of San Jose CA (US)

CONTEXT-AWARE FALSE TRIGGER MITIGATION FOR AUTOMATIC SPEECH RECOGNITION (ASR) SYSTEMS OR OTHER SYSTEMS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240054999, titled 'CONTEXT-AWARE FALSE TRIGGER MITIGATION FOR AUTOMATIC SPEECH RECOGNITION (ASR) SYSTEMS OR OTHER SYSTEMS'.

Simplified Explanation

The patent application describes a method for estimating the probability that an audio input contains a false trigger for automatic speech recognition (ASR). The method obtains an audio input and a location associated with an electronic device, then generates an audio embedding for the audio input. It determines a first difference, between this audio embedding and the audio embedding of a known user, and a second difference, between the location of the electronic device and a known location of that user. A false trigger mitigation (FTM) system then generates the probability that the audio input includes a false trigger, based on the audio input itself, the first difference, and the second difference. Finally, the method uses this probability to decide whether to perform automatic speech recognition.

  • The method obtains an audio input and a location associated with an electronic device.
  • An audio embedding is generated for the audio input.
  • The method compares the audio embedding of the input with that of a known user.
  • The method also compares the location of the electronic device with the known location of the user.
  • A false trigger mitigation system is used to generate a probability of the audio input containing a false trigger.
  • The probability is based on the audio input, the differences in audio embeddings, and the differences in locations.
  • The method determines whether to perform automatic speech recognition based on this probability.
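
The steps above can be sketched in code. The abstract does not say how the differences are measured or how the FTM system combines its inputs, so everything specific here is an assumption made for illustration: cosine distance for the embedding difference, great-circle (haversine) distance for the location difference, a hypothetical `audio_score` in [0, 1] standing in for whatever the FTM system derives from the audio input itself, and a logistic combination with made-up weights in place of the actual FTM model.

```python
import math


def cosine_distance(a, b):
    """First difference (assumed metric): 1 - cosine similarity of two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def haversine_km(loc_a, loc_b):
    """Second difference (assumed metric): great-circle distance in km
    between two (latitude, longitude) pairs given in degrees."""
    lat1, lon1 = map(math.radians, loc_a)
    lat2, lon2 = map(math.radians, loc_b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def false_trigger_probability(audio_score, embedding_diff, location_diff_km,
                              weights=(2.0, 4.0, 0.05), bias=-3.0):
    """Hypothetical stand-in for the FTM system: a logistic combination of
    the three inputs. The weights and bias are illustrative, not learned."""
    w_audio, w_emb, w_loc = weights
    z = (bias + w_audio * audio_score
         + w_emb * embedding_diff
         + w_loc * location_diff_km)
    return 1.0 / (1.0 + math.exp(-z))


def should_run_asr(probability, threshold=0.5):
    """Perform ASR only when a false trigger is judged unlikely
    (the 0.5 threshold is an assumption)."""
    return probability < threshold


# Example: the input embedding is close to the known user's, and the device
# is at the user's known location (all values are made up).
known_embedding = [0.9, 0.1, 0.4]
input_embedding = [0.88, 0.12, 0.41]
known_location = (37.3541, -121.9552)
device_location = (37.3541, -121.9552)

p = false_trigger_probability(
    audio_score=0.1,
    embedding_diff=cosine_distance(input_embedding, known_embedding),
    location_diff_km=haversine_km(device_location, known_location),
)
print(should_run_asr(p))
```

In this sketch a larger embedding or location difference raises the false trigger probability, which matches the intuition in the summary; a real FTM system would presumably learn that relationship from data rather than use fixed weights.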

Potential applications of this technology:

  • Enhancing automatic speech recognition systems by reducing false triggers.
  • Improving the accuracy and reliability of voice-controlled devices.
  • Potentially supporting security features, since comparing the input against a known user's audio embedding and location could help indicate whether the speaker is that user.

Problems solved by this technology:

  • False triggers in automatic speech recognition systems can lead to inaccurate and unintended actions.
  • Comparing the input against a known user's audio embedding and location could help prevent unauthorized access to devices or systems, although the abstract itself describes only false trigger mitigation.

Benefits of this technology:

  • Increased accuracy and reliability of automatic speech recognition systems.
  • Improved user experience with voice-controlled devices.
  • Enhanced security and protection against unauthorized access.


Original Abstract Submitted

A method includes obtaining an audio input and a location associated with an electronic device. The method also includes generating an audio embedding associated with the audio input. The method further includes determining a first difference between the audio embedding associated with the audio input and an audio embedding associated with a known user. The method also includes determining a second difference between the location associated with the electronic device and a known location associated with the known user. The method further includes generating, using a false trigger mitigation (FTM) system, a probability of the audio input including a false trigger for automatic speech recognition based on the audio input, the first difference, and the second difference. In addition, the method includes determining whether to perform automatic speech recognition based on the probability.