20240046918. MEANING INFERENCE FROM SPEECH AUDIO simplified abstract (SOUNDHOUND AI IP, LLC)

From WikiPatents

MEANING INFERENCE FROM SPEECH AUDIO

Organization Name

SOUNDHOUND AI IP, LLC

Inventor(s)

Sudharsan Krishnaswamy of San Jose CA (US)

Maisy Wieman of Boulder CO (US)

Jonah Probell of Alviso CA (US)

MEANING INFERENCE FROM SPEECH AUDIO - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240046918 titled 'MEANING INFERENCE FROM SPEECH AUDIO'.

Simplified Explanation

The patent application describes a system and method by which a virtual assistant invokes an action, which may include an argument, based on probabilities inferred from audio input. The system infers the probability of an intent and may also infer the probability of a domain and of several variable values. The action is invoked in response to the intent probability exceeding a threshold. Invocation may also depend on the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detection of an end of utterance, and a specific amount of time having elapsed. The intent probability can increase when the audio includes speech of words with the same meaning in multiple natural languages. Invocation can also be conditional on the variable value probability exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.

  • The system and method enable a virtual assistant to perform actions based on inferred probabilities from audio input.
  • Actions are invoked when the intent probability exceeds a threshold; the domain probability and variable value probabilities may also be checked against thresholds.
  • The system considers factors such as end of utterance detection and elapsed time to determine when to invoke the action.
  • The intent probability can increase when the audio includes speech of words with the same meaning in multiple languages.
  • The action invocation can be conditional on the variable value probability exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.
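
The timing condition in the last point can be sketched in Python. This is only an illustration of the idea, not the method claimed in the application: the names `Inference` and `ActionGate`, the threshold values, the window length, and the policy of dropping a pending intent once the window elapses are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inference:
    """Hypothetical per-step model output for the audio received so far."""
    time_s: float         # seconds since the start of the audio
    intent_prob: float    # inferred probability of the intent
    variable_prob: float  # inferred probability of a variable value


class ActionGate:
    """Signals invocation when the variable value probability exceeds its
    threshold within window_s seconds of the intent probability exceeding
    its threshold. All default values are illustrative assumptions."""

    def __init__(self, intent_threshold: float = 0.8,
                 variable_threshold: float = 0.7,
                 window_s: float = 2.0) -> None:
        self.intent_threshold = intent_threshold
        self.variable_threshold = variable_threshold
        self.window_s = window_s
        self.intent_crossed_at: Optional[float] = None

    def update(self, inf: Inference) -> bool:
        """Return True when the action should be invoked at this step."""
        # Record the first moment the intent probability exceeds its threshold.
        if self.intent_crossed_at is None and inf.intent_prob > self.intent_threshold:
            self.intent_crossed_at = inf.time_s
        if self.intent_crossed_at is None:
            return False
        # Assumed policy: drop the pending intent once the window has elapsed.
        if inf.time_s - self.intent_crossed_at > self.window_s:
            self.intent_crossed_at = None
            return False
        # The variable value probability crossed its threshold inside the window.
        if inf.variable_prob > self.variable_threshold:
            self.intent_crossed_at = None
            return True
        return False


gate = ActionGate()
stream = [
    Inference(0.5, 0.40, 0.10),  # nothing above threshold yet
    Inference(1.0, 0.85, 0.20),  # intent probability exceeds 0.8
    Inference(1.5, 0.90, 0.75),  # variable probability exceeds 0.7 within 2 s
]
results = [gate.update(step) for step in stream]
print(results)  # [False, False, True]
```

A fuller version would also check the domain probability and end of utterance detection, which the abstract lists as further conditions; they are left out here to keep the timing logic easy to follow.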

Potential Applications

  • Voice-controlled virtual assistants in various domains such as smart homes, customer service, and personal assistants.
  • Multilingual virtual assistants that can understand and respond to speech in multiple languages.
  • Automated systems for performing tasks based on user intents and preferences.

Problems Solved

  • Improved accuracy and efficiency in invoking actions by virtual assistants.
  • Enhanced understanding of user intents and preferences through probabilistic inference.
  • Addressing the challenge of multilingual speech recognition and understanding.

Benefits

  • Increased user satisfaction by accurately understanding and responding to user intents.
  • Time-saving and convenience through automated actions performed by virtual assistants.
  • Improved user experience with multilingual support and understanding.


Original Abstract Submitted

A system and method invoke virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.