18384764. DETECTING NEAR MATCHES TO A HOTWORD OR PHRASE simplified abstract (GOOGLE LLC)

DETECTING NEAR MATCHES TO A HOTWORD OR PHRASE

Organization Name

GOOGLE LLC

Inventor(s)

Matthew Sharifi of Kilchberg (CH)

Victor Carbune of Zurich (CH)

DETECTING NEAR MATCHES TO A HOTWORD OR PHRASE - A simplified explanation of the abstract

This abstract first appeared for US patent application 18384764 titled 'DETECTING NEAR MATCHES TO A HOTWORD OR PHRASE'.

Simplified Explanation

The patent application describes techniques for identifying a failed hotword attempt: two spoken utterances, close together in time, that each come near to the hotword without triggering it. The method involves:

  • Receiving first audio data
  • Processing the first audio data to generate a first predicted output
  • Determining if the first predicted output satisfies a secondary threshold but not a primary threshold
  • Receiving second audio data
  • Processing the second audio data to generate a second predicted output
  • Determining if the second predicted output satisfies the secondary threshold but not the primary threshold
  • Identifying a failed hotword attempt if both predicted outputs satisfy the secondary threshold but not the primary threshold, and the first and second spoken utterances satisfy one or more temporal criteria relative to one another
  • Providing a hint in response to the failed hotword attempt
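
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method as claimed: the abstract does not give threshold values, the form of the predicted output, or the temporal criteria, so the numeric constants, the `Utterance` type, the single "maximum gap" rule, and the hint text are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values for illustration only; the abstract specifies none of these.
PRIMARY_THRESHOLD = 0.85    # a score at or above this would trigger the hotword
SECONDARY_THRESHOLD = 0.55  # a score at or above this counts as a near match
MAX_GAP_SECONDS = 10.0      # assumed temporal criterion between the two utterances


@dataclass
class Utterance:
    """Hypothetical container for one processed piece of audio data."""
    score: float      # predicted output of a hotword model, assumed to lie in [0, 1]
    timestamp: float  # time the utterance was captured, in seconds


def is_near_match(score: float) -> bool:
    """True if the score satisfies the secondary but not the primary threshold."""
    return SECONDARY_THRESHOLD <= score < PRIMARY_THRESHOLD


def is_failed_hotword_attempt(first: Utterance, second: Utterance) -> bool:
    """Both utterances are near matches and the second follows the first closely."""
    gap = second.timestamp - first.timestamp
    close_in_time = 0.0 <= gap <= MAX_GAP_SECONDS
    return is_near_match(first.score) and is_near_match(second.score) and close_in_time


def hint_for(first: Utterance, second: Utterance) -> Optional[str]:
    """Return a hint for a failed hotword attempt, or None if there was none."""
    if is_failed_hotword_attempt(first, second):
        return "It sounded like you tried to use the hotword. Try saying it again."
    return None
```

For example, under these assumed values, `hint_for(Utterance(0.6, 0.0), Utterance(0.7, 4.0))` returns the hint string, while a second utterance scoring 0.9 would satisfy the primary threshold and so would not count as a failed attempt.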

---

Potential Applications

  • Voice recognition systems
  • Virtual assistants
  • Smart home devices

Problems Solved

  • Recognizing when repeated utterances come close to the hotword but fail to trigger it
  • Responding to a failed hotword attempt with a hint

Benefits

  • Enhanced user experience
  • More reliable voice recognition
  • Hints that are responsive to failed hotword attempts


Original Abstract Submitted

Techniques are described herein for identifying a failed hotword attempt. A method includes: receiving first audio data; processing the first audio data to generate a first predicted output; determining that the first predicted output satisfies a secondary threshold but does not satisfy a primary threshold; receiving second audio data; processing the second audio data to generate a second predicted output; determining that the second predicted output satisfies the secondary threshold but does not satisfy the primary threshold; in response to the first predicted output and the second predicted output satisfying the secondary threshold but not satisfying the primary threshold, and in response to the first spoken utterance and the second spoken utterance satisfying one or more temporal criteria relative to one another, identifying a failed hotword attempt; and in response to identifying the failed hotword attempt, providing a hint that is responsive to the failed hotword attempt.