Google LLC (20240221750). KEY PHRASE SPOTTING simplified abstract

From WikiPatents

KEY PHRASE SPOTTING

Organization Name

Google LLC

Inventor(s)

Wei Li of Mountain View CA (US)

Rohit Prakash Prabhavalkar of Santa Clara CA (US)

Kanury Kanishka Rao of Santa Clara CA (US)

Yanzhang He of Mountain View CA (US)

Ian C. Mcgraw of Menlo Park CA (US)

Anton Bakhtin of New York NY (US)

KEY PHRASE SPOTTING - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240221750, titled 'KEY PHRASE SPOTTING'.

The patent application describes methods, systems, and apparatus for detecting utterances of a key phrase in an audio signal, using an attention mechanism applied to encodings produced by a neural network encoder. The method includes:

  • Receiving an audio signal encoding one or more utterances.
  • While the audio signal is still being received, generating an attention output with an attention mechanism that operates on a series of encodings produced by an encoder of one or more neural network layers.
  • Generating, from the attention output, an output that indicates whether the audio signal likely encodes the key phrase.
  • Providing that output.
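
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not specify the encoder architecture, the form of the attention mechanism, or the output layer, so the one-layer tanh encoder, the dot-product attention with a single query vector, and the sigmoid output used here are all hypothetical choices, and the parameters are random rather than trained.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode(frames, enc_weights):
    """Hypothetical one-layer encoder: tanh(W x) for each audio feature frame."""
    return [[math.tanh(dot(row, frame)) for row in enc_weights] for frame in frames]


def attention_output(encodings, query):
    """Assumed dot-product attention: weight each encoding, then sum them."""
    weights = softmax([dot(query, h) for h in encodings])
    dim = len(encodings[0])
    context = [sum(w * h[i] for w, h in zip(weights, encodings)) for i in range(dim)]
    return context, weights


def key_phrase_probability(context, out_weights, bias=0.0):
    """Assumed output layer: sigmoid of a linear score on the attention output."""
    return 1.0 / (1.0 + math.exp(-(dot(out_weights, context) + bias)))


# Demo with random (untrained) parameters; all sizes are illustrative.
rng = random.Random(0)
feat_dim, enc_dim, num_frames = 4, 3, 6
frames = [[rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(num_frames)]
enc_weights = [[rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(enc_dim)]
query = [rng.uniform(-1, 1) for _ in range(enc_dim)]
out_weights = [rng.uniform(-1, 1) for _ in range(enc_dim)]

encodings = encode(frames, enc_weights)                      # series of encodings
context, attn_weights = attention_output(encodings, query)   # attention output
prob = key_phrase_probability(context, out_weights)          # likelihood of the key phrase
print(f"P(key phrase) = {prob:.3f}")
```

In a real system the encoder, query, and output weights would be learned from labeled audio; the sketch only shows how the three stages connect.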
Potential Applications

This technology can be used in speech recognition systems, virtual assistants, security systems, and audio transcription services.

Problems Solved

This technology addresses the challenge of accurately detecting specific key phrases in audio signals amidst background noise and variations in speech patterns.

Benefits

  • Improved accuracy in detecting key phrases in audio signals.
  • Enhanced performance of speech recognition systems.
  • Increased efficiency in audio transcription services.

Commercial Applications

The technology can be utilized in smart speakers, call center analytics, voice-controlled devices, and security systems for keyword detection.

Prior Art

Researchers can explore prior art related to attention mechanisms in speech recognition systems and neural network applications in audio signal processing.

Frequently Updated Research

Stay updated on advancements in neural network architectures for audio signal processing and attention mechanisms in speech recognition systems.

Questions about Key Phrase Detection

1. How does the attention mechanism improve the accuracy of key phrase detection in audio signals?
2. What are the potential limitations of using neural network layers for key phrase spotting in audio signals?


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
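
The phrase "while continuing to receive the audio signal" implies that scoring happens incrementally as frames arrive, not after the whole recording is available. One possible way to organize that is sketched below. The `StreamingSpotter` class, the fixed-size window of recent encodings, the decision threshold, and the toy scalar encoder and scorer are all hypothetical and are not taken from the application.

```python
import math
from collections import deque


class StreamingSpotter:
    """Scores a bounded window of recent encodings each time a new frame arrives."""

    def __init__(self, encode_fn, score_fn, window=50, threshold=0.5):
        self.encode_fn = encode_fn          # frame -> encoding (assumed interface)
        self.score_fn = score_fn            # list of encodings -> probability
        self.window = deque(maxlen=window)  # oldest encodings drop out automatically
        self.threshold = threshold          # assumed decision threshold

    def push(self, frame):
        """Consume one frame and return (probability, detected) for the current window."""
        self.window.append(self.encode_fn(frame))
        prob = self.score_fn(list(self.window))
        return prob, prob >= self.threshold


def toy_score(encodings):
    """Toy scorer over scalar encodings: softmax attention, then a sigmoid."""
    m = max(encodings)
    exps = [math.exp(h - m) for h in encodings]
    total = sum(exps)
    context = sum(e / total * h for e, h in zip(exps, encodings))
    return 1.0 / (1.0 + math.exp(-(context - 2.0)))


# Toy stream: each "frame" is a single number standing in for an audio feature.
spotter = StreamingSpotter(encode_fn=lambda frame: frame, score_fn=toy_score, window=3)
stream = [0.0, 0.1, 0.0, 5.0, 5.0, 0.1]
detections = [spotter.push(frame)[1] for frame in stream]
print(detections)
```

Bounding the window keeps the per-frame cost constant, at the price of forgetting audio older than the window; whether the application uses a window at all is not stated in the abstract.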