18619608. Cascade Architecture for Noise-Robust Keyword Spotting simplified abstract (GOOGLE LLC)

From WikiPatents

Cascade Architecture for Noise-Robust Keyword Spotting

Organization Name

GOOGLE LLC

Inventor(s)

Yiteng Huang of Mountain View CA (US)

Alexander H. Gruenstein of Mountain View CA (US)

Cascade Architecture for Noise-Robust Keyword Spotting - A simplified explanation of the abstract

This abstract first appeared for US patent application 18619608, titled 'Cascade Architecture for Noise-Robust Keyword Spotting'.

Simplified Explanation

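The patent application describes a two-stage (cascade) method for detecting a hotword in streaming multi-channel audio captured by an array of microphones on a user device. A first processor runs a first-stage hotword detector on the audio features of each channel. When that detector fires, the first processor passes chomped raw audio data to a second processor, which applies a noise cleaning algorithm to produce a clean monophonic audio chomp and then runs a second-stage hotword detector on it to confirm the hotword.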

Key Features and Innovation

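  • Receiving streaming multi-channel audio captured by an array of microphones, with each channel carrying its own audio features.
  • Running a first-stage hotword detector on each channel on a first processor, followed by a second-stage hotword detector on a second processor.
  • Cleaning the chomped raw audio data with a noise cleaning algorithm to produce a clean monophonic audio chomp before the second stage runs.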
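The abstract does not say which noise cleaning algorithm is used. As a rough illustration of why combining several microphone channels into one monophonic signal can reduce noise, the sketch below uses plain sample-wise averaging as a stand-in. It assumes the wanted signal is already time-aligned across channels and the noise is independent per channel, which a real microphone array would not guarantee. All function names and parameters are hypothetical.

```python
import math
import random


def average_channels(channels):
    """Collapse equal-length channels into one mono signal by sample-wise averaging."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def demo(num_channels=4, num_samples=2000, noise_std=0.5, seed=0):
    """Return (noise RMS in one channel, noise RMS after averaging)."""
    rng = random.Random(seed)
    # Assumed clean signal: a 440 Hz tone sampled at 16 kHz.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(num_samples)]
    # Each channel carries the same tone plus its own independent Gaussian noise.
    channels = [[s + rng.gauss(0, noise_std) for s in clean]
                for _ in range(num_channels)]
    mono = average_channels(channels)
    noise_before = rms([c - s for c, s in zip(channels[0], clean)])
    noise_after = rms([m - s for m, s in zip(mono, clean)])
    return noise_before, noise_after


if __name__ == "__main__":
    before, after = demo()
    print(f"noise RMS, single channel: {before:.3f}")
    print(f"noise RMS, averaged mono:  {after:.3f}")
```

Under these assumptions, averaging four channels leaves the tone unchanged and roughly halves the noise RMS, since independent noise averages down by the square root of the channel count.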

Potential Applications

This technology can be used in voice-controlled devices, smart speakers, and other audio processing applications where accurate hotword detection is essential.

Problems Solved

This technology addresses the challenge of accurately detecting specific hotwords in noisy audio environments, improving the overall user experience with voice-controlled devices.

Benefits

  • Enhanced accuracy in hotword detection.
  • Improved performance in noisy audio environments.
  • Seamless integration into various audio processing applications.

Commercial Applications

  • Voice-controlled devices and smart speakers.
  • Audio transcription and analysis software.
  • Security systems with voice recognition capabilities.

Prior Art

Prior research in the field of audio processing and hotword detection can provide valuable insights into the development and implementation of this technology.

Frequently Updated Research

Researchers are continually exploring new algorithms and techniques to improve hotword detection accuracy and efficiency in various audio processing applications.

Questions about Hotword Detection

How does the two-stage hotword detection process improve accuracy?

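The first-stage detector screens the audio features of every channel, and only audio that triggers it is passed on. The second stage then re-checks a noise-cleaned, monophonic version of that audio, so a first-stage trigger caused by noise can be rejected before the hotword is accepted.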
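The effect of requiring both stages to agree can be worked through with a little arithmetic. The sketch below is only an illustration: the function name and every figure in it are hypothetical and do not come from the patent application. It relies on one fact, that the cascade accepts only when both stages fire, so the stage-two rate conditional on a stage-one trigger multiplies the stage-one rate.

```python
def cascade_rates(stage1_recall, stage2_recall_given_stage1,
                  stage1_false_triggers, stage2_false_accept_fraction):
    """Combine per-stage figures into cascade-level figures.

    Both stages must fire for the cascade to accept, so the conditional
    rates multiply for true hotwords and for false triggers alike.
    """
    recall = stage1_recall * stage2_recall_given_stage1
    false_accepts = stage1_false_triggers * stage2_false_accept_fraction
    return recall, false_accepts


# Hypothetical figures, chosen only for illustration.
recall, false_accepts = cascade_rates(
    stage1_recall=0.98,                 # share of real hotwords stage one catches
    stage2_recall_given_stage1=0.97,    # share of those that stage two confirms
    stage1_false_triggers=20.0,         # false triggers per day from stage one
    stage2_false_accept_fraction=0.05,  # share of false triggers stage two lets through
)

if __name__ == "__main__":
    print(f"cascade recall: {recall:.4f}")
    print(f"cascade false accepts per day: {false_accepts:.2f}")
```

With these made-up numbers, false accepts fall from 20 per day to about 1, while recall drops from 0.98 to roughly 0.95. The second stage can only reject what the first stage passes on, so it cannot recover a hotword the first stage missed.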

What are the potential challenges in implementing this technology in real-world applications?

Implementing this technology in real-world applications may require optimizing the algorithms for different audio environments and ensuring compatibility with various devices and systems.


Original Abstract Submitted

A method includes receiving, at a first processor of a user device, streaming multi-channel audio captured by an array of microphones, each channel including respective audio features. For each channel, the method also includes processing, by the first processor, using a first stage hotword detector, the respective audio features to determine whether a hotword is detected. When the first stage hotword detector detects the hotword, the method also includes the first processor providing chomped raw audio data to a second processor that processes, using a first noise cleaning algorithm, the chomped raw audio data to generate a clean monophonic audio chomp. The method also includes processing, by the second processor using a second stage hotword detector, the clean monophonic audio chomp to detect the hotword.
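The control flow in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the energy-threshold detectors, the frame-energy features, the averaging used for noise cleaning, and the fixed chomp window are all hypothetical stand-ins, and the two "processors" are ordinary function calls in one process.

```python
from typing import List, Sequence, Tuple

Audio = List[float]


def frame_energies(samples: Sequence[float], frame: int = 4) -> List[float]:
    """Stand-in audio features: mean energy of each non-overlapping frame."""
    return [sum(x * x for x in samples[i:i + frame]) / frame
            for i in range(0, len(samples) - frame + 1, frame)]


def stage_one_fires(features: Sequence[float], threshold: float = 0.2) -> bool:
    """Stand-in first-stage detector: fires if any frame is energetic enough."""
    return any(energy > threshold for energy in features)


def clean_to_mono(chomp: Sequence[Audio]) -> Audio:
    """Stand-in noise cleaning: average the channels into one mono signal."""
    return [sum(samples) / len(chomp) for samples in zip(*chomp)]


def stage_two_fires(mono: Audio, threshold: float = 0.2) -> bool:
    """Stand-in second-stage detector: mean energy of the cleaned mono chomp."""
    return sum(x * x for x in mono) / len(mono) > threshold


def run_cascade(raw_channels: Sequence[Audio], window: Tuple[int, int]) -> bool:
    """Return True only if both stages detect the hotword."""
    # "First processor": run the first-stage detector on each channel's features.
    features = [frame_energies(channel) for channel in raw_channels]
    if not any(stage_one_fires(f) for f in features):
        return False  # the second stage never runs
    # Chomp the raw audio (here, a fixed sample window) and hand it over.
    start, end = window
    chomp = [channel[start:end] for channel in raw_channels]
    # "Second processor": clean to mono, then run the second-stage detector.
    return stage_two_fires(clean_to_mono(chomp))


# Two channels that agree (signal survives averaging).
speech = [[0.8, -0.8] * 4, [0.8, -0.8] * 4]
# Two channels that are energetic but cancel when averaged.
noise = [[0.8, -0.8] * 4, [-0.8, 0.8] * 4]
silence = [[0.0] * 8, [0.0] * 8]

if __name__ == "__main__":
    for name, audio in [("speech", speech), ("noise", noise), ("silence", silence)]:
        print(name, run_cascade(audio, (0, 8)))
```

In this toy setup the `noise` input triggers the first stage on a single channel but is rejected once the channels are combined, while `silence` never reaches the second stage at all.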