GOOGLE LLC (20240242728). Cascade Architecture for Noise-Robust Keyword Spotting simplified abstract

From WikiPatents

Cascade Architecture for Noise-Robust Keyword Spotting

Organization Name

GOOGLE LLC

Inventor(s)

Yiteng Huang of Mountain View CA (US)

Alexander H. Gruenstein of Mountain View CA (US)

Cascade Architecture for Noise-Robust Keyword Spotting - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240242728, titled 'Cascade Architecture for Noise-Robust Keyword Spotting'.

Simplified Explanation:

The patent application describes a two-stage (cascade) method for detecting a hotword in streaming multi-channel audio captured by an array of microphones on a user device.

  • A first processor runs a first stage hotword detector on the audio features of each channel to determine whether the hotword is present.
  • When the first stage detects the hotword, the first processor passes chomped raw audio data to a second processor, which applies a noise cleaning algorithm to generate a clean monophonic audio chomp.
  • The second processor then runs a second stage hotword detector on the clean monophonic audio chomp to detect the hotword.
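The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method claimed in the application: the abstract does not specify the detectors or the noise cleaning algorithm, so `energy_score` (a mean-squared-amplitude score) and `average_channels` (sample-wise channel averaging) are hypothetical stand-ins, as are the thresholds and the assumption that a detection on any single channel is enough to trigger the second stage.

```python
from typing import Callable, List, Sequence

Channel = List[float]


def energy_score(samples: Sequence[float]) -> float:
    """Hypothetical stand-in for a hotword detector: mean squared amplitude."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def average_channels(chomp: Sequence[Channel]) -> Channel:
    """Hypothetical stand-in for noise cleaning: average the channels
    sample by sample to produce one monophonic signal."""
    return [sum(frame) / len(frame) for frame in zip(*chomp)]


def cascade_detect(
    chomp: Sequence[Channel],
    stage1_threshold: float,
    stage2_threshold: float,
    score: Callable[[Sequence[float]], float] = energy_score,
    clean: Callable[[Sequence[Channel]], Channel] = average_channels,
) -> bool:
    """Return True only if both stages accept the audio chomp."""
    # Stage 1: score every channel independently (assumption: any one
    # channel crossing the threshold triggers the second stage).
    if not any(score(channel) >= stage1_threshold for channel in chomp):
        return False
    # Stage 2: clean the multi-channel chomp into mono, then score again.
    mono = clean(chomp)
    return score(mono) >= stage2_threshold
```

With these stand-ins, two identical channels pass both stages, while two channels that are exact inverses of each other pass stage one but cancel to silence after averaging, so stage two rejects them. The point of the structure is that the second, more expensive stage only runs on audio the first stage has already flagged.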

Key Features and Innovation:

  • Multi-channel audio processing for hotword detection.
  • Two-stage hotword detection process split across two processors.
  • Noise cleaning algorithm that turns chomped raw audio data into a clean monophonic audio chomp.

Potential Applications:

  • Voice-controlled devices.
  • Speech recognition systems.
  • Security applications.

Problems Solved:

  • Efficient keyword detection in multi-channel audio.
  • Improved accuracy in identifying specific keywords.

Benefits:

  • Enhanced user experience with voice-controlled devices.
  • Increased security through accurate keyword detection.

Commercial Applications:

  • Smart home devices.
  • Virtual assistants.
  • Security systems.

Prior Art:

Prior research in multi-channel audio processing and keyword detection algorithms may provide insights into similar technologies.

Frequently Updated Research:

Stay updated on advancements in noise cleaning algorithms and hotword detection techniques for improved performance.

Questions about Multi-Channel Audio Processing:

  1. How does the two-stage hotword detection process improve keyword detection accuracy?
  2. What are the potential challenges in implementing multi-channel audio processing for keyword detection?


Original Abstract Submitted

A method includes receiving, at a first processor of a user device, streaming multi-channel audio captured by an array of microphones, each channel including respective audio features. For each channel, the method also includes processing, by the first processor, using a first stage hotword detector, the respective audio features to determine whether a hotword is detected. When the first stage hotword detector detects the hotword, the method also includes the first processor providing chomped raw audio data to a second processor that processes, using a first noise cleaning algorithm, the chomped raw audio data to generate a clean monophonic audio chomp. The method also includes processing, by the second processor using a second stage hotword detector, the clean monophonic audio chomp to detect the hotword.
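The abstract does not define "chomped" raw audio data. One plausible reading is a short excerpt cut from the end of the audio stream when the first stage fires. Under that assumption, a minimal sketch of how a first processor might buffer streaming multi-channel audio and cut out a chomp is shown below; the `ChompBuffer` class, its method names, and the fixed-size buffer are all hypothetical and are not taken from the application.

```python
from collections import deque
from typing import Deque, List, Tuple

Frame = Tuple[float, ...]  # one sample per microphone channel


class ChompBuffer:
    """Hypothetical fixed-size buffer holding the most recent audio frames."""

    def __init__(self, max_frames: int) -> None:
        # A deque with maxlen silently drops the oldest frame when full.
        self._frames: Deque[Frame] = deque(maxlen=max_frames)

    def push(self, frame: Frame) -> None:
        """Append one multi-channel frame from the stream."""
        self._frames.append(frame)

    def chomp(self, num_frames: int) -> List[List[float]]:
        """Return the last num_frames frames as one sample list per channel."""
        if num_frames <= 0:
            return []
        recent = list(self._frames)[-num_frames:]
        # Transpose from frame-major to channel-major layout.
        return [list(channel) for channel in zip(*recent)]
```

For example, after pushing the two-channel frames `(1.0, 10.0)`, `(2.0, 20.0)`, `(3.0, 30.0)`, and `(4.0, 40.0)` into a buffer of size 3, `chomp(2)` returns `[[3.0, 4.0], [30.0, 40.0]]`: the two most recent samples for each channel.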