Samsung Electronics Co., Ltd. (20240331715). SYSTEM AND METHOD FOR MASK-BASED NEURAL BEAMFORMING FOR MULTI-CHANNEL SPEECH ENHANCEMENT simplified abstract

From WikiPatents
Revision as of 15:50, 4 October 2024 by Wikipatents (talk | contribs) (Creating a new page)

SYSTEM AND METHOD FOR MASK-BASED NEURAL BEAMFORMING FOR MULTI-CHANNEL SPEECH ENHANCEMENT

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Ching-Hua Lee of Mountain View CA (US)

Chou-Chang Yang of San Jose CA (US)

Yilin Shen of San Jose CA (US)

Hongxia Jin of San Jose CA (US)

SYSTEM AND METHOD FOR MASK-BASED NEURAL BEAMFORMING FOR MULTI-CHANNEL SPEECH ENHANCEMENT - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240331715, titled 'SYSTEM AND METHOD FOR MASK-BASED NEURAL BEAMFORMING FOR MULTI-CHANNEL SPEECH ENHANCEMENT'.

The method described in the patent application involves processing noisy audio signals from multiple input devices to isolate clean speech audio.

  • Receiving noisy audio signals from various audio input devices during a specific time window.
  • Generating a noisy time-frequency representation based on the noisy audio signals.
  • Providing the noisy time-frequency representation to a mask estimation model trained to output a mask, which is used to predict a clean time-frequency representation of the clean speech audio.
  • Determining beamforming filter weights based on the mask output by the model.
  • Applying the beamforming filter weights to the noisy time-frequency representation to extract clean speech audio.
  • Outputting the clean speech audio for further use.
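
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the method claimed in the application: the abstract does not say which beamformer is used, so the sketch assumes a mask-based MVDR beamformer (one common choice), and `placeholder_mask_model` is a hypothetical energy-based stand-in for the trained mask estimation model. All function names and parameters (`n_fft`, `hop`, `ref`, `eps`) are illustrative.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """x: (channels, samples) -> complex STFT of shape (channels, frames, freqs)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (x.shape[1] - n_fft) // hop
    frames = np.stack(
        [x[:, i * hop:i * hop + n_fft] for i in range(n_frames)], axis=1
    )
    return np.fft.rfft(frames * window, axis=-1)

def placeholder_mask_model(Y):
    """Hypothetical stand-in for the trained mask estimation model.

    Returns a speech mask in [0, 1] of shape (frames, freqs), based only on
    how far each bin's power sits above the global median power.
    """
    power = (np.abs(Y) ** 2).mean(axis=0)
    noise_floor = np.median(power)
    return power / (power + noise_floor + 1e-12)

def mvdr_weights(Y, speech_mask, ref=0, eps=1e-6):
    """Assumed MVDR beamformer. Y: (C, T, F), mask: (T, F) -> weights (F, C)."""
    C = Y.shape[0]
    noise_mask = 1.0 - speech_mask
    # Mask-weighted spatial covariance matrices, one (C, C) matrix per frequency.
    phi_s = np.einsum('tf,ctf,dtf->fcd', speech_mask, Y, Y.conj())
    phi_s /= speech_mask.sum(axis=0)[:, None, None] + eps
    phi_n = np.einsum('tf,ctf,dtf->fcd', noise_mask, Y, Y.conj())
    phi_n /= noise_mask.sum(axis=0)[:, None, None] + eps
    # Diagonal loading keeps the noise covariance well conditioned.
    load = eps * np.trace(phi_n, axis1=1, axis2=2).real[:, None, None] / C
    phi_n = phi_n + (load + eps) * np.eye(C)
    ratio = np.linalg.solve(phi_n, phi_s)  # Phi_n^{-1} Phi_s, shape (F, C, C)
    trace = np.trace(ratio, axis1=1, axis2=2)[:, None]
    return ratio[:, :, ref] / (trace + eps)

def apply_beamformer(w, Y):
    """Enhanced STFT: w(f)^H y(t, f) for every time-frequency bin."""
    return np.einsum('fc,ctf->tf', w.conj(), Y)

# Synthetic example: a 440 Hz tone observed by 4 microphones with independent noise.
rng = np.random.default_rng(0)
t = np.arange(4096) / 16000
source = np.sin(2 * np.pi * 440 * t)
noisy = source[None, :] + 0.5 * rng.standard_normal((4, t.size))

Y = stft(noisy)                    # noisy time-frequency representation
mask = placeholder_mask_model(Y)   # mask from the (stand-in) model
w = mvdr_weights(Y, mask)          # beamforming filter weights from the mask
enhanced = apply_beamformer(w, Y)  # enhanced time-frequency representation
print(enhanced.shape)              # (31, 129)
```

The sketch stops at the enhanced time-frequency representation; an inverse STFT with overlap-add would be needed to turn it back into an audio waveform for output.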

Potential Applications:

  • Speech enhancement in noisy environments
  • Audio signal processing in telecommunication systems
  • Noise reduction in audio recording devices

Problems Solved:

  • Eliminating background noise from audio signals
  • Enhancing speech clarity in noisy conditions

Benefits:

  • Improved audio quality in challenging environments
  • Enhanced speech intelligibility for better communication
  • Increased accuracy in audio processing tasks

Commercial Applications:

Title: Advanced Speech Enhancement Technology for Telecommunication Systems

This technology can be used in telecommunication systems to improve speech quality in noisy environments, leading to better customer communication and user experience. It can also be integrated into audio recording devices for professional use in studios or at live events.

Prior Art:

  • Prior research in beamforming techniques for audio signal processing
  • Studies on mask estimation models for speech enhancement

Frequently Updated Research:

  • Ongoing advancements in deep learning models for audio signal processing
  • Research on real-time applications of speech enhancement technologies

Questions about Speech Enhancement Technology:

1. How does this technology compare to traditional noise reduction methods?
2. What are the potential limitations of this method in real-world applications?


Original Abstract Submitted

A method includes receiving, during a first time window, a set of noisy audio signals from a plurality of audio input devices. The method also includes generating a noisy time-frequency representation based on the set of noisy audio signals. The method further includes providing the noisy time-frequency representation as an input to a mask estimation model trained to output a mask used to predict a clean time-frequency representation of clean speech audio from the noisy time-frequency representation. The method also includes determining beamforming filter weights based on the mask. The method further includes applying the beamforming filter weights to the noisy time-frequency representation to isolate the clean speech audio from the set of noisy audio signals. In addition, the method includes outputting the clean speech audio.
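
The abstract does not state how the beamforming filter weights are computed from the mask. As one possible reading, and purely as an assumption, a mask-based MVDR formulation could look like the following. Here y(t,f) is the vector of microphone STFT coefficients at frame t and frequency f, m(t,f) is the mask in [0, 1], and u is a one-hot vector selecting a reference microphone; all of these symbols are introduced for illustration and do not appear in the application.

```latex
\begin{align}
\Phi_{S}(f) &= \frac{\sum_{t} m(t,f)\,\mathbf{y}(t,f)\,\mathbf{y}(t,f)^{\mathsf H}}
                    {\sum_{t} m(t,f)}, \\
\Phi_{N}(f) &= \frac{\sum_{t} \bigl(1 - m(t,f)\bigr)\,\mathbf{y}(t,f)\,\mathbf{y}(t,f)^{\mathsf H}}
                    {\sum_{t} \bigl(1 - m(t,f)\bigr)}, \\
\mathbf{w}(f) &= \frac{\Phi_{N}(f)^{-1}\,\Phi_{S}(f)}
                      {\operatorname{tr}\!\bigl(\Phi_{N}(f)^{-1}\,\Phi_{S}(f)\bigr)}\,\mathbf{u}, \\
\hat{s}(t,f) &= \mathbf{w}(f)^{\mathsf H}\,\mathbf{y}(t,f).
\end{align}
```

Under this reading, the mask only weights the speech and noise spatial covariance estimates; the clean speech estimate itself comes from applying the resulting linear filter w(f) to the noisy multi-channel representation.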