18058104. SPEECH DENOISING NETWORKS USING SPEECH AND NOISE MODELING simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

From WikiPatents

SPEECH DENOISING NETWORKS USING SPEECH AND NOISE MODELING

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Chou-Chang Yang of San Jose CA (US)

Ching-Hua Lee of Mountain View CA (US)

Rakshith Sharma Srinivasa of Sunnyvale CA (US)

Yashas Malur Saidutta of San Jose CA (US)

Yilin Shen of San Jose CA (US)

Hongxia Jin of San Jose CA (US)

SPEECH DENOISING NETWORKS USING SPEECH AND NOISE MODELING - A simplified explanation of the abstract

This abstract first appeared for US patent application 18058104 titled 'SPEECH DENOISING NETWORKS USING SPEECH AND NOISE MODELING'.

Simplified Explanation

The patent application describes a method for recovering clean speech from noisy speech signals by modeling the speech and the noise separately. The main steps in the abstract are:

  • The method involves using a processing device to obtain and analyze noisy speech signals.
  • Acoustic features are extracted from the noisy speech signals.
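  • A predicted speech mask is received from a speech mask prediction model based on a first subset of the acoustic features, and a predicted noise mask is received from a separate noise mask prediction model based on a second subset.
  • Predicted speech features are determined using the predicted speech mask, and predicted noise features are determined using the predicted noise mask.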
  • These predicted features are provided to a filtering mask prediction model.
  • A clean speech signal is generated using the predicted filtering mask output by the filtering mask prediction model.
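
The data flow in the steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the applicant's implementation: the abstract does not say what the acoustic features are, how the masks are applied, or what the models look like. The sketch assumes per-bin magnitudes as the input, element-wise multiplication for masking, and hypothetical feature names ("magnitude", "energy"). The three `toy_*` functions are simple threshold and ratio rules standing in for the trained prediction models, and conversion back to a waveform is omitted.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]       # frames x frequency bins
FeatureSet = Dict[str, Matrix]   # feature name -> values


def extract_features(magnitude: Matrix) -> FeatureSet:
    """Toy feature extraction; the feature names are assumptions."""
    energy = [[v * v for v in row] for row in magnitude]
    return {"magnitude": magnitude, "energy": energy}


def apply_mask(values: Matrix, mask: Matrix) -> Matrix:
    """Element-wise product (assumed way of applying a mask)."""
    return [[v * m for v, m in zip(vr, mr)] for vr, mr in zip(values, mask)]


def denoise(
    magnitude: Matrix,
    speech_model: Callable[[FeatureSet], Matrix],
    noise_model: Callable[[FeatureSet], Matrix],
    filtering_model: Callable[[Matrix, Matrix], Matrix],
    speech_keys: List[str],
    noise_keys: List[str],
) -> Matrix:
    feats = extract_features(magnitude)
    # Each mask model sees only its own subset of the acoustic features.
    speech_mask = speech_model({k: feats[k] for k in speech_keys})
    noise_mask = noise_model({k: feats[k] for k in noise_keys})
    # Predicted speech and noise features are derived from the two masks.
    speech_feats = apply_mask(magnitude, speech_mask)
    noise_feats = apply_mask(magnitude, noise_mask)
    # The filtering model combines both and predicts the final mask.
    filtering_mask = filtering_model(speech_feats, noise_feats)
    return apply_mask(magnitude, filtering_mask)


# Hypothetical stand-ins for the trained models.
def toy_speech_model(subset: FeatureSet) -> Matrix:
    return [[1.0 if v >= 0.5 else 0.0 for v in row] for row in subset["magnitude"]]


def toy_noise_model(subset: FeatureSet) -> Matrix:
    return [[1.0 if v < 0.25 else 0.0 for v in row] for row in subset["energy"]]


def toy_filtering_model(speech: Matrix, noise: Matrix) -> Matrix:
    # Ratio of predicted speech to predicted speech plus noise, per bin.
    return [
        [s / (s + n) if (s + n) > 0 else 0.0 for s, n in zip(sr, nr)]
        for sr, nr in zip(speech, noise)
    ]


noisy = [[0.9, 0.1], [0.2, 0.8]]
clean = denoise(
    noisy, toy_speech_model, toy_noise_model, toy_filtering_model,
    speech_keys=["magnitude"], noise_keys=["energy"],
)
print(clean)
```

In this toy run, the bins that the speech stand-in marks as speech pass through unchanged and the remaining bins are zeroed. In a real system each `toy_*` function would be replaced by a trained network.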

Potential applications of this technology:

  • Enhancing speech quality in noisy environments, such as in telecommunication systems, voice assistants, and hearing aids.
  • Improving speech recognition accuracy in noisy conditions, benefiting applications like voice-controlled devices and transcription services.

Problems solved by this technology:

  • Reducing background noise interference in speech signals, leading to clearer and more intelligible speech.
  • Minimizing the impact of noise on speech recognition systems, improving their performance and accuracy.

Benefits of this technology:

  • Improved speech quality and intelligibility in noisy environments, enhancing communication experiences.
  • Enhanced accuracy and reliability of speech recognition systems, leading to better user interactions and productivity.


Original Abstract Submitted

A method includes obtaining, using at least one processing device, noisy speech signals and extracting, using the at least one processing device, acoustic features from the noisy speech signals. The method also includes receiving, using the at least one processing device, a predicted speech mask from a speech mask prediction model based on a first acoustic feature subset and receiving, using the at least one processing device, a predicted noise mask from a noise mask prediction model based on a second acoustic feature subset. The method further includes providing, using the at least one processing device, predicted speech features determined using the predicted speech mask and predicted noise features determined using the predicted noise mask to a filtering mask prediction model. In addition, the method includes generating, using the at least one processing device, a clean speech signal using a predicted filtering mask output by the filtering mask prediction model.