Joint Acoustic Echo Cancellation (AEC) and Personalized Noise Suppression (PNS)

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Sefik Emre Eskimez of Bellevue WA (US)

Takuya Yoshioka of Bellevue WA (US)

Huaming Wang of Clyde Hill WA (US)

Alex Chenzhi Ju of Seattle WA (US)

Min Tang of Redmond WA (US)

Tanel Pärnamaa of Tallinn (EE)

Joint Acoustic Echo Cancellation (AEC) and Personalized Noise Suppression (PNS) - A simplified explanation of the abstract

This abstract first appeared for US patent application 18172017, titled 'Joint Acoustic Echo Cancellation (AEC) and Personalized Noise Suppression (PNS)'.

Simplified Explanation

The patent application describes a data processing system that uses machine learning to perform personalized noise suppression and acoustic echo cancellation in online communication sessions.

The system receives a far-end signal and a near-end signal, each associated with a different computing device participating in an online communication session.
The near-end signal includes speech of a target speaker, speech of an interfering speaker, and an echo signal.
The system provides the signals and an indication of the target speaker as input to a machine learning model.
The model is trained to analyze the signals and remove speech from interfering speakers and echoes, outputting the speech of the target speaker.
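
The signal flow described above can be sketched in a few lines of Python. This is only an illustration of the inputs and output named in the abstract, not the method claimed in the application: the function names, the echo path, the speaker embedding vector, and the pass-through placeholder model are all hypothetical, and a real system would substitute a trained neural network.

```python
import numpy as np

def make_near_end(target, interferer, far_end, echo_path):
    """Build a toy near-end signal: target speech + interfering speech + echo.

    The echo is modeled (as an assumption) as the far-end signal convolved
    with a short echo path, truncated to the original length.
    """
    echo = np.convolve(far_end, echo_path)[: len(far_end)]
    return target + interferer + echo

def run_joint_model(model, far_end, near_end, speaker_embedding):
    """Feed the three inputs named in the abstract to a model and return its output."""
    return model(far_end, near_end, speaker_embedding)

def passthrough_model(far_end, near_end, speaker_embedding):
    """Hypothetical placeholder: returns the near-end signal unchanged.

    A trained model would instead return an estimate of the target speech.
    """
    return near_end.copy()

# Toy data (illustrative values only).
far_end = np.array([1.0, 0.0, 0.0, 0.0])
echo_path = np.array([0.0, 0.5])          # one-sample delay, half amplitude
target = np.array([0.1, 0.2, 0.3, 0.4])
interferer = np.array([0.0, 0.0, 0.1, 0.0])
speaker_embedding = np.zeros(8)           # stand-in for the target speaker indication

near_end = make_near_end(target, interferer, far_end, echo_path)
estimate = run_joint_model(passthrough_model, far_end, near_end, speaker_embedding)
```

The point of the sketch is the interface: one model call takes the far-end signal, the near-end signal, and the target speaker indication together, rather than chaining a separate echo canceller and noise suppressor.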

Potential Applications

This technology could be applied in:

Video conferencing platforms
Virtual classrooms
Telemedicine services

Problems Solved

Eliminating background noise and echoes in online communication
Improving speech clarity and overall audio quality

Benefits

Enhanced user experience in online communication
Improved efficiency in virtual meetings and collaborations

Potential Commercial Applications

Integration into video conferencing software
Licensing the technology to communication service providers

Possible Prior Art

Possible prior art includes noise cancellation algorithms used in audio processing software.

Unanswered Questions

How does the system differentiate between the target speaker and interfering speakers in real-time communication?

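The abstract states only that an indication of the target speaker is provided as input to the machine learning model; it does not say what form that indication takes. The system likely relies on speaker identification techniques to isolate the speech of the target speaker.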

What is the computational overhead of running the machine learning model in real-time communication sessions?

The system may require significant processing power to analyze and process audio signals in real time, potentially impacting the performance of the communication session.
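
One common way to put a number on this overhead is the real-time factor: the time spent processing a chunk of audio divided by the duration of that chunk. A value below 1.0 means processing keeps pace with the incoming audio. The sketch below works through the arithmetic; the frame size, sample rate, and processing time are illustrative assumptions, not figures from the patent application.

```python
def frame_budget_seconds(frame_samples: int, sample_rate_hz: int) -> float:
    """Duration of one audio frame, i.e. the time available to process it."""
    return frame_samples / sample_rate_hz

def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Compute time divided by audio duration; below 1.0 keeps up with real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds

# Assumed values: 160-sample frames at 16 kHz, 4 ms of compute per frame.
budget = frame_budget_seconds(160, 16_000)   # 10 ms per frame
rtf = real_time_factor(0.004, budget)        # roughly 0.4
keeps_up = rtf < 1.0
```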

Original Abstract Submitted

A data processing system implements receiving a far-end signal associated with a first computing device participating in an online communication session and receiving a near-end signal associated with a second computing device participating in the online communication session. The near-end signal includes speech of a target speaker, a first interfering speaker, and an echo signal. The system further implements providing the far-end signal, the near-end signal, and an indication of the target speaker as an input to a machine learning model. The machine learning model is trained to analyze the far-end signal and the near-end signal to perform personalized noise suppression (PNS) to remove speech from one or more interfering speakers and acoustic echo cancellation (AEC) to remove echoes. The model is trained to output an audio signal comprising speech of the target speaker. The system obtains the audio signal comprising the speech of the target speaker from the model.

