CONTEXT-BASED SPEECH ENHANCEMENT

Inventors

CONTEXT-BASED SPEECH ENHANCEMENT - A simplified explanation of the abstract

This abstract appeared for patent application number 18334641, titled 'CONTEXT-BASED SPEECH ENHANCEMENT'.

Simplified Explanation

This abstract describes a device that enhances speech using visual context. It analyzes image data to detect at least one of an emotion, a speaker characteristic, or a noise type, and generates context data from what it detects. The device then obtains spectral data from an input signal representing sound that includes speech, and processes that spectral data together with the context data using a multi-encoder transformer. The result is output spectral data representing a speech-enhanced version of the input signal.
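
To make the "context data" step concrete, here is a minimal sketch of one way the detected labels could be packed into a single vector. The abstract does not say how context data is represented, so the one-hot encoding, the label vocabularies, and the names `Detections`, `one_hot`, and `build_context` are all assumptions made for illustration. Because the abstract only requires "at least one of" the three detections, every field is optional.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical label vocabularies; the abstract does not enumerate any.
EMOTIONS = ["neutral", "happy", "angry"]
SPEAKER_TRAITS = ["adult", "child"]
NOISE_TYPES = ["street", "cafe", "wind"]


@dataclass
class Detections:
    """Labels an image analyzer might report; any field may be absent."""
    emotion: Optional[str] = None
    speaker_trait: Optional[str] = None
    noise_type: Optional[str] = None


def one_hot(label: Optional[str], vocab: List[str]) -> List[float]:
    """Encode a label as a one-hot vector, or all zeros if nothing was detected."""
    return [1.0 if label == item else 0.0 for item in vocab]


def build_context(d: Detections) -> List[float]:
    """Concatenate the per-category encodings into one context vector."""
    return (one_hot(d.emotion, EMOTIONS)
            + one_hot(d.speaker_trait, SPEAKER_TRAITS)
            + one_hot(d.noise_type, NOISE_TYPES))


# Only an emotion and a noise type were detected in this example.
context = build_context(Detections(emotion="happy", noise_type="cafe"))
print(context)  # [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
```

A learned embedding could replace the one-hot encoding; the fixed-length vector is used here only because it is easy to inspect.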

Original Abstract Submitted

A device to perform speech enhancement includes one or more processors configured to process image data to detect at least one of an emotion, a speaker characteristic, or a noise type. The one or more processors are also configured to generate context data based at least in part on the at least one of the emotion, the speaker characteristic, or the noise type. The one or more processors are further configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and the context data to generate output spectral data that represents a speech enhanced version of the input signal.
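
The data flow in the abstract can be sketched with a toy example: one encoder for the input spectral data, a second encoder for the context data, and a decoding step that attends to both. This is not the architecture claimed in the application. The weights are untrained placeholders, the frame size and mixing rule are arbitrary, and the function names (`spectral_frames`, `speech_encoder`, `context_encoder`, `enhance`) are hypothetical. It only illustrates how two encoded streams might be combined into output spectral data.

```python
import cmath
import math
from typing import List

Vector = List[float]


def magnitude_spectrum(frame: Vector) -> Vector:
    """Magnitudes of the non-negative-frequency DFT bins of one frame."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_frames(signal: Vector, size: int) -> List[Vector]:
    """Split the signal into non-overlapping frames and transform each one."""
    return [magnitude_spectrum(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]


def attend(query: Vector, memory: List[Vector]) -> Vector:
    """Scaled dot-product attention of one query over a list of memory vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * m for q, m in zip(query, mem)) / scale for mem in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [sum(w * mem[j] for w, mem in zip(weights, memory)) / total
            for j in range(len(query))]


def speech_encoder(frames: List[Vector]) -> List[Vector]:
    """First encoder: one self-attention pass over the spectral frames."""
    return [attend(f, frames) for f in frames]


def context_encoder(context: Vector, dim: int) -> List[Vector]:
    """Second encoder: one token per context entry (untrained placeholder weights)."""
    return [[c * math.cos(i + j) for j in range(dim)]
            for i, c in enumerate(context)]


def enhance(frames: List[Vector], context: Vector) -> List[Vector]:
    """Decode each frame by attending to both encoder outputs and averaging."""
    speech_memory = speech_encoder(frames)
    context_memory = context_encoder(context, len(frames[0]))
    output = []
    for frame in frames:
        from_speech = attend(frame, speech_memory)
        from_context = attend(frame, context_memory)
        output.append([(a + b) / 2 for a, b in zip(from_speech, from_context)])
    return output


# A 32-sample sine wave stands in for the input signal.
signal = [math.sin(2 * math.pi * t / 8) for t in range(32)]
frames = spectral_frames(signal, size=8)   # input spectral data
context = [0.0, 1.0, 0.0]                  # stand-in context data
output = enhance(frames, context)          # output spectral data
print(len(output), len(output[0]))  # 4 5
```

In a trained system the averaging step would be replaced by learned layers, and the output spectral data would be converted back to a waveform. Neither step is described in the abstract, so both are left out here.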

US Patent Application 18334641. CONTEXT-BASED SPEECH ENHANCEMENT simplified abstract
