US Patent Application 17757122. TRAINED GENERATIVE MODEL SPEECH CODING simplified abstract

From WikiPatents

TRAINED GENERATIVE MODEL SPEECH CODING

Organization Name

Google LLC


Inventor(s)

Willem Bastiaan Kleijn of Eastbourne Wellington (NZ)

Andrew Storus of San Francisco CA (US)

TRAINED GENERATIVE MODEL SPEECH CODING - A simplified explanation of the abstract

This abstract first appeared for US patent application 17757122 titled 'TRAINED GENERATIVE MODEL SPEECH CODING'.

Simplified Explanation

- The patent application describes a method for training a machine learning model to generate high-quality audio from a low bitrate input.
- The method involves receiving sampled audio data of utterances and using it to train the machine learning model.
- The training process includes reducing the impact of low-probability distortion events in the audio data on the trained model.
- This is achieved by including a term in the objective function of the model that encourages low-variance predictive distributions of the next audio sample based on previous samples.
- The goal is to improve the fidelity of the generated audio stream while using a low bitrate input.


Original Abstract Submitted

A method includes receiving sampled audio data corresponding to utterances and training a machine learning (ML) model, using the sampled audio data, to generate a high-fidelity audio stream from a low bitrate input bitstream. The training of the ML model includes de-emphasizing the influence of low-probability distortion events in the sampled audio data on the trained ML model, where the de-emphasizing of the distortion events is achieved by the inclusion of a term in an objective function of the ML model, which term encourages low-variance predictive distributions of a next sample in the sampled audio data, based on previous samples of the audio data.
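
Illustrative Sketch

The abstract does not give the form of the extra objective term, so the following is only a minimal sketch under stated assumptions, not the method claimed in the application. It assumes a Gaussian predictive distribution for the next sample, a plain negative log-likelihood as the base objective, and a hypothetical penalty equal to a weight times the mean predictive variance. The function names, the weight value, and the toy data are all invented for illustration. The toy data contains one large outlier standing in for a low-probability distortion event.

```python
import math


def gaussian_nll(x, mean, std):
    """Negative log-likelihood of x under a Gaussian N(mean, std^2)."""
    var = std * std
    return 0.5 * math.log(2.0 * math.pi * var) + (x - mean) ** 2 / (2.0 * var)


def objective(samples, means, stds, weight):
    """Mean NLL plus an assumed penalty on the predictive variance.

    `weight` is a hypothetical hyperparameter; weight=0 gives plain
    maximum-likelihood training.
    """
    n = len(samples)
    nll = sum(gaussian_nll(x, m, s) for x, m, s in zip(samples, means, stds)) / n
    variance_penalty = sum(s * s for s in stds) / n
    return nll + weight * variance_penalty


def best_shared_std(samples, means, weight):
    """Grid-search the single std (shared by all samples) that minimizes the objective."""
    candidates = [0.05 * k for k in range(1, 61)]  # 0.05 .. 3.00
    return min(
        candidates,
        key=lambda s: objective(samples, means, [s] * len(samples), weight),
    )


# Toy data: four well-predicted samples and one outlier (assumed distortion event).
samples = [0.1, -0.1, 0.05, -0.05, 3.0]
means = [0.0] * len(samples)  # pretend the model predicts 0 for every sample

plain_std = best_shared_std(samples, means, weight=0.0)
regularized_std = best_shared_std(samples, means, weight=0.5)
print("std without penalty:", plain_std)
print("std with penalty:   ", regularized_std)
```

Without the penalty, the best-fitting standard deviation is inflated by the single outlier. With the penalty, the objective settles on a narrower predictive distribution, which is one simple way to picture how such a term could reduce the influence of rare distortion events on the trained model.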