Google LLC (20250022477). COMPRESSING AUDIO WAVEFORMS USING A STRUCTURED LATENT SPACE
Organization Name
Google LLC
Inventor(s)
Félix De Chaumont Quitry of Zürich CH
Marco Tagliasacchi of Kilchberg CH
This abstract first appeared for US patent application 20250022477, titled 'COMPRESSING AUDIO WAVEFORMS USING A STRUCTURED LATENT SPACE'.
Original Abstract Submitted
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network and a decoder neural network. In one aspect, a method includes: obtaining a first initial audio waveform and a first noisy audio waveform; obtaining a second initial audio waveform and a second noisy audio waveform; processing the first noisy audio waveform and the second noisy audio waveform using an encoder neural network; generating a blended embedding by concatenating (i) clean feature dimensions from an embedding of the first noisy audio waveform and (ii) noise feature dimensions from an embedding of the second noisy audio waveform; processing the blended embedding using a decoder neural network to generate a reconstructed audio waveform; determining gradients of an objective function; and updating parameter values of the encoder neural network and the decoder neural network using the gradients.
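The blending step named in the abstract can be illustrated with a minimal sketch. It is not the applicants' implementation. The function name `blend_embeddings`, the parameter `num_clean_dims`, and the layout in which the leading feature dimensions carry clean content and the trailing ones carry noise are all assumptions made for illustration. The abstract does not specify how the dimensions are partitioned, and the arrays below are toy stand-ins for encoder outputs.

```python
import numpy as np


def blend_embeddings(emb_first, emb_second, num_clean_dims):
    """Concatenate clean dims of emb_first with noise dims of emb_second.

    Assumption (not from the abstract): the first `num_clean_dims`
    feature dimensions hold clean content, the remaining ones hold noise.
    """
    if emb_first.shape != emb_second.shape:
        raise ValueError("embeddings must have the same shape")
    if not 0 < num_clean_dims < emb_first.shape[-1]:
        raise ValueError("num_clean_dims must split the feature axis")
    clean_part = emb_first[..., :num_clean_dims]   # from the first waveform
    noise_part = emb_second[..., num_clean_dims:]  # from the second waveform
    return np.concatenate([clean_part, noise_part], axis=-1)


# Toy stand-ins for encoder outputs: 4 frames x 6 feature dimensions.
emb_a = np.zeros((4, 6))  # embedding of the first noisy waveform
emb_b = np.ones((4, 6))   # embedding of the second noisy waveform
blended = blend_embeddings(emb_a, emb_b, num_clean_dims=4)
```

In a full training step, the blended embedding would be passed to the decoder, and the gradients of the objective function would update both networks. The abstract leaves the form of that objective unspecified, so it is omitted here.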