20240021210. METHOD AND APPARATUS FOR NEURAL NETWORK BASED PROCESSING OF AUDIO USING SINUSOIDAL ACTIVATION simplified abstract (DOLBY INTERNATIONAL AB)

From WikiPatents

METHOD AND APPARATUS FOR NEURAL NETWORK BASED PROCESSING OF AUDIO USING SINUSOIDAL ACTIVATION

Organization Name

DOLBY INTERNATIONAL AB

Inventor(s)

Arijit Biswas of Schwaig bei Nuernberg (DE)

METHOD AND APPARATUS FOR NEURAL NETWORK BASED PROCESSING OF AUDIO USING SINUSOIDAL ACTIVATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240021210, titled 'METHOD AND APPARATUS FOR NEURAL NETWORK BASED PROCESSING OF AUDIO USING SINUSOIDAL ACTIVATION'.

Simplified Explanation

The abstract describes a method of processing an audio signal with a deep-learning-based generator. The audio signal is input to the generator, an encoder stage maps a time segment of the signal to a latent feature space representation, and a decoder stage upsamples that representation, with at least one decoder layer applying sinusoidal activation. The output of the decoder stage is the processed audio signal.

  • The method uses a deep-learning-based generator to process audio signals.
  • The encoder stage maps a time segment of the audio signal to a latent feature space representation.
  • The decoder stage upsamples the latent feature space representation.
  • At least one layer of the decoder stage applies sinusoidal activation.
  • The output from the decoder stage is a processed audio signal.
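
The steps listed above can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the abstract does not specify layer types, sizes, or weights, so the single strided convolution used as the encoder, the single transposed convolution used as the decoder, the fixed kernels, and all function names (conv1d_strided, upsample_transposed, sine_activation, generator) are assumptions chosen only to show the data flow. A real generator would use many learned layers.

```python
import math

def conv1d_strided(x, kernel, stride):
    """Assumed encoder layer: strided 1-D convolution that shortens the signal."""
    k = len(kernel)
    return [
        sum(kernel[j] * x[i + j] for j in range(k))
        for i in range(0, len(x) - k + 1, stride)
    ]

def upsample_transposed(z, kernel, stride):
    """Assumed decoder layer: transposed 1-D convolution that lengthens the signal.

    Each latent value adds a scaled copy of the kernel into the output.
    """
    out = [0.0] * ((len(z) - 1) * stride + len(kernel))
    for i, value in enumerate(z):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out

def sine_activation(values, omega=1.0):
    """Sinusoidal activation sin(omega * v); omega is a hypothetical frequency scale."""
    return [math.sin(omega * v) for v in values]

def generator(segment, enc_kernel, dec_kernel, stride=2, omega=1.0):
    """Encode a time segment to a latent representation, then upsample and apply sine."""
    latent = conv1d_strided(segment, enc_kernel, stride)            # encoder stage
    upsampled = upsample_transposed(latent, dec_kernel, stride)     # decoder upsampling
    processed = sine_activation(upsampled, omega)                   # sinusoidal activation
    return latent, processed

# Toy 8-sample time segment with fixed (not learned) kernels.
segment = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
latent, processed = generator(segment, enc_kernel=[0.5, 0.5], dec_kernel=[1.0, 1.0])
print(len(latent), len(processed))  # 4 8
```

With a stride of 2 and kernels of length 2, the 8-sample segment is compressed to 4 latent values and upsampled back to 8 output samples. Because the last step here is a sine, every output sample lies in the range from -1 to 1.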

Potential applications of this technology:

  • Audio signal processing and enhancement
  • Speech recognition and synthesis
  • Music production and remixing
  • Noise reduction and audio restoration

Problems solved by this technology:

  • Improving the quality and clarity of audio signals
  • Enhancing the intelligibility of speech in noisy environments
  • Removing unwanted noise and artifacts from audio recordings

Benefits of this technology:

  • Improved audio signal processing capabilities
  • Enhanced speech recognition and synthesis accuracy
  • Increased flexibility and creativity in music production
  • Improved audio quality in various applications


Original Abstract Submitted

Described herein is a method of processing an audio signal using a deep-learning-based generator, wherein the method includes the steps of: (a) inputting the audio signal into the generator for processing the audio signal; (b) mapping a time segment of the audio signal to a latent feature space representation, using an encoder stage of the generator; (c) upsampling the latent feature space representation using a decoder stage of the generator, wherein at least one layer of the decoder stage applies sinusoidal activation; and (d) obtaining, as an output from the decoder stage of the generator, a processed audio signal. Described are further a method for training said generator and respective apparatus, systems and computer program products.