18412394. GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS simplified abstract (GOOGLE LLC)

From WikiPatents

GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

Organization Name

GOOGLE LLC

Inventor(s)

Andrea Agostinelli of Zurich (CH)

Timo Immanuel Denk of Zurich (CH)

Antoine Caillon of Paris (FR)

Neil Zeghidour of Paris (FR)

Jesse Engel of Orinda CA (US)

Mauro Verzetti of Dübendorf (CH)

Christian Frank of Zurich (CH)

Zalán Borsos of Zurich (CH)

Matthew Sharifi of Kilchberg (CH)

Adam Joseph Roberts of Durham NC (US)

Marco Tagliasacchi of Kilchberg (CH)

GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18412394, titled 'GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS'.

Simplified Explanation: The patent application describes methods, systems, and apparatus for generating a prediction of an audio signal using neural networks.

Key Features and Innovation:
   * Receiving a request to generate an audio signal conditioned on an input.
   * Processing the input with an embedding neural network to map it to one or more embedding tokens.
   * Generating a semantic representation of the audio signal.
   * Using one or more generative neural networks, conditioned on at least the semantic representation and the embedding tokens, to generate an acoustic representation of the audio signal.
   * Processing at least the acoustic representation with a decoder neural network to generate the prediction of the audio signal.
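
The steps listed above can be read as a four-stage data flow. The following is only a minimal sketch of that flow, not the patented implementation: the class `AudioPipeline`, the `toy_*` functions, the integer token type, and the text input are all hypothetical stand-ins for trained neural networks. The abstract also does not say what the semantic representation is conditioned on; feeding it the embedding tokens is an assumption made here so the example runs end to end.

```python
from dataclasses import dataclass
from typing import Callable, List

Tokens = List[int]  # assumption: every representation is a list of integer tokens


@dataclass
class AudioPipeline:
    """Wires the four stages together; each stage is a hypothetical stand-in."""
    embed: Callable[[str], Tokens]                # input -> embedding tokens
    semantic: Callable[[Tokens], Tokens]          # -> semantic representation
    acoustic: Callable[[Tokens, Tokens], Tokens]  # (semantic, embedding) -> acoustic
    decode: Callable[[Tokens], List[float]]       # acoustic -> predicted samples

    def generate(self, request_input: str) -> List[float]:
        embedding_tokens = self.embed(request_input)
        # Assumption: the semantic stage is conditioned on the embedding tokens.
        semantic_repr = self.semantic(embedding_tokens)
        # The acoustic stage sees both the semantic representation and the tokens.
        acoustic_repr = self.acoustic(semantic_repr, embedding_tokens)
        return self.decode(acoustic_repr)


# Toy stages (arbitrary arithmetic, no learning) so the sketch is runnable.
def toy_embed(text: str) -> Tokens:
    return [ord(ch) % 16 for ch in text]


def toy_semantic(embedding_tokens: Tokens) -> Tokens:
    return [(t * 3 + 1) % 32 for t in embedding_tokens]


def toy_acoustic(semantic_repr: Tokens, embedding_tokens: Tokens) -> Tokens:
    offset = sum(embedding_tokens) % 64
    return [(s + offset) % 64 for s in semantic_repr]


def toy_decode(acoustic_repr: Tokens) -> List[float]:
    # Map token ids 0..63 to sample values in [-1.0, 1.0).
    return [t / 32.0 - 1.0 for t in acoustic_repr]


pipeline = AudioPipeline(toy_embed, toy_semantic, toy_acoustic, toy_decode)
samples = pipeline.generate("piano")
```

Keeping each stage as a separate callable mirrors the way the abstract separates the embedding, generative, and decoder networks, so any one stage could be swapped out without touching the others.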

Potential Applications: Because the method generates audio rather than transcribing it, this technology could be applied in speech synthesis systems, music generation software, and audio processing tools.

Problems Solved: The technology addresses the challenge of accurately predicting audio signals based on input data, improving the quality of generated audio.

Benefits: The benefits include enhanced audio signal prediction accuracy, improved performance of audio processing systems, and the ability to generate realistic audio outputs.

Commercial Applications: Potential commercial applications include text-to-speech software, music production tools, and virtual assistant technology.

Prior Art: Researchers can explore prior art related to neural network-based audio signal prediction systems and generative models in the field of audio processing.

Frequently Updated Research: Stay informed about the latest advancements in neural network technologies for audio signal processing and prediction to enhance the understanding and implementation of this innovation.

Questions about Audio Signal Prediction:
   1. How does the use of neural networks improve the accuracy of audio signal prediction?
   2. What are the potential limitations of using generative neural networks in audio signal prediction?


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.
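
The title calls the generative neural networks auto-regressive, which means each output token is sampled conditioned on the tokens generated before it as well as on the conditioning inputs. The abstract gives no architectural detail, so the following is only a generic sketch of auto-regressive sampling. The function names, the vocabulary size of 8, and the hand-written weighting rule are hypothetical and stand in for a trained network.

```python
import random
from typing import Callable, List, Sequence


def sample_autoregressive(
    next_token_weights: Callable[[Sequence[int], Sequence[int]], List[float]],
    conditioning: Sequence[int],
    length: int,
    rng: random.Random,
) -> List[int]:
    """Sample `length` tokens one at a time, each conditioned on the prefix so far."""
    generated: List[int] = []
    for _ in range(length):
        # The model sees the conditioning tokens and everything generated so far.
        weights = next_token_weights(conditioning, generated)
        token = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        generated.append(token)
    return generated


def toy_next_token_weights(
    conditioning: Sequence[int], prefix: Sequence[int], vocab_size: int = 8
) -> List[float]:
    # Hypothetical stand-in for a trained network: favour the token that
    # follows the previous one, seeded by the conditioning tokens.
    last = prefix[-1] if prefix else sum(conditioning) % vocab_size
    weights = [1.0] * vocab_size
    weights[(last + 1) % vocab_size] += 5.0
    return weights


tokens = sample_autoregressive(
    toy_next_token_weights, conditioning=[3, 1, 4], length=10, rng=random.Random(0)
)
```

Passing the random number generator in explicitly keeps the sampling reproducible for a fixed seed.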