Google LLC (20240233713). GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS simplified abstract

From WikiPatents

GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

Organization Name

Google LLC

Inventor(s)

Andrea Agostinelli of Zurich (CH)

Timo Immanuel Denk of Zurich (CH)

Antoine Caillon of Paris (FR)

Neil Zeghidour of Paris (FR)

Jesse Engel of Orinda CA (US)

Mauro Verzetti of Dübendorf (CH)

Christian Frank of Zurich (CH)

Zalán Borsos of Zurich (CH)

Matthew Sharifi of Kilchberg (CH)

Adam Joseph Roberts of Durham NC (US)

Marco Tagliasacchi of Kilchberg (CH)

GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240233713, titled 'GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS'.

Simplified Explanation: The patent application describes methods, systems, and apparatus for generating a prediction of an audio signal using neural networks.

  • **Key Features and Innovation:**
   - Receiving a request to generate an audio signal based on an input.
   - Processing the input with an embedding neural network to map it to embedding tokens.
   - Generating a semantic representation of the audio signal.
   - Using generative neural networks to create an acoustic representation of the audio signal based on the semantic representation and embedding tokens.
   - Utilizing a decoder neural network to produce the prediction of the audio signal.
  • **Potential Applications:**
   - Speech recognition technology.
   - Music composition and generation.
   - Audio enhancement in video editing.
  • **Problems Solved:**
   - Efficient prediction of audio signals.
   - Improved accuracy in generating audio representations.
   - Streamlined audio processing workflows.
  • **Benefits:**
   - Enhanced audio signal prediction capabilities.
   - Automation of audio generation tasks.
   - Increased efficiency in audio processing.
  • **Commercial Applications:**
   - AI-powered audio editing software for professionals.
   - Speech-to-text transcription services.
   - Music production tools with predictive capabilities.
  • **Prior Art:**
   - Prior research in the field of neural networks and audio signal processing.
   - Existing patents related to audio prediction and generation technologies.
  • **Frequently Updated Research:**
   - Ongoing advancements in neural network architectures for audio signal processing.
   - Latest developments in generative neural networks for audio synthesis.

Questions about audio signal prediction technology:

1. What are the potential limitations of using neural networks for audio signal prediction?
2. How does this technology compare to traditional methods of audio signal processing and prediction?


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.
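
The data flow in the abstract can be illustrated with a toy sketch. This is not the patented implementation: every function name, the vocabulary size, the sequence lengths, and the "models" themselves are hypothetical stand-ins that use simple arithmetic in place of trained neural networks. The sketch also assumes a text input and assumes the semantic stage is conditioned on the embedding tokens, which the abstract does not state. It only shows how the stages connect and what "auto-regressive" means here: each new token is sampled given the conditioning and all previously generated tokens.

```python
import random
from typing import List, Sequence

VOCAB_SIZE = 16  # hypothetical size of the discrete token vocabulary


def embed_input(text: str, num_tokens: int = 4) -> List[int]:
    """Stand-in for the embedding neural network: input -> embedding tokens."""
    base = sum(ord(c) for c in text)
    return [(base + 7 * i) % VOCAB_SIZE for i in range(num_tokens)]


def autoregressive_generate(
    conditioning: Sequence[int], length: int, rng: random.Random
) -> List[int]:
    """Toy auto-regressive sampler.

    Each token is drawn from a distribution that depends on the
    conditioning tokens and on every token generated so far.
    """
    generated: List[int] = []
    for _ in range(length):
        context = sum(conditioning) + sum(generated)
        # Toy "model": uniform weights with one peak chosen by the context.
        weights = [1.0] * VOCAB_SIZE
        weights[context % VOCAB_SIZE] += 8.0
        generated.append(rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0])
    return generated


def decode(acoustic_tokens: Sequence[int], samples_per_token: int = 4) -> List[float]:
    """Stand-in for the decoder neural network: acoustic tokens -> waveform."""
    waveform: List[float] = []
    for token in acoustic_tokens:
        level = (token / (VOCAB_SIZE - 1)) * 2.0 - 1.0  # scale to [-1, 1]
        waveform.extend([level] * samples_per_token)
    return waveform


def generate_audio(text: str, seed: int = 0) -> List[float]:
    """Chain the stages: embed -> semantic -> acoustic -> decode."""
    rng = random.Random(seed)
    embedding_tokens = embed_input(text)
    # Assumption: the semantic stage is conditioned on the embedding tokens.
    semantic = autoregressive_generate(embedding_tokens, length=6, rng=rng)
    # The acoustic stage is conditioned on both the semantic representation
    # and the embedding tokens, as the abstract describes.
    acoustic = autoregressive_generate(semantic + embedding_tokens, length=12, rng=rng)
    return decode(acoustic)


if __name__ == "__main__":
    audio = generate_audio("a calm piano melody")
    print(len(audio), "samples")
```

In a real system each stand-in would be a trained network, and the token sequences would be far longer; the staged structure is the only part of this sketch taken from the abstract.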