Adobe Inc. (20240257819). VOICE AUDIO COMPRESSION USING NEURAL NETWORKS simplified abstract

From WikiPatents

VOICE AUDIO COMPRESSION USING NEURAL NETWORKS

Organization Name

Adobe Inc.

Inventor(s)

Tiberiu Boros of Bucharest (RO)

Stefan Daniel Dumitrescu of Bucharest (RO)

Andrei Cotaie of Bucharest (RO)

Joseph Davidson of Farmington UT (US)

Alexandru Constantin Calistru of Magurele (RO)

Ionut Daniel Barbu of Bucharest (RO)

VOICE AUDIO COMPRESSION USING NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240257819, titled 'VOICE AUDIO COMPRESSION USING NEURAL NETWORKS'.

Simplified Explanation

The patent application discloses systems and methods for training an audio processing system to encode and decode speech audio using neural networks. The disclosed method comprises:

  • Receiving an audio sequence containing speech audio.
  • Generating pitch data to represent the detected pitch in the audio sequence.
  • Passing the audio sequence through an audio encoder to create a vector representation.
  • Using a vector quantizer to map the vector representation onto a codebook of discrete vectors, producing an encoded vector representation.
  • Reconstructing the audio sequence with an audio decoder, using the pitch data and the encoded vector representation.
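
The vector quantization step can be illustrated with a minimal sketch. It assumes a nearest-neighbour lookup under squared Euclidean distance, which is a common choice but is not stated in the abstract. The codebook, the frame vectors, and all function names are hypothetical; in the disclosed system the vectors would come from a trained neural audio encoder.

```python
import math

def quantize(vector, codebook):
    """Return the index of the codebook entry nearest to `vector`.

    Distance is squared Euclidean (an assumption for this sketch).
    """
    best_index, best_dist = 0, math.inf
    for index, entry in enumerate(codebook):
        dist = sum((v - e) ** 2 for v, e in zip(vector, entry))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def encode(frames, codebook):
    """Replace each continuous vector with a discrete codebook index."""
    return [quantize(frame, codebook) for frame in frames]

def decode(indices, codebook):
    """Look up the discrete vector for each transmitted index."""
    return [codebook[i] for i in indices]

# Hypothetical 2-D codebook with four entries.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Hypothetical encoder outputs, one vector per audio frame.
frames = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]

indices = encode(frames, CODEBOOK)
approximation = decode(indices, CODEBOOK)
print(indices)  # [0, 3, 2]
```

Only the indices need to be stored or transmitted; the decoder holds the same codebook and recovers an approximation of each vector from its index.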

Key Features and Innovation

  • Training an audio processing system to perform high-quality speech audio encoding and decoding.
  • Using neural networks to enhance the encoding and decoding process.
  • Generating pitch data to improve the accuracy of the reconstruction.
  • Utilizing a vector quantizer to efficiently encode the audio sequence.
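
As a rough illustration of why a codebook makes the encoding compact: each frame is sent as a single index, which costs ceil(log2(K)) bits for a codebook of K entries. The sketch below works through that arithmetic. The codebook size, frame rate, and pitch bits per frame are hypothetical figures chosen for the example and do not come from the patent application.

```python
def bits_per_index(codebook_size):
    """Bits needed to address one entry in a codebook of the given size."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (codebook_size - 1).bit_length()

def bitrate_bps(codebook_size, frames_per_second, pitch_bits_per_frame=0):
    """Bits per second for one codebook index plus pitch data per frame."""
    per_frame = bits_per_index(codebook_size) + pitch_bits_per_frame
    return frames_per_second * per_frame

# Hypothetical configuration: 1024-entry codebook, 50 frames per second,
# 8 bits of pitch data per frame.
compressed = bitrate_bps(1024, 50, pitch_bits_per_frame=8)

# Uncompressed reference: 16 kHz, 16-bit mono PCM.
uncompressed = 16000 * 16

print(compressed)    # 900
print(uncompressed)  # 256000
```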

Potential Applications

The technology can be applied in various fields such as:

  • Speech recognition systems
  • Audio compression algorithms
  • Voice-controlled devices

Problems Solved

  • Improving the quality of speech audio encoding and decoding.
  • Enhancing the efficiency of audio processing systems.
  • Carrying pitch information alongside the compressed representation so that it is available when the speech is reconstructed.

Benefits

  • Higher quality speech audio encoding and decoding.
  • Improved performance of audio processing systems.
  • Reconstruction that draws on explicit pitch data as well as the encoded vectors.

Commercial Applications

  • Title: Advanced Speech Audio Encoding and Decoding Technology
  • This technology can be used in industries such as telecommunications, entertainment, and artificial intelligence.
  • It can improve the performance of voice recognition software and audio streaming services.

Prior Art

Research on neural network-based audio processing systems and speech recognition technologies can provide insights into prior art related to this technology.

Frequently Updated Research

Stay updated on advancements in neural network algorithms, audio processing techniques, and speech recognition systems to enhance the capabilities of this technology.

Questions about Audio Processing Technology

1. How does this technology improve the efficiency of speech audio encoding and decoding?
2. What are the potential applications of neural network-based audio processing systems in the future?


Original Abstract Submitted

Embodiments are disclosed for training an audio processing system to perform high-quality speech audio encoding and decoding using neural networks. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an audio sequence, the audio sequence including speech audio, generating pitch data representing detected pitch within the audio sequence, passing the audio sequence through an audio encoder to generate a vector representation of the audio sequence, generating, by a vector quantizer, an encoded vector representation of the audio sequence using the vector representation of the audio sequence and a codebook of discrete vectors, and reconstructing, by an audio decoder, the audio sequence using the pitch data and the encoded vector representation of the audio sequence.
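
The abstract does not say how pitch is detected. One classical way to generate pitch data is autocorrelation, sketched below purely as an illustration; it is a stand-in technique, not the method claimed in the application. The function name, the search range of 50 to 500 Hz, and the test signal are all assumptions.

```python
import math

def estimate_pitch_hz(samples, sample_rate, min_hz=50.0, max_hz=500.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    Returns 0.0 when no positive correlation peak is found (e.g. silence).
    """
    min_lag = int(sample_rate / max_hz)  # shortest period considered
    max_lag = int(sample_rate / min_hz)  # longest period considered
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, min(max_lag, len(samples) - 1) + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

# Hypothetical test signal: 50 ms of a 200 Hz sine wave sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

pitch = estimate_pitch_hz(tone, SAMPLE_RATE)
silence_pitch = estimate_pitch_hz([0.0] * 400, SAMPLE_RATE)
```

Running the estimator once per frame would give a pitch track that could accompany the codebook indices, in the spirit of the pitch data described in the abstract.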