Adobe Inc. (20240257819). VOICE AUDIO COMPRESSION USING NEURAL NETWORKS simplified abstract
VOICE AUDIO COMPRESSION USING NEURAL NETWORKS
Organization Name
Adobe Inc.
Inventor(s)
Tiberiu Boros of Bucharest (RO)
Stefan Daniel Dumitrescu of Bucharest (RO)
Andrei Cotaie of Bucharest (RO)
Joseph Davidson of Farmington UT (US)
Alexandru Constantin Calistru of Magurele (RO)
Ionut Daniel Barbu of Bucharest (RO)
VOICE AUDIO COMPRESSION USING NEURAL NETWORKS - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240257819, titled 'VOICE AUDIO COMPRESSION USING NEURAL NETWORKS'.
Simplified Explanation
The patent application discloses systems and methods for training an audio processing system to encode and decode speech audio using neural networks. The disclosed method comprises:
- Receiving an audio sequence containing speech audio.
- Generating pitch data to represent the detected pitch in the audio sequence.
- Passing the audio sequence through an audio encoder to create a vector representation.
- Encoding the vector representation with a vector quantizer that draws on a codebook of discrete vectors.
- Reconstructing the audio sequence with an audio decoder, using the pitch data and the encoded vector representation.
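The vector quantization step above can be illustrated with a minimal sketch. It assumes the common nearest-neighbour scheme, in which each encoder output vector is replaced by the index of the closest codebook entry. The abstract does not specify the distance measure, the codebook size, or how the codebook is learned, so the function names (`quantize`, `dequantize`) and the toy two-dimensional values here are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]


def dequantize(indices: Sequence[int], codebook: Sequence[Vector]) -> List[Vector]:
    """Look up the codebook vector for each stored index (decoder side)."""
    return [codebook[i] for i in indices]


# Hypothetical toy data: a 4-entry codebook and 4 "encoder output" vectors.
# A real system would use learned, much higher-dimensional vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (0.0, -1.0)]
frames = [(0.9, 1.2), (-0.1, 0.1), (-0.8, 0.7), (0.2, -0.9)]

indices = quantize(frames, codebook)        # compact representation: [1, 0, 2, 3]
approximation = dequantize(indices, codebook)
```

Only the integer indices need to be stored or transmitted. In the disclosed method, the decoder reconstructs the audio from the encoded vector representation together with the pitch data.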
Key Features and Innovation
- Training an audio processing system to perform high-quality speech audio encoding and decoding.
- Using neural networks to enhance the encoding and decoding process.
- Generating pitch data to improve the accuracy of the reconstruction.
- Utilizing a vector quantizer to efficiently encode the audio sequence.
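To see why a codebook of discrete vectors yields a compact encoding, consider the arithmetic: one codebook index costs about log2(codebook size) bits, regardless of the dimensionality of the vector it stands for. The sketch below works through this with hypothetical numbers (1024 codebook entries, 50 frames per second, 16 kHz 16-bit mono PCM as the uncompressed baseline). None of these figures comes from the patent application, and the bits needed for the pitch data are not counted.

```python
import math


def vq_bitrate_bps(codebook_size: int, frames_per_second: float,
                   indices_per_frame: int = 1) -> float:
    """Bits per second needed to send codebook indices (pitch data excluded)."""
    bits_per_index = math.ceil(math.log2(codebook_size))
    return bits_per_index * indices_per_frame * frames_per_second


# Hypothetical configuration, chosen only for illustration.
vq_bps = vq_bitrate_bps(codebook_size=1024, frames_per_second=50)  # 10 bits * 50 = 500.0
pcm_bps = 16_000 * 16                                              # 256000 bits per second
compression_ratio = pcm_bps / vq_bps                               # 512.0
```

Under these assumed values the index stream is 512 times smaller than the raw PCM. Actual figures depend on the codebook size, the frame rate, and the cost of the pitch data.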
Potential Applications
The technology can be applied in various fields such as:
- Speech recognition systems
- Audio compression algorithms
- Voice-controlled devices
Problems Solved
- Improving the quality of speech audio encoding and decoding.
- Enhancing the efficiency of audio processing systems.
- Preserving pitch information in compressed speech by supplying detected pitch data to the decoder.
Benefits
- Higher quality speech audio encoding and decoding.
- Improved performance of audio processing systems.
- Reconstructed speech that better retains the original pitch, because the decoder receives explicit pitch data.
Commercial Applications
Advanced speech audio encoding and decoding technology of this kind could be used in:
- Industries such as telecommunications, entertainment, and artificial intelligence.
- Voice recognition software and audio streaming services, where more compact speech encoding may reduce bandwidth and storage needs.
Prior Art
Research on neural network-based audio processing systems and speech recognition technologies can provide insights into prior art related to this technology.
Frequently Updated Research
Stay updated on advancements in neural network algorithms, audio processing techniques, and speech recognition systems to enhance the capabilities of this technology.
Questions about Audio Processing Technology
1. How does this technology improve the efficiency of speech audio encoding and decoding?
2. What are the potential applications of neural network-based audio processing systems in the future?
Original Abstract Submitted
Embodiments are disclosed for training an audio processing system to perform high-quality speech audio encoding and decoding using neural networks. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an audio sequence, the audio sequence including speech audio, generating pitch data representing detected pitch within the audio sequence, passing the audio sequence through an audio encoder to generate a vector representation of the audio sequence, generating, by a vector quantizer, an encoded vector representation of the audio sequence using the vector representation of the audio sequence and a codebook of discrete vectors, and reconstructing, by an audio decoder, the audio sequence using the pitch data and the encoded vector representation of the audio sequence.
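The abstract does not say how the pitch data is generated. As an illustration only, the sketch below uses a classic autocorrelation pitch estimator as a stand-in: it looks for the lag at which a frame best matches a shifted copy of itself and converts that lag to a frequency. The function name `estimate_pitch_hz`, the 60 to 400 Hz search range, and the voicing threshold are all assumptions, not details from the patent application.

```python
import math
from typing import Optional, Sequence


def estimate_pitch_hz(frame: Sequence[float], sample_rate: int,
                      fmin: float = 60.0, fmax: float = 400.0,
                      voicing_threshold: float = 0.3) -> Optional[float]:
    """Estimate the pitch of one frame by autocorrelation; None if unvoiced."""
    min_lag = int(sample_rate / fmax)                       # shortest period searched
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)  # longest period searched
    energy = sum(s * s for s in frame)
    if energy == 0.0 or min_lag < 1 or max_lag <= min_lag:
        return None
    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    # Treat weakly periodic frames as unvoiced (assumed threshold).
    if best_lag == 0 or best_corr / energy < voicing_threshold:
        return None
    return sample_rate / best_lag


# Hypothetical test signal: 50 ms of a 200 Hz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
pitch = estimate_pitch_hz(tone, sample_rate)
silence_pitch = estimate_pitch_hz([0.0] * 400, sample_rate)
```

Running such an estimator frame by frame would produce a pitch track that a decoder could receive alongside the quantized codes. A production system might use a more robust or learned pitch detector.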