TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED (20240290338). SPEECH PROCESSING simplified abstract

SPEECH PROCESSING

Organization Name

TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventor(s)

Jun Huang of Shenzhen (CN)

Yannan Wang of Shenzhen (CN)

SPEECH PROCESSING - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240290338 titled 'SPEECH PROCESSING'.

Simplified Explanation

An audio processing method enhances audio data using an audio enhancement model that has been iteratively trained with a deep clustering loss function and a mask inference loss function. The model generates a target audio feature from an initial audio feature, and target audio data with reduced noise and reverberation is calculated from that target feature.

  • Initial audio feature obtained from initial audio data
  • Initial audio feature input to an audio enhancement model
  • Model iteratively trained with deep clustering and mask inference loss functions
  • Model generates a target audio feature from the initial audio feature
  • Target audio data with reduced noise and reverberation calculated from the target audio feature
  • Target audio data output
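
The steps above can be sketched as a short pipeline. This is only an illustration under stated assumptions: the abstract does not say what the audio feature is, so a short-time Fourier transform (STFT) magnitude is assumed here, and `placeholder_model` is a hypothetical stand-in for the trained enhancement model, not the method claimed in the application.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Frame the signal, apply a periodic Hann window, and take a real FFT per frame."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, n_fft // 2 + 1)

def istft(spec, n_fft=256, hop=128):
    """Overlap-add inverse. A periodic Hann window at 50% overlap sums to 1,
    so interior samples are reconstructed without extra normalization."""
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + n_fft)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
    return out

def placeholder_model(feature):
    """Hypothetical stand-in for the trained model: returns a mask in [0, 1)
    that attenuates time-frequency bins with low magnitude."""
    return feature / (feature + np.median(feature) + 1e-8)

def enhance(audio):
    spec = stft(audio)
    initial_feature = np.abs(spec)             # assumed initial audio feature
    mask = placeholder_model(initial_feature)  # model inference (placeholder)
    target_feature = mask * initial_feature    # assumed target audio feature
    phase = np.exp(1j * np.angle(spec))        # reuse the input phase
    return istft(target_feature * phase)       # target audio data
```

Reusing the input phase for reconstruction is a common simplification in mask-based enhancement; the abstract does not state how phase is handled.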

Key Features and Innovation

  • Utilizes deep clustering and mask inference loss functions for training
  • Reduces noise and reverberation in audio data
  • Generates target audio data based on initial audio features
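
To make the two training objectives concrete, here is a minimal sketch of how a deep clustering loss and a mask inference loss are commonly formulated and combined. The abstract does not give the formulas, so the specific forms below (the Frobenius-norm affinity loss, a mean squared error on masked magnitudes, and the weighting parameter `alpha`) are assumptions drawn from common practice, not the definitions used in the application.

```python
import numpy as np

def deep_clustering_loss(embeddings, labels):
    """Affinity loss ||V V^T - Y Y^T||_F^2, computed in low-rank form.

    embeddings: (N, D) array, one embedding per time-frequency bin.
    labels:     (N, C) one-hot array marking the dominant source per bin.
    """
    vtv = embeddings.T @ embeddings
    vty = embeddings.T @ labels
    yty = labels.T @ labels
    return np.sum(vtv ** 2) - 2 * np.sum(vty ** 2) + np.sum(yty ** 2)

def mask_inference_loss(mask, noisy_mag, clean_mag):
    """Mean squared error between the masked noisy magnitude and the clean magnitude."""
    return np.mean((mask * noisy_mag - clean_mag) ** 2)

def combined_loss(embeddings, labels, mask, noisy_mag, clean_mag, alpha=0.5):
    """Weighted sum of both objectives; alpha is a hypothetical tuning parameter."""
    dc = deep_clustering_loss(embeddings, labels)
    mi = mask_inference_loss(mask, noisy_mag, clean_mag)
    return alpha * dc + (1 - alpha) * mi
```

The low-rank form avoids building the N-by-N affinity matrices, which matters because N is the number of time-frequency bins. In this sketch the two terms are on different scales (a sum versus a mean), so a real training setup would likely normalize them before weighting.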

Potential Applications

  • Audio post-processing in music production
  • Noise reduction in speech recognition systems
  • Enhancing audio quality in video conferencing applications

Problems Solved

  • Improves audio quality by reducing noise and reverberation
  • Enhances the clarity of audio data for various applications

Benefits

  • Improved audio quality for better user experience
  • Enhanced performance of audio processing systems
  • Increased accuracy in speech recognition and audio analysis tasks

Commercial Applications

  • Audio enhancement software for content creators
  • Integration into communication devices for clearer audio transmission
  • Implementation in smart home devices for improved voice recognition

Questions about Audio Processing

How does the audio enhancement model reduce noise and reverberation in audio data?

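The model is iteratively trained with deep clustering and mask inference loss functions. Once trained, it generates a target audio feature from the initial audio feature, and target audio data with reduced noise and reverberation is calculated from that feature.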

What are the potential applications of this audio processing technology beyond noise reduction?

This technology can be applied in various fields such as music production, speech recognition, and video conferencing for enhancing audio quality and clarity.


Original Abstract Submitted

In an audio processing method, an initial audio feature of initial audio data is obtained. The initial audio feature is input to an audio enhancement model. The audio enhancement model is iteratively trained based on a deep clustering loss function and a mask inference loss function. Target audio data with reduced noise and reverberation is calculated according to a target audio feature. The target audio feature is generated by the audio enhancement model based on the initial audio feature. The target audio data is output.