18396788. Multichannel Audio Speech Classification simplified abstract (Microsoft Technology Licensing, LLC)

From WikiPatents

Multichannel Audio Speech Classification

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Oron Nir of Herzeliya (IL)

Inbal Sagiv of Kfar-Saba (IL)

Maayan Yedidia of Ramat Gan (IL)

Fardau Van Neerden of Driel (NL)

Itai Norman of Tel Aviv (IL)

Multichannel Audio Speech Classification - A simplified explanation of the abstract

This abstract first appeared for US patent application 18396788, titled 'Multichannel Audio Speech Classification'.

The present disclosure describes systems and methods for multichannel audio speech classification. An audio signal with multiple audio channels is received, and each channel is transcoded to a predefined audio format. An average power value is calculated for each audio channel over one or more data windows, and a correlation value is determined between each channel's average power and the combined average power of the other channels. The correlation values are compared against a threshold to decide whether the audio signal is classified as speech-based communication, and an associated action may be performed based on that classification.

  • Audio signal with multiple channels received and transcoded
  • Average power values calculated for each channel
  • Correlation values determined between each channel's average power and the combined average power of the other channels
  • Comparison against threshold for speech-based communication classification
  • Associated actions triggered based on classification
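
The power and correlation steps listed above can be sketched in Python. This is a minimal illustration, not the method claimed in the application: the function names, the non-overlapping windows, mean squared amplitude as the "average power", averaging as the way to combine the other channels, and Pearson correlation as the correlation measure are all assumptions, since the abstract does not specify them. Transcoding is assumed to have already happened, so each channel is a plain list of samples.

```python
import math


def window_average_power(samples, window_size):
    """Mean squared amplitude of each full, non-overlapping window (assumed definition)."""
    powers = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        powers.append(sum(x * x for x in window) / window_size)
    return powers


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def channel_correlations(channels, window_size):
    """Correlate each channel's windowed power with the mean power of the other channels."""
    if len(channels) < 2:
        raise ValueError("at least two channels are required")
    powers = [window_average_power(ch, window_size) for ch in channels]
    n_windows = min(len(p) for p in powers)
    if n_windows == 0:
        raise ValueError("channels are shorter than one window")
    correlations = []
    for i, own in enumerate(powers):
        others = [p for j, p in enumerate(powers) if j != i]
        # Assumption: "combined" power is the per-window mean across the other channels.
        combined = [sum(p[w] for p in others) / len(others) for w in range(n_windows)]
        correlations.append(pearson(own[:n_windows], combined))
    return correlations
```

With two channels whose activity strictly alternates, the per-window powers move in opposite directions and this sketch returns correlations near -1. Two identical channels with varying power return values near +1.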
Potential Applications

This technology can be used in speech recognition systems, audio surveillance, and voice-controlled devices.

Problems Solved

This technology addresses the need for accurate and efficient speech classification in multichannel audio signals.

Benefits

Claimed benefits include improved accuracy in speech classification, enhanced performance in audio processing systems, and a better user experience in voice-controlled devices.

Commercial Applications

Title: "Enhanced Speech Classification Technology for Audio Systems"

This technology can be applied in security systems, smart home devices, and telecommunication equipment to enhance speech recognition capabilities and improve user interactions.

Prior Art

Prior research in audio signal processing, speech recognition, and machine learning algorithms may provide insights into similar technologies.

Frequently Updated Research

Stay updated on advancements in audio signal processing, machine learning models for speech classification, and applications of multichannel audio analysis in various industries.

Questions about Multichannel Audio Speech Classification

1. How does this technology improve the accuracy of speech classification in audio signals?
2. What are the potential applications of multichannel audio speech classification beyond speech recognition systems?


Original Abstract Submitted

Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
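
The final comparison described in the abstract, either per channel or on an aggregated correlation value, might look like the sketch below. Several points are assumptions rather than details from the application: the names `is_speech` and `classify_and_act`, the use of the mean as the aggregate, the rule that every compared value must pass, and the direction of the comparison. The abstract does not say whether speech corresponds to values below or above the threshold; this sketch assumes below, on the guess that speakers who take turns on separate channels produce anti-correlated power.

```python
def is_speech(correlations, threshold, aggregate=False):
    """Return True if the signal is classified as speech-based communication.

    Assumptions (not stated in the abstract): values *below* the threshold
    indicate speech, the aggregate is the mean, and all values must pass.
    """
    if not correlations:
        raise ValueError("at least one correlation value is required")
    if aggregate:
        values = [sum(correlations) / len(correlations)]
    else:
        values = list(correlations)
    return all(value < threshold for value in values)


def classify_and_act(correlations, threshold, on_speech, aggregate=False):
    """Run the hypothetical on_speech callback when the signal is classified as speech."""
    if is_speech(correlations, threshold, aggregate):
        return on_speech()
    return None
```

A caller would pass in whatever action fits the application, for example `classify_and_act([-0.9, -0.8], 0.0, lambda: "transcribe")`, where the `"transcribe"` label is purely illustrative.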