Microsoft Technology Licensing, LLC (20240312477). Multichannel Audio Speech Classification simplified abstract

From WikiPatents

Multichannel Audio Speech Classification

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Oron Nir of Herzeliya (IL)

Inbal Sagiv of Kfar-Saba (IL)

Maayan Yedidia of Ramat Gan (IL)

Fardau Van Neerden of Driel (NL)

Itai Norman of Tel Aviv (IL)

Multichannel Audio Speech Classification - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240312477, titled 'Multichannel Audio Speech Classification'.

Simplified Explanation

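The patent application describes systems and methods for classifying multichannel audio as speech. A processing device receives an audio signal with multiple channels, transcodes each channel to a predefined audio format, and calculates an average power value per channel over one or more data windows. It then correlates each channel's average power with the combined average power of the other channels and compares the result against a threshold to determine whether the signal is a speech-based communication.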

Key Features and Innovation
  • Receiving an audio signal with multiple channels
  • Transcoding each channel to a predefined audio format
  • Calculating average power values for each channel
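  • Determining a correlation value between each channel's average power and the combined average power of the other channels
  • Classifying the audio signal as speech-based communication by comparing the correlation values (or an aggregated correlation value) against a threshold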
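
The power-calculation step in the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not the method claimed in the application: it assumes each channel has already been transcoded and decoded into a list of floating-point samples, that data windows are non-overlapping and of fixed length, and that "average power" means mean squared amplitude. The names `window_average_power`, `per_channel_power`, and `window_size` are hypothetical.

```python
from typing import List, Sequence


def window_average_power(channel: Sequence[float], window_size: int) -> List[float]:
    """Return the mean squared amplitude of each full, non-overlapping window.

    Assumption: trailing samples that do not fill a whole window are dropped.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    n_windows = len(channel) // window_size
    powers = []
    for w in range(n_windows):
        frame = channel[w * window_size:(w + 1) * window_size]
        powers.append(sum(x * x for x in frame) / window_size)
    return powers


def per_channel_power(audio: Sequence[Sequence[float]], window_size: int) -> List[List[float]]:
    """Apply window_average_power to every channel of a multichannel signal."""
    return [window_average_power(channel, window_size) for channel in audio]


# Toy two-channel signal with four samples per channel (illustrative data only).
toy_audio = [[1.0, 1.0, 2.0, 2.0],
             [0.0, 0.0, 3.0, 3.0]]
toy_power = per_channel_power(toy_audio, window_size=2)
```

Each row of `toy_power` holds one channel's power per window, which is the form a later correlation step would consume.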

Potential Applications

This technology can be used in speech recognition systems, audio surveillance, voice-controlled devices, and audio content analysis.

Problems Solved

This technology helps classify multichannel audio signals as speech-based communication, a step that applications such as speech recognition and audio content analysis depend on.

Benefits
  • Improved accuracy in classifying audio signals
  • Enhanced performance of speech recognition systems
  • Better analysis of audio content for various applications

Commercial Applications

Potential commercial applications include speech recognition software, audio surveillance systems, voice-controlled devices, and audio content analysis tools. This technology can be valuable in industries such as telecommunications, security, and entertainment.

Prior Art

Prior art related to this technology may include research on audio signal processing, speech recognition systems, and audio content analysis methods.

Frequently Updated Research

Researchers are constantly working on improving audio signal processing techniques, speech recognition algorithms, and audio content analysis methods. Stay updated on the latest advancements in these fields for potential improvements in multichannel audio speech classification technology.

Questions about Multichannel Audio Speech Classification

1. How does multichannel audio speech classification differ from single-channel audio speech classification?
2. What are the key challenges in accurately classifying multichannel audio signals as speech-based communication?


Original Abstract Submitted

Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
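
The correlation and threshold steps of the abstract can be sketched as follows. The abstract does not name a correlation measure, a threshold value, or the direction of the comparison, so everything specific here is an assumption for illustration: the sketch uses the Pearson coefficient, sums the other channels to form the "combined" power, and treats a correlation below a hypothetical threshold of -0.5 as speech-based communication, on the reasoning that speakers on separate channels tend to take turns. The per-window power values are assumed to be already available, and all function and variable names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("series must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; this sketch
        # treats it as 0 (an assumption, not taken from the abstract).
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def channel_correlations(power: Sequence[Sequence[float]]) -> List[float]:
    """Correlate each channel's power with the summed power of the others.

    power[c][w] is the average power of channel c in data window w.
    """
    if not power:
        raise ValueError("at least one channel is required")
    n_windows = len(power[0])
    correlations = []
    for c, own in enumerate(power):
        others = [
            sum(power[k][w] for k in range(len(power)) if k != c)
            for w in range(n_windows)
        ]
        correlations.append(pearson(own, others))
    return correlations


def is_speech_communication(power: Sequence[Sequence[float]],
                            threshold: float = -0.5,
                            aggregate: bool = True) -> bool:
    """Compare the mean correlation (or every correlation) to the threshold.

    Assumption: values BELOW the threshold indicate turn-taking speech.
    """
    corrs = channel_correlations(power)
    if aggregate:
        return sum(corrs) / len(corrs) < threshold
    return all(c < threshold for c in corrs)


# Illustrative per-window power values, not real measurements.
turn_taking = [[9.0, 9.0, 0.0, 0.0, 9.0, 0.0],
               [0.0, 0.0, 9.0, 9.0, 0.0, 9.0]]
same_content = [[1.0, 4.0, 2.0, 8.0],
                [1.0, 4.0, 2.0, 8.0]]
```

Because the Pearson coefficient is unaffected by scaling, summing or averaging the other channels gives the same correlation. With the toy data, the alternating channels in `turn_taking` are classified as speech-based communication under these assumptions, while the identical channels in `same_content` are not.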