Samsung Electronics Co., Ltd. (20240119943). APPARATUS FOR IMPLEMENTING SPEAKER DIARIZATION MODEL, METHOD OF SPEAKER DIARIZATION, AND PORTABLE TERMINAL INCLUDING THE APPARATUS simplified abstract

From WikiPatents

APPARATUS FOR IMPLEMENTING SPEAKER DIARIZATION MODEL, METHOD OF SPEAKER DIARIZATION, AND PORTABLE TERMINAL INCLUDING THE APPARATUS

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Wan Ju Kang of Daejeon (KR)

Sungju Lee of Daejeon (KR)

Ryuhaerang Choi of Daejeon (KR)

APPARATUS FOR IMPLEMENTING SPEAKER DIARIZATION MODEL, METHOD OF SPEAKER DIARIZATION, AND PORTABLE TERMINAL INCLUDING THE APPARATUS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240119943, titled 'APPARATUS FOR IMPLEMENTING SPEAKER DIARIZATION MODEL, METHOD OF SPEAKER DIARIZATION, AND PORTABLE TERMINAL INCLUDING THE APPARATUS'.

Simplified Explanation

The apparatus described in the abstract implements a speaker diarization model: it separates multiple speakers from each other using both their voice signals and their motion sensing signals. Here is a simplified explanation of the patent application:

  • A voice signal analysis module generates mel-spectrogram data from the voice signals of multiple speakers, as detected by a voice recognition device.
  • A motion data analysis module generates ultra-wideband (UWB) signal matrix data from the motion sensing signals of the same speakers, as detected by a motion recognition device.
  • A multimodal learning module extracts characteristic values from the mel-spectrogram data and the UWB signal matrix data.
  • A speaker diarization module separates the speakers from each other using the characteristic values.
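
The data flow through the last two modules can be illustrated with a minimal Python sketch. The abstract does not say how characteristic values are extracted or how speakers are separated, so everything here is an assumption made for illustration: per-frame feature concatenation stands in for the multimodal learning module, k-means clustering stands in for the speaker diarization module, and the function names (`zscore_columns`, `fuse`, `farthest_first`, `kmeans`, `toy_frames`) and the synthetic data are hypothetical.

```python
import math
import random

def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def fuse(mel_frames, uwb_frames):
    """Assumed fusion rule: concatenate audio and motion features per frame."""
    if len(mel_frames) != len(uwb_frames):
        raise ValueError("both modalities need the same number of frames")
    mel = zscore_columns(mel_frames)
    uwb = zscore_columns(uwb_frames)
    return [m + u for m, u in zip(mel, uwb)]

def farthest_first(points, k):
    """Deterministic initialization: repeatedly pick the point farthest from the chosen centers."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(points, k, n_iter=20):
    """Plain k-means; returns one cluster (speaker) label per frame."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every frame to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Move each center to the mean of the frames assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

def toy_frames(n, dim, mean, rng):
    """Synthetic stand-in for real mel-spectrogram or UWB matrix rows."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]

# Two synthetic "speakers", 30 frames each, with clearly different features.
rng = random.Random(1)
mel = toy_frames(30, 8, 0.0, rng) + toy_frames(30, 8, 10.0, rng)
uwb = toy_frames(30, 4, 0.0, rng) + toy_frames(30, 4, 10.0, rng)
labels = kmeans(fuse(mel, uwb), k=2)
print(labels)
```

Each modality is standardized before concatenation so that neither dominates the distance computation. A learned multimodal model, as the abstract suggests, would replace the `fuse` step.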

Potential Applications

This technology can be used in conference call systems, surveillance systems, and voice-controlled devices to accurately identify and separate speakers.

Problems Solved

This technology addresses the problem of accurately identifying and separating multiple speakers by combining audio data with motion sensing data.

Benefits

The benefits of this technology include improved accuracy in speaker identification, enhanced user experience in voice-controlled devices, and better organization of audio data in surveillance systems.

Potential Commercial Applications

A potential commercial application of this technology could be in the development of advanced conference call systems for businesses.

Possible Prior Art

Possible prior art includes earlier speaker diarization systems that use machine learning algorithms to separate speakers in audio recordings.

Unanswered Questions

How does this technology handle background noise in speaker diarization?

The abstract does not mention how the apparatus deals with background noise that may affect speaker separation accuracy.

Can this technology be integrated with existing voice recognition systems?

It is not clear from the abstract whether this speaker diarization model can be easily integrated with current voice recognition technologies.


Original Abstract Submitted

A speaker diarization model implementing apparatus includes a voice signal analysis module configured to generate mel-spectrogram data from voice signals of a plurality of speakers detected by a voice recognition device, a motion data analysis module configured to generate ultra-wideband (UWB) signal matrix data from motion sensing signals of the plurality of speakers detected by a motion recognition device, a multimodal learning module configured to extract characteristic values based on the mel-spectrogram data and the UWB signal matrix data, and a speaker diarization module configured to separate the plurality of speakers from each other using the characteristic values.
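
The abstract does not describe how the mel-spectrogram data is computed. As general background only, a mel-spectrogram groups frequency content into bands that are evenly spaced on the mel scale rather than in hertz. The sketch below uses one widely used mel-scale formula, m = 2595 · log10(1 + f / 700). The helper names (`hz_to_mel`, `mel_to_hz`, `mel_band_edges`) and the 0 to 8000 Hz range are assumptions for illustration, not details from the patent application.

```python
import math

def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (a common form of the mel scale)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 edge frequencies in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]

# Assumed example range: 10 bands between 0 Hz and 8000 Hz.
edges = mel_band_edges(0.0, 8000.0, 10)
print([round(e, 1) for e in edges])
```

The band edges are closely spaced at low frequencies and spread out at high frequencies, which is the property that distinguishes a mel-spectrogram from a linear-frequency spectrogram.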