18046041. ONLINE SPEAKER DIARIZATION USING LOCAL AND GLOBAL CLUSTERING simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

From WikiPatents

ONLINE SPEAKER DIARIZATION USING LOCAL AND GLOBAL CLUSTERING

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Myungjong Kim of Milpitas CA (US)

Taeyeon Ki of Milpitas CA (US)

Vijendra Raj Apsingekar of San Jose CA (US)

Sungjae Park of Seoul (KR)

SeungBeom Ryu of Suwon (KR)

Hyuk Oh of Seoul (KR)

ONLINE SPEAKER DIARIZATION USING LOCAL AND GLOBAL CLUSTERING - A simplified explanation of the abstract

This abstract first appeared for US patent application 18046041, titled 'ONLINE SPEAKER DIARIZATION USING LOCAL AND GLOBAL CLUSTERING'.

Simplified Explanation

The patent application describes a method for online speaker diarization, i.e., determining who is speaking when in an audio stream containing speech activity. The method generates an embedding vector for each segment of the audio stream and clusters those vectors at two scales: within local windows, and within larger global windows that each span two or more local windows. Different clusters correspond to different speakers.

  • Obtaining at least a portion of an audio stream containing speech activity
  • Generating an embedding vector that represents each segment of the audio stream
  • Clustering the embedding vectors within each local window to perform speaker identification
  • Clustering the embedding vectors again within each global window, where each global window includes two or more local windows
  • Presenting sequences of speaker identities based on the local-window and global-window speaker identification
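The per-window clustering step can be illustrated with a minimal greedy sketch. This is a hypothetical illustration, not the patented algorithm: the abstract does not specify a particular clustering method, and the cosine-similarity threshold and running-centroid update here are assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_segments(embeddings, threshold=0.8):
    """Greedily assign each segment embedding to the most similar
    existing cluster (speaker), or start a new cluster if no cluster
    is similar enough. Returns one speaker label per segment."""
    centroids, counts, labels = [], [], []
    for emb in embeddings:
        best, best_sim = -1, -1.0
        for i, c in enumerate(centroids):
            s = cosine(emb, c)
            if s > best_sim:
                best, best_sim = i, s
        if best >= 0 and best_sim >= threshold:
            # Fold the segment into the matched cluster's running centroid.
            n = counts[best]
            centroids[best] = [(c * n + e) / (n + 1)
                               for c, e in zip(centroids[best], emb)]
            counts[best] += 1
            labels.append(best)
        else:
            # No sufficiently similar cluster: this segment starts a new speaker.
            centroids.append(list(emb))
            counts.append(1)
            labels.append(len(centroids) - 1)
    return labels
```

For example, four toy 2-D embeddings where the first, second, and fourth point in roughly the same direction would yield labels `[0, 0, 1, 0]`, i.e., two speakers.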

Potential applications of this technology:

  • Speaker identification in call centers or customer service applications
  • Voice recognition and authentication systems
  • Forensic analysis of audio recordings
  • Automatic transcription and captioning services

Problems solved by this technology:

  • Efficient and accurate speaker identification in audio streams with multiple speakers
  • Reducing the need for manual speaker identification and annotation
  • Handling variations in speech patterns and accents

Benefits of this technology:

  • Improved accuracy and reliability in speaker identification
  • Time-saving and cost-effective compared to manual identification methods
  • Scalable for large audio datasets
  • Can be integrated into existing speech processing systems


Original Abstract Submitted

A method includes obtaining at least a portion of an audio stream containing speech activity. At least the portion of the audio stream includes multiple segments. The method also includes, for each of the multiple segments, generating an embedding vector that represents the segment. The method further includes, within each of multiple local windows, clustering the embedding vectors into one or more clusters to perform speaker identification. Different clusters correspond to different speakers. The method also includes presenting at least one first sequence of speaker identities based on the speaker identification performed for the local windows. The method further includes, within each of multiple global windows, clustering the embedding vectors into one or more clusters to perform speaker identification. Each global window includes two or more local windows. In addition, the method includes presenting at least one second sequence of speaker identities based on the speaker identification performed for the global windows.
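The two-scale window structure described in the abstract, where each global window covers two or more local windows, can be sketched as a simple partition of segment indices. The window sizes here are hypothetical parameters; the abstract does not specify how windows are sized.

```python
def make_windows(num_segments, local_size, locals_per_global):
    """Partition segment indices 0..num_segments-1 into fixed-size local
    windows, then group consecutive local windows into global windows.
    Each global window therefore spans `locals_per_global` local windows
    (fewer at the tail end of the stream)."""
    local_windows = [list(range(i, min(i + local_size, num_segments)))
                     for i in range(0, num_segments, local_size)]
    global_windows = [sum(local_windows[i:i + locals_per_global], [])
                      for i in range(0, len(local_windows), locals_per_global)]
    return local_windows, global_windows
```

With 10 segments, local windows of 3 segments, and 2 local windows per global window, this yields local windows `[[0,1,2],[3,4,5],[6,7,8],[9]]` and global windows `[[0,1,2,3,4,5],[6,7,8,9]]`. Clustering within each local window gives a fast first sequence of speaker identities; re-clustering over each larger global window gives the second, refined sequence.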