18047609. SYSTEM AND METHOD FOR SPEAKER VERIFICATION FOR VOICE ASSISTANT simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

From WikiPatents

SYSTEM AND METHOD FOR SPEAKER VERIFICATION FOR VOICE ASSISTANT

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Myungjong Kim of Milpitas CA (US)

Taeyeon Ki of Milpitas CA (US)

Cindy Sushen Tseng of Santa Clara CA (US)

Srinivasa Rao Ponakala of Sunnyvale CA (US)

Vijendra Raj Apsingekar of San Jose CA (US)

SYSTEM AND METHOD FOR SPEAKER VERIFICATION FOR VOICE ASSISTANT - A simplified explanation of the abstract

This abstract first appeared for US patent application 18047609, titled 'SYSTEM AND METHOD FOR SPEAKER VERIFICATION FOR VOICE ASSISTANT'.

Simplified Explanation

The patent application describes a method for verifying a speaker from an utterance of a wake word or phrase. Here are the key points:

  • The method starts by obtaining audio data and identifying a wake word or phrase in the audio.
  • An embedding vector is generated based on the identified wake word or phrase.
  • A set of previously-generated vectors representing previous utterances of the wake word or phrase is accessed.
  • Clustering is performed on the embedding vector and the set of previously-generated vectors to identify the cluster that contains the embedding vector; that cluster is associated with a speaker.
  • A speaker vector associated with the speaker is updated based on the embedding vector.
  • A speaker verification model is used to determine a similarity score between the updated speaker vector and the embedding vector.
  • Based on the similarity score, it is determined whether the speaker providing the utterance matches the speaker associated with the identified cluster.
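
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the abstract does not name the clustering algorithm, the update rule, or the verification model, so the sketch substitutes single-linkage clustering on cosine similarity, a normalized cluster mean as the updated speaker vector, and plain cosine similarity in place of the speaker verification model. The function names and both thresholds are hypothetical, and the embedding vectors are assumed to come from some upstream wake-word embedding model.

```python
import numpy as np

def normalize(v):
    """Scale vectors to unit length along the last axis."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def cluster_labels(unit_vectors, link_threshold):
    """Single-linkage clustering (assumed technique): two vectors are linked
    when their cosine similarity is at least link_threshold, and every
    connected group of linked vectors forms one cluster."""
    sim = unit_vectors @ unit_vectors.T
    n = len(unit_vectors)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i, j] >= link_threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

def verify_wake_word(embedding, previous, link_threshold=0.8, accept_threshold=0.7):
    """Return (score, accepted) for a new wake-word embedding.

    Both thresholds are hypothetical values chosen for illustration.
    """
    # Stack the stored vectors with the new embedding as the last row.
    unit = normalize(np.vstack([previous, embedding]))
    labels = cluster_labels(unit, link_threshold)
    members = [i for i, lab in enumerate(labels) if lab == labels[-1]]
    if len(members) == 1:
        # Assumption: a cluster holding only the new embedding has no
        # known speaker attached, so there is nothing to verify against.
        return None, False
    # Assumed update rule: the speaker vector becomes the normalized mean
    # of the cluster, which now includes the new embedding.
    speaker_vector = normalize(unit[members].mean(axis=0))
    # Cosine similarity stands in for the speaker verification model.
    score = float(speaker_vector @ unit[-1])
    return score, score >= accept_threshold

# Toy 3-dimensional vectors; real embeddings would have far more dimensions.
previous = np.array([[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.0, 0.1, 1.0]])
score, accepted = verify_wake_word(np.array([1.0, 0.15, 0.0]), previous)
```

In this toy run the new embedding joins the cluster formed by the first two stored vectors and is accepted. Because the updated speaker vector already includes the new embedding, the score is biased upward; a practical system would need to choose its thresholds with that in mind.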

Potential Applications

This technology has potential applications in various fields, including:

  • Voice-controlled devices: The method can be used to accurately identify the speaker and personalize the user experience on voice-controlled devices such as smart speakers or virtual assistants.
  • Security systems: Speaker identification can enhance the security of access control systems by verifying the identity of individuals based on their voice.
  • Call center analytics: The method can be used to identify and track speakers in call center recordings, enabling better analysis and monitoring of customer interactions.

Problems Solved

The method addresses the following problems:

  • Speaker identification: By clustering embedding vectors of wake-word utterances, the method can associate a new utterance with a known speaker and keep that speaker's vector up to date.
  • Personalization: The method allows for personalized user experiences by associating specific speakers with their preferences or settings.
  • Security: By verifying the identity of speakers, the method enhances the security of voice-controlled devices and access control systems.

Benefits

The use of this technology offers several benefits:

  • Improved user experience: Personalized responses and tailored interactions can be provided to individual speakers, enhancing the overall user experience.
  • Enhanced security: By accurately identifying speakers, the method improves the security of voice-controlled devices and access control systems, preventing unauthorized access.
  • Efficient call center analytics: The method enables better analysis and monitoring of customer interactions in call centers, leading to improved customer service and support.


Original Abstract Submitted

A method includes obtaining audio data and identifying an utterance of a wake word or phrase in the audio data. The method also includes generating an embedding vector based on the utterance from the audio data and accessing a set of previously-generated vectors representing previous utterances of the wake word or phrase. The method further includes performing clustering on the embedding vector and the set of previously-generated vectors to identify a cluster including the embedding vector, where the identified cluster is associated with a speaker. The method also includes updating a speaker vector associated with the speaker based on the embedding vector and determining, using a speaker verification model, a similarity score between the updated speaker vector and the embedding vector. In addition, the method includes determining, based on the similarity score, whether a speaker providing the utterance matches the speaker associated with the identified cluster.