ATTENTIVE SCORING FUNCTION FOR SPEAKER IDENTIFICATION

Organization Name

GOOGLE LLC

Inventor(s)

Ignacio Lopez Moreno of New York NY (US)

Quan Wang of Hoboken NJ (US)

Jason Pelecanos of Mountain View CA (US)

Yiling Huang of Mountain View CA (US)

Mert Saglam of Mountain View CA (US)

ATTENTIVE SCORING FUNCTION FOR SPEAKER IDENTIFICATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240029742 titled 'ATTENTIVE SCORING FUNCTION FOR SPEAKER IDENTIFICATION'.

Simplified Explanation

The abstract describes a speaker verification method that operates on audio data. The method processes audio data corresponding to an utterance to generate an attentive d-vector (ad-vector), called the evaluation ad-vector, that represents the voice characteristics of the utterance. The evaluation ad-vector comprises a number of style classes, each consisting of a value vector concatenated with a corresponding routing vector. A self-attention mechanism then generates at least one multi-condition attention score, which indicates the likelihood that the evaluation ad-vector matches a reference ad-vector associated with a particular user. Based on that score, the speaker of the utterance is identified as the user associated with the matching reference ad-vector.

The method receives audio data of an utterance.
The audio data is processed to generate an evaluation attentive d-vector (ad-vector) representing the voice characteristics of the utterance.
The evaluation ad-vector includes multiple style classes, each a value vector concatenated with a corresponding routing vector.
Multi-condition attention scores are generated using a self-attention mechanism.
The multi-condition attention scores indicate the likelihood of matching between evaluation and reference ad-vectors.
The speaker of the utterance is identified based on the multi-condition attention score.
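
The steps above can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: routing vectors are treated as queries and keys in scaled dot-product attention, the softmax is taken over all pairs of style classes, the value vectors are compared by cosine similarity, and the names split_ad_vector, attention_score, identify_speaker and the threshold parameter are hypothetical. It is not the patented implementation.

```python
import math

def split_ad_vector(ad_vector, value_dim):
    """Split each style class (one row) into its value and routing parts."""
    values = [row[:value_dim] for row in ad_vector]
    routing = [row[value_dim:] for row in ad_vector]
    return values, routing

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Assumes non-zero value vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def attention_score(eval_ad, ref_ad, value_dim):
    """Attention-weighted similarity between two ad-vectors (assumed formula)."""
    eval_values, eval_routing = split_ad_vector(eval_ad, value_dim)
    ref_values, ref_routing = split_ad_vector(ref_ad, value_dim)
    scale = math.sqrt(len(eval_routing[0]))
    # Scaled dot-product logit for every (evaluation class, reference class) pair.
    logits = [[dot(q, k) / scale for k in ref_routing] for q in eval_routing]
    # Softmax over all pairs (an assumption), so the weights sum to 1.
    peak = max(max(row) for row in logits)
    exps = [[math.exp(x - peak) for x in row] for row in logits]
    total = sum(sum(row) for row in exps)
    # Weighted average of cosine similarities between value vectors.
    return sum(
        exps[i][j] / total * cosine(eval_values[i], ref_values[j])
        for i in range(len(eval_values))
        for j in range(len(ref_values))
    )

def identify_speaker(eval_ad, enrolled, value_dim, threshold=0.5):
    """Return the best-scoring enrolled user, or None if below the threshold."""
    scores = {
        user: attention_score(eval_ad, ref_ad, value_dim)
        for user, ref_ad in enrolled.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores

# Toy data: 2 style classes, each a 2-dim value vector + 2-dim routing vector.
alice = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 2.0]]
bob = [[-1.0, 0.0, 2.0, 0.0], [0.0, -1.0, 0.0, 2.0]]
user, scores = identify_speaker(alice, {"alice": alice, "bob": bob}, value_dim=2)
print(user)  # prints "alice"
```

Because the attention weights sum to 1 and each cosine similarity lies in [-1, 1], the score in this sketch also lies in [-1, 1], which makes a fixed acceptance threshold easy to reason about. A real system would learn the ad-vectors with a neural network and tune the threshold on held-out data.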

Potential Applications:

Speaker verification/authentication systems
Voice-controlled devices and virtual assistants
Call center authentication and fraud detection

Problems Solved:

Ensures accurate speaker verification by analyzing voice characteristics
Improves security and prevents unauthorized access to systems
Reduces the risk of identity theft and fraud

Benefits:

Enhanced user experience with voice-controlled devices
Increased security and protection against unauthorized access
Improved accuracy and reliability of speaker verification systems

Original Abstract Submitted

A speaker verification method includes receiving audio data corresponding to an utterance, processing the audio data to generate a reference attentive d-vector representing voice characteristics of the utterance, the evaluation ad-vector includes n style classes each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.