Google LLC (20250006217). Automatic Speech Recognition Accuracy With Multimodal Embeddings Search

From WikiPatents

Organization Name

Google LLC

Inventor(s)

Christopher Li of New York NY US

Kyle Scott Kastner of Waltham MA US

Yuan Wang of Hoboken NJ US

Zhehuai Chen of Edgewater NJ US

Andrew Maxwell Rosenberg of Brooklyn NY US

Heng Su of Beijing CN

Qian Chen of Beijing CN

Leonid Aleksandrovich Velikovich of New York NY US

Patrick Maxim Rondon of New York NY US

Diamantino Antonio Caseiro of Philadelphia PA US

Zelin Wu of Jersey City NJ US

This abstract first appeared for US patent application 20250006217, titled 'Automatic Speech Recognition Accuracy With Multimodal Embeddings Search'.

Original Abstract Submitted

A method includes receiving training data that includes a set of transcribed speech utterances, where each respective transcribed speech utterance is paired with a corresponding transcription. For each respective transcribed speech utterance, the method includes generating an encoded audio representation and an encoded textual representation, generating a higher order audio feature representation for a corresponding encoded audio representation, generating a higher order textual feature representation for a corresponding encoded textual representation, and determining a loss for the respective transcribed speech utterance based on the higher order audio feature representation and the higher order textual feature representation. The method also includes training a speech encoder and a text encoder of a correction model based on the loss determined for each transcribed speech utterance of the set of transcribed speech utterances.
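
The abstract does not say which loss is used, so the following is only a minimal sketch of one plausible choice: a contrastive (InfoNCE-style) loss that scores each utterance's higher order audio feature against the higher order textual features of every transcription in the batch. The function name `per_utterance_losses`, the `temperature` parameter, and the toy two-dimensional feature vectors are all assumptions made for illustration and do not come from the patent application. The encoders themselves and the gradient update are omitted; in practice an autodiff framework would backpropagate these losses into the speech and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def per_utterance_losses(audio_feats, text_feats, temperature=0.1):
    """Assumed contrastive loss: audio feature i should match text feature i
    more closely than any other text feature in the batch."""
    audio = [l2_normalize(v) for v in audio_feats]
    text = [l2_normalize(v) for v in text_feats]
    losses = []
    for i, a in enumerate(audio):
        # Cosine similarity of this utterance's audio with every transcription.
        logits = [dot(a, t) / temperature for t in text]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_norm = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        # Cross-entropy with the paired transcription as the target.
        losses.append(log_norm - logits[i])
    return losses


# Hypothetical higher order features for three utterance/transcription pairs.
audio_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matched_text = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.7]]
# The same text features, deliberately paired with the wrong utterances.
shuffled_text = [matched_text[1], matched_text[2], matched_text[0]]

matched = per_utterance_losses(audio_feats, matched_text)
shuffled = per_utterance_losses(audio_feats, shuffled_text)
print("total loss, correct pairing:", sum(matched))
print("total loss, wrong pairing:  ", sum(shuffled))
```

With these toy values, the correctly paired batch yields a lower total loss than the shuffled one, which is the signal a training loop would use to pull the audio and text representations of the same utterance together.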
