Google LLC (20250006184). MULTIMODAL INTENT UNDERSTANDING FOR AUTOMATED ASSISTANT
Organization Name
Google LLC
Inventor(s)
Matthew Sharifi of Kilchberg CH
This abstract first appeared for US patent application 20250006184, titled 'MULTIMODAL INTENT UNDERSTANDING FOR AUTOMATED ASSISTANT'.
Original Abstract Submitted
Implementations described herein include detecting a stream of audio data that captures a spoken utterance of a user and that captures ambient noise occurring within a threshold time period of the spoken utterance being spoken by the user. Implementations further include processing a portion of the audio data that includes the ambient noise to determine ambient noise classification(s), processing a portion of the audio data that includes the spoken utterance to generate a transcription, processing both the transcription and the ambient noise classification(s) with a machine learning model to generate a user intent and parameter(s) for the user intent, and performing one or more automated assistant actions based on the user intent and using the parameter(s).
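The flow in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the method claimed in the application: every name here (`Segment`, `Intent`, `ambient_near_speech`, `classify_ambient`, `transcribe`, `infer_intent`, `perform_action`, `run_assistant`) is hypothetical, the 5-second threshold is an arbitrary assumption, and the classifier, speech recognizer, and machine learning model are replaced by trivial rule-based stand-ins so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start_s: float
    end_s: float
    kind: str     # "speech" or "ambient" (assumed to be pre-segmented)
    payload: str  # stand-in for raw audio samples


@dataclass
class Intent:
    name: str
    parameters: dict = field(default_factory=dict)


def ambient_near_speech(segments, threshold_s):
    """Keep ambient segments whose time gap to any speech segment is <= threshold_s."""
    speech = [s for s in segments if s.kind == "speech"]
    nearby = []
    for seg in segments:
        if seg.kind != "ambient":
            continue
        for sp in speech:
            # Gap is 0 when the two segments overlap in time.
            gap = max(sp.start_s - seg.end_s, seg.start_s - sp.end_s, 0.0)
            if gap <= threshold_s:
                nearby.append(seg)
                break
    return nearby


def classify_ambient(segment):
    # Hypothetical stand-in for an ambient noise classifier: the payload
    # already holds the label, so it is returned directly.
    return [segment.payload]


def transcribe(segment):
    # Hypothetical stand-in for speech recognition: the payload is the text.
    return segment.payload


def infer_intent(transcription, noise_classes):
    # Hypothetical stand-in for the machine learning model: a few hard-coded
    # rules that consume BOTH the transcription and the noise classifications.
    if "turn it off" in transcription:
        if "alarm" in noise_classes:
            return Intent("stop_alarm", {"device": "alarm"})
        if "music" in noise_classes:
            return Intent("pause_media", {"device": "media_player"})
    return Intent("unknown", {"query": transcription})


def perform_action(intent):
    # Stand-in for an assistant action: just describe what would be done.
    return f"{intent.name}({intent.parameters})"


def run_assistant(segments, threshold_s=5.0):
    noise_classes = []
    for seg in ambient_near_speech(segments, threshold_s):
        noise_classes.extend(classify_ambient(seg))
    transcription = " ".join(transcribe(s) for s in segments if s.kind == "speech")
    return infer_intent(transcription, noise_classes)


stream = [
    Segment(0.0, 2.0, "ambient", "alarm"),
    Segment(3.0, 4.0, "speech", "turn it off"),
]
print(perform_action(run_assistant(stream)))
```

In this toy setup the same utterance, "turn it off", resolves to a different intent depending on whether an alarm or music was heard near the utterance, and ambient noise outside the threshold window is ignored. That is the kind of disambiguation the abstract describes, although a real system would use trained models rather than these rules.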