QUALCOMM Incorporated (20240419731). KNOWLEDGE-BASED AUDIO SCENE GRAPH

KNOWLEDGE-BASED AUDIO SCENE GRAPH

Organization Name

QUALCOMM Incorporated

Inventor(s)

Arvind Krishna Sridhar of San Diego CA (US)

Yinyi Guo of San Diego CA (US)

Erik Visser of San Diego CA (US)

This abstract first appeared for US patent application 20240419731, titled 'KNOWLEDGE-BASED AUDIO SCENE GRAPH'.

Original Abstract Submitted

A device includes a processor configured to obtain a first audio embedding of a first audio segment and to obtain a first text embedding of a first tag assigned to the first audio segment. The first audio segment corresponds to a first audio event of a set of audio events. The processor is configured to obtain a first event representation based on a combination of the first audio embedding and the first text embedding. The processor is configured to obtain a second event representation of a second audio event of the audio events. The processor is also configured to determine, based on knowledge data, relations between the audio events. The processor is configured to construct an audio scene graph based on a temporal order of the audio events. The audio scene graph is constructed to include a first node corresponding to the first audio event and a second node corresponding to the second audio event.
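
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation claimed in the application. Every name here (AudioEvent, AudioSceneGraph, build_scene_graph, KNOWLEDGE) is hypothetical, and the embeddings are hand-written placeholder vectors rather than encoder outputs. The sketch also makes three assumptions the abstract does not state: the "combination" of the audio and text embeddings is modeled as plain concatenation, the knowledge data is modeled as a lookup table keyed by pairs of tags, and edges are drawn only between events that are adjacent in time.

```python
from dataclasses import dataclass, field


@dataclass
class AudioEvent:
    tag: str                      # text tag assigned to the audio segment
    start: float                  # segment start time in seconds
    audio_embedding: list[float]  # placeholder for an audio encoder's output
    text_embedding: list[float]   # placeholder for a text encoder's output

    def representation(self) -> list[float]:
        # Assumption: "combination" is modeled as plain concatenation.
        return self.audio_embedding + self.text_embedding


@dataclass
class AudioSceneGraph:
    nodes: list[dict] = field(default_factory=list)
    edges: list[tuple[int, int, str]] = field(default_factory=list)


def build_scene_graph(events, knowledge):
    """Order events by start time, create one node per event, and link
    consecutive events with a relation looked up in `knowledge`."""
    graph = AudioSceneGraph()
    ordered = sorted(events, key=lambda e: e.start)
    for event in ordered:
        graph.nodes.append({"tag": event.tag, "repr": event.representation()})
    for i in range(len(ordered) - 1):
        pair = (ordered[i].tag, ordered[i + 1].tag)
        # Fall back to a purely temporal relation when the knowledge
        # data has no entry for this pair of tags.
        relation = knowledge.get(pair, "followed_by")
        graph.edges.append((i, i + 1, relation))
    return graph


# Toy knowledge data (hypothetical): (earlier tag, later tag) -> relation.
KNOWLEDGE = {("door_knock", "dog_bark"): "triggers"}

events = [
    AudioEvent("dog_bark", 2.5, [0.3, 0.4], [0.9]),
    AudioEvent("door_knock", 1.0, [0.1, 0.2], [0.7]),
]
graph = build_scene_graph(events, KNOWLEDGE)
print([n["tag"] for n in graph.nodes])
print(graph.edges)
```

In this toy run the events are supplied out of order, so the sort by start time places the door knock in the first node and the dog bark in the second, and the single edge between them carries the relation found in the lookup table.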