Robert Bosch GmbH (20250022296). KNOWLEDGE-DRIVEN SCENE PRIORS FOR SEMANTIC AUDIO-VISUAL EMBODIED NAVIGATION
Organization Name
Inventor(s)
Jonathan Francis of Pittsburgh PA US
Luca Bondi of Pittsburgh PA US
Ingrid Navarro of Pittsburgh PA US
This abstract first appeared for US patent application 20250022296 titled 'KNOWLEDGE-DRIVEN SCENE PRIORS FOR SEMANTIC AUDIO-VISUAL EMBODIED NAVIGATION'.
Original Abstract Submitted
A method of controlling navigation of a device in an environment using machine learning (ML) models includes: receiving visual and audio observation data of the environment as sensed by the device; determining classification scores for objects and regions in the environment based on the visual and audio observation data; encoding visual information based on the classification scores; determining audio-semantic feature embeddings based at least in part on the classification scores, the audio-semantic feature embeddings indicating spatial relationships between objects in the environment, between regions in the environment, and between objects and regions in the environment; and determining and outputting, based on the encoded visual information and the audio-semantic feature embeddings, a state representation corresponding to a state of the device within the environment.
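The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say how scores are computed, how the embeddings are formed, or how the state is assembled. The softmax scoring, the linear visual projection, the row-normalized prior graph over objects and regions, the concatenation step, and every name and number below (`state_representation`, `LABELS`, `PRIOR`, `VISUAL_WEIGHTS`) are assumptions chosen only to make the steps concrete.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def softmax(logits: Vector) -> Vector:
    """Turn raw logits into classification scores that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def normalize_rows(matrix: Matrix) -> Matrix:
    """Scale each row to sum to 1 so propagation averages over neighbors."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([w / total for w in row] if total > 0 else list(row))
    return result


def state_representation(
    visual_logits: Vector,
    audio_logits: Vector,
    prior_adjacency: Matrix,
    visual_weights: Matrix,
) -> Vector:
    # Step 1: classification scores for objects and regions
    # (assumption: one shared label set, scored by softmax).
    visual_scores = softmax(visual_logits)
    audio_scores = softmax(audio_logits)
    # Step 2: encode visual information from the scores
    # (assumption: a fixed linear projection).
    visual_encoding = matvec(visual_weights, visual_scores)
    # Step 3: audio-semantic features that reflect object/region relations
    # (assumption: one propagation step over a hand-written prior graph).
    audio_semantic = matvec(normalize_rows(prior_adjacency), audio_scores)
    # Step 4: state representation (assumption: plain concatenation).
    return visual_encoding + audio_semantic


# Hypothetical label set: two objects followed by two regions.
LABELS = ["chair", "sink", "kitchen", "living_room"]
# Hypothetical prior: sinks relate to kitchens, chairs to living rooms.
PRIOR = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
]
# Hypothetical projection to a 2-dimensional visual encoding.
VISUAL_WEIGHTS = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

state = state_representation(
    visual_logits=[2.0, 0.1, 0.1, 1.0],
    audio_logits=[0.0, 3.0, 0.5, 0.0],
    prior_adjacency=PRIOR,
    visual_weights=VISUAL_WEIGHTS,
)
```

In this toy setup the audio evidence favors "sink", so the propagated feature for "kitchen" ends up larger than the one for "living_room". That shows how a prior over object-region relationships could bias the state toward the region where the sound source is likely to be.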