LG ELECTRONICS INC. (20240347065). ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF simplified abstract


ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF

Organization Name

LG ELECTRONICS INC.

Inventor(s)

Anith Selvakumarasingam of Oshawa (CA)

Homa Fashandi of Toronto (CA)

ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240347065 titled 'ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF'.

Simplified Explanation:

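The patent application describes a method for controlling an AI device that verifies a user's identity. A neural network turns video and audio samples of the user into embeddings, combines them into a single audio-visual embedding, and matches that embedding against the embeddings of pre-enrolled users.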

Key Features and Innovation:
  • Obtaining video and audio samples of a user
  • Generating visual and audio embeddings using a neural network
  • Creating an audio-visual embedding based on a combination of the visual and audio embeddings
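  • Verifying the user by finding the closest pre-enrolled audio-visual embedding, measured by distance within a joint audio-visual subspace
  • Training the neural network using a loss function computed over multiple audio-visual embeddings, each including an audio component and a visual component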
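
The fusion and verification steps listed above can be illustrated with a minimal sketch. It is not the applicant's implementation: the abstract does not say how the embeddings are combined or which distance is used, so concatenation with L2 normalization, Euclidean distance, the acceptance threshold, and the names `fuse` and `verify` are all assumptions made for illustration. The embeddings here are hand-written toy vectors standing in for neural network outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse(visual, audio):
    """Hypothetical fusion: concatenate the normalized visual and audio
    embeddings, then renormalize the joint vector (an assumption; the
    abstract only says 'a combination')."""
    return l2_normalize(l2_normalize(visual) + l2_normalize(audio))


def verify(query, enrolled, threshold=0.5):
    """Return the id of the nearest pre-enrolled embedding, or None when
    even the nearest one is farther away than the (assumed) threshold."""
    best_user, best_dist = None, float("inf")
    for user_id, embedding in enrolled.items():
        dist = math.dist(query, embedding)  # Euclidean distance
        if dist < best_dist:
            best_user, best_dist = user_id, dist
    return best_user if best_dist <= threshold else None


# Toy enrollment data: two users with 2-D visual and 2-D audio embeddings.
enrolled = {
    "alice": fuse([1.0, 0.0], [0.0, 1.0]),
    "bob": fuse([0.0, 1.0], [1.0, 0.0]),
}

# A new sample that is close to alice in both modalities.
query = fuse([0.9, 0.1], [0.1, 0.9])
print(verify(query, enrolled))  # prints: alice
```

The threshold is what lets the sketch reject someone who is not enrolled: a query that is far from every stored embedding returns `None` instead of the nearest user.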

Potential Applications:

This technology can be used for secure user verification in AI devices such as voice assistants and smart home systems, and to restrict personalized features such as content recommendations to the verified user.

Problems Solved:

The technology addresses the need for accurate and secure user verification in AI devices, ensuring that only authorized users can access sensitive information or personalized features.

Benefits:
  • Enhanced security through multi-modal user verification
  • Personalized user experience based on individual audio and visual characteristics
  • Improved user privacy by preventing unauthorized access to AI devices

Commercial Applications:

"Multi-Modal User Verification Technology for AI Devices" - This technology can be implemented in various industries, including smart home automation, healthcare, and financial services, to enhance security and user experience.

Questions about Multi-Modal User Verification Technology:

1. How does this technology improve user privacy and security in AI devices?
2. What are the potential challenges in implementing this technology in real-world applications?

Frequently Updated Research:

Researchers are continually exploring ways to improve the accuracy and efficiency of multi-modal user verification systems, incorporating advanced machine learning techniques and biometric authentication methods.


Original Abstract Submitted

A method for controlling an artificial intelligence (AI) device can include obtaining a video sample of a user and an audio sample of the user, generating, via a neural network, a visual embedding based on the video sample and an audio embedding based on the audio sample, the visual embedding and the audio embedding being multi-dimensional vectors, generating, via the neural network, an audio-visual embedding based on a combination of the visual and audio embeddings. The method can further include determining a specific pre-enrolled audio-visual embedding from among pre-enrolled audio-visual embeddings corresponding to pre-enrolled users based on a distance away from the audio-visual embedding within a joint audio-visual subspace and verifying the user as the specific pre-enrolled user. Also, the neural network can be trained based on a loss function that uses a plurality of audio-visual embeddings, each including an audio component and a visual component.
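
The abstract does not name the loss function, only that it uses several audio-visual embeddings at once. One common choice for distance-based verification is a triplet margin loss, sketched below purely as an assumption to show what such a loss could look like. The function name `triplet_loss`, the margin value, and the toy vectors are hypothetical; each vector is laid out as a visual component followed by an audio component.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet margin loss over joint audio-visual embeddings (assumed form).

    anchor and positive come from the same person, negative from a
    different person. The loss is zero once the negative is at least
    `margin` farther from the anchor than the positive is.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Each embedding is [visual_0, visual_1, audio_0, audio_1] (toy values).
same_person_a = [1.0, 0.0, 0.0, 1.0]
same_person_b = [1.0, 0.0, 0.0, 1.0]
other_person = [0.0, 1.0, 1.0, 0.0]

# Well-separated triplet: no penalty.
easy = triplet_loss(same_person_a, same_person_b, other_person)

# Swapping positive and negative simulates a badly trained encoder.
hard = triplet_loss(same_person_a, other_person, same_person_b)
print(easy, hard)
```

Minimizing a loss of this kind pulls embeddings of the same person together and pushes different people apart, which is what makes the nearest-embedding lookup described in the abstract meaningful.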