LG Electronics Inc. (20240347065). ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF simplified abstract

From WikiPatents
Revision as of 02:20, 18 October 2024 by Wikipatents (talk | contribs) (Creating a new page)

ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF

Organization Name

LG Electronics Inc.

Inventor(s)

Anith Selvakumarasingam of Oshawa (CA)

Homa Fashandi of Toronto (CA)

ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240347065 titled 'ARTIFICIAL INTELLIGENCE DEVICE FOR ROBUST MULTIMODAL ENCODER FOR PERSON REPRESENTATIONS AND CONTROL METHOD THEREOF'.

Simplified Explanation

The patent application describes a method for controlling an artificial intelligence device using video and audio samples of a user to generate audio-visual embeddings for user verification.

Key Features and Innovation

  • Obtaining video and audio samples of a user
  • Generating visual and audio embeddings using a neural network
  • Creating audio-visual embeddings based on a combination of visual and audio embeddings
  • Verifying the user by comparing the generated embedding with pre-enrolled embeddings
  • Training the neural network using a loss function with audio-visual embeddings
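
The verification steps listed above can be illustrated with a minimal sketch. It is not the patent's implementation: the neural encoders are left out and replaced by toy input vectors, and the fusion rule (normalize, concatenate, renormalize), the Euclidean distance, the `threshold` value, and all function names are assumptions chosen for illustration only.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(visual, audio):
    """Hypothetical fusion: normalize each modality, concatenate, renormalize."""
    return l2_normalize(l2_normalize(visual) + l2_normalize(audio))

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(query, enrolled, threshold=0.5):
    """Return the ID of the closest enrolled embedding, or None if it is too far away."""
    best_id, best_dist = None, float("inf")
    for user_id, emb in enrolled.items():
        dist = euclidean(query, emb)
        if dist < best_dist:
            best_id, best_dist = user_id, dist
    return best_id if best_dist <= threshold else None

# Toy 2-D vectors stand in for the visual and audio encoder outputs.
enrolled = {
    "alice": fuse([1.0, 0.0], [0.0, 1.0]),
    "bob": fuse([0.0, 1.0], [1.0, 0.0]),
}
query = fuse([0.9, 0.1], [0.1, 0.9])
match = verify(query, enrolled)  # closest enrolled user within the threshold
```

In this sketch a query is accepted only when its nearest enrolled embedding lies within `threshold`; a query far from every enrolled user returns `None`. The threshold in a real system would be tuned on validation data.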

Potential Applications

This technology can be used for secure user authentication, personalized user experiences, and enhanced human-computer interactions.

Problems Solved

The technology addresses the need for reliable user verification in AI devices, as well as the desire for more personalized and interactive AI experiences.

Benefits

The benefits include improved security, enhanced user experiences, and more efficient human-AI interactions.

Commercial Applications

"AI User Verification and Personalization Technology for Enhanced User Experiences"

Questions about AI

1. How does this technology improve user verification in AI devices?
2. What are the potential applications of audio-visual embeddings in AI technology?

Frequently Updated Research

Stay updated on advancements in neural network training methods for audio-visual embeddings and user verification in AI devices.


Original Abstract Submitted

A method for controlling an artificial intelligence (AI) device can include obtaining a video sample of a user and an audio sample of the user, generating, via a neural network, a visual embedding based on the video sample and an audio embedding based on the audio sample, the visual embedding and the audio embedding being multi-dimensional vectors, generating, via the neural network, an audio-visual embedding based on a combination of the visual and audio embeddings. The method can further include determining a specific pre-enrolled audio-visual embedding from among pre-enrolled audio-visual embeddings corresponding to pre-enrolled users based on a distance away from the audio-visual embedding within a joint audio-visual subspace and verifying the user as the specific pre-enrolled user. Also, the neural network can be trained based on a loss function that uses a plurality of audio-visual embeddings, each including an audio component and a visual component.
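
The abstract does not name the loss function used for training. As a rough illustration of a loss computed over several audio-visual embeddings, the sketch below uses a triplet margin loss, a common choice for embedding training that is swapped in here as an assumption and is not taken from the application. The `margin` value, the concatenation of visual and audio components, and the toy vectors are likewise hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on (distance to same person) - (distance to other person) + margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Each audio-visual embedding is a visual component followed by an audio component.
anchor = [0.9, 0.1] + [0.8, 0.2]     # person A, sample 1
positive = [0.8, 0.2] + [0.9, 0.1]   # person A, sample 2
negative = [0.1, 0.9] + [0.2, 0.8]   # person B

loss_good = triplet_loss(anchor, positive, negative)  # same person is already closer
loss_bad = triplet_loss(anchor, negative, positive)   # roles swapped: large penalty
```

The loss is zero when the same-person pair is closer than the different-person pair by at least the margin, and grows as that ordering is violated, which pushes embeddings of one person together in the joint space.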