FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE

Organization Name

Inventor(s)

FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240346730 titled 'FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE'.

Simplified Explanation

The patent application describes a method and device for generating a controllable talking face image. A source image, a series of driving images, and input audio are encoded into latent codes, which are fused and passed to a generative adversarial network (GAN) that produces the talking face image.

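Receiving a source image, a series of driving images sampled from the same video, and input audio
Encoding the source image and driving images into a visual space to obtain a source latent code and a driving latent code
Encoding the input audio to obtain an audio latent code
Mapping the source latent code to a canonical space to obtain a canonical code
Combining the driving and audio latent codes and mapping the result to a multimodal motion space to obtain a motion code
Combining the canonical code with the motion code into a multimodal fused latent code
Generating a talking face image by passing the fused latent code to a generative adversarial network (GAN)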
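
The steps above can be sketched as a simple data flow. The following is a minimal illustration only, not the applicant's implementation: the encoders and the generator are replaced by fixed random linear maps, images are treated as flat 16-value vectors, and the "combining" operations are assumed to be concatenation. The abstract does not specify architectures, dimensions, or the combination operator, so every name and size below is hypothetical.

```python
import random

# Hypothetical latent size; the abstract gives no dimensions.
LATENT_DIM = 8
IMAGE_DIM = 16   # a flattened toy "image"
AUDIO_DIM = 4    # a toy audio frame


def make_linear(in_dim, out_dim, seed):
    """Return a fixed random linear map, standing in for a trained network."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(vec):
        assert len(vec) == in_dim, "unexpected input size"
        return [sum(w * x for w, x in zip(row, vec)) for row in weights]

    return apply


# Stand-ins for the five components named in the abstract.
visual_encoder = make_linear(IMAGE_DIM, LATENT_DIM, seed=0)
audio_encoder = make_linear(AUDIO_DIM, LATENT_DIM, seed=1)
canonical_encoder = make_linear(LATENT_DIM, LATENT_DIM, seed=2)
motion_encoder = make_linear(2 * LATENT_DIM, LATENT_DIM, seed=3)
generator = make_linear(2 * LATENT_DIM, IMAGE_DIM, seed=4)  # placeholder for the GAN


def generate_talking_face(source_image, driving_images, audio_frames):
    """Produce one output frame per (driving image, audio frame) pair."""
    source_code = visual_encoder(source_image)
    canonical_code = canonical_encoder(source_code)

    frames = []
    for driving_image, audio in zip(driving_images, audio_frames):
        driving_code = visual_encoder(driving_image)
        audio_code = audio_encoder(audio)
        # Assumption: "combining" is list concatenation here.
        motion_code = motion_encoder(driving_code + audio_code)
        fused_code = canonical_code + motion_code
        frames.append(generator(fused_code))
    return frames


source = [0.5] * IMAGE_DIM
driving = [[0.1 * i] * IMAGE_DIM for i in range(3)]
audio = [[0.2 * i] * AUDIO_DIM for i in range(3)]
frames = generate_talking_face(source, driving, audio)
print(len(frames), len(frames[0]))  # prints: 3 16
```

In this sketch the canonical code is computed once from the source image, while the motion code is recomputed for every driving image and audio frame. That split is one way to read the abstract's separation of a canonical code from a motion code.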

Key Features and Innovation

Generation of controllable talking face images
Encoding of source and driving images into a visual space
Mapping of combined driving and audio latent codes to a multimodal motion code
Combination of canonical and motion codes for image generation

Potential Applications

This technology can be used in video conferencing, virtual reality, the entertainment industry, and online communication platforms.

Problems Solved

This technology addresses the need for realistic and controllable talking face images for various applications.

Benefits

Enhanced communication experiences
Realistic and controllable face image generation
Improved user engagement in virtual environments

Commercial Applications

Virtual reality applications
Video conferencing software
Entertainment industry for special effects

Prior Art

Prior research on facial image generation with deep learning techniques, including GAN-based approaches, may be relevant to this technology.

Frequently Updated Research

Research on improving the realism and controllability of generated face images using advanced neural network architectures is ongoing.

Questions about Face Image Generation

How does this technology improve virtual communication experiences?

This technology enhances virtual communication by providing realistic and controllable talking face images, improving user engagement and interaction.

What are the potential applications of this face image generation method?

The potential applications include video conferencing, virtual reality, the entertainment industry, and online communication platforms.

Original Abstract Submitted

Provided are face image generation method and device for generating a controllable talking face image. The method includes: a face image generation method for generating a controllable talking face image, the method comprising: receiving a source image and a series of driving images, sampled from the same video, and input audio; acquiring a style latent code including a source latent code and a driving latent code by encoding the source image and the series of driving images into a visual space by a visual encoder; acquiring an audio feature including an audio latent code by encoding the input audio by an audio encoder; acquiring a canonical code by mapping the source latent code to a canonical space by a canonical encoder; acquiring a motion code by combining the driving latent code with the audio latent code, and mapping the combined code to a multimodal motion space by a multimodal motion encoder; acquiring a multimodal fused latent code by combining the canonical code with the motion code; and generating a talking face image by transferring the multimodal fused latent code to a generative adversarial network (GAN).
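
The chain of operations in the abstract can be written compactly as follows. The notation is introduced here for illustration and does not appear in the application: $I_s$ is the source image, $I_d^{(t)}$ and $A^{(t)}$ are the driving image and audio input at step $t$, $E_v$, $E_a$, $E_c$, and $E_m$ are the visual, audio, canonical, and multimodal motion encoders, $G$ is the GAN generator, and $\oplus$ is a placeholder for the unspecified "combining" operation (concatenation would be one possible choice).

```latex
\begin{align*}
z_s &= E_v(I_s), & z_d^{(t)} &= E_v\bigl(I_d^{(t)}\bigr), & z_a^{(t)} &= E_a\bigl(A^{(t)}\bigr) \\
z_c &= E_c(z_s), & z_m^{(t)} &= E_m\bigl(z_d^{(t)} \oplus z_a^{(t)}\bigr) \\
z_f^{(t)} &= z_c \oplus z_m^{(t)}, & \hat{I}^{(t)} &= G\bigl(z_f^{(t)}\bigr)
\end{align*}
```

Here $z_c$ is the canonical code, $z_m^{(t)}$ the motion code, $z_f^{(t)}$ the multimodal fused latent code, and $\hat{I}^{(t)}$ the generated talking face image. The per-step index $t$ is an assumption about how the series of driving images is processed.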

Hyundai Motor Company (20240346730). FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE simplified abstract
