Hyundai Motor Company (20240346730). FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE simplified abstract

From WikiPatents

FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE

Organization Name

Hyundai Motor Company

Inventor(s)

Byeong Yeol Kim of Seoul (KR)

Ji Hwan Park of Seoul (KR)

You Shin Lim of Yongin-si (KR)

FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240346730 titled 'FACE IMAGE GENERATION METHOD AND DEVICE FOR GENERATING FULLY-CONTROLLABLE TALKING FACE'.

Simplified Explanation

The patent application describes a method and device for generating a controllable talking face image. A source image, a series of driving images sampled from the same video, and input audio are encoded into latent codes, which are combined and passed to a generative adversarial network (GAN) to produce the talking face image.

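  • Receiving a source image and a series of driving images, sampled from the same video, along with input audio
  • Encoding the source image and the driving images into a visual space with a visual encoder to obtain a style latent code (a source latent code and a driving latent code)
  • Encoding the input audio with an audio encoder to obtain an audio feature that includes an audio latent code
  • Mapping the source latent code to a canonical space with a canonical encoder to obtain a canonical code
  • Combining the driving latent code with the audio latent code and mapping the result to a multimodal motion space with a multimodal motion encoder to obtain a motion code
  • Combining the canonical code with the motion code into a multimodal fused latent code
  • Generating a talking face image by passing the multimodal fused latent code to a generative adversarial network (GAN)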
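The data flow in the steps above can be sketched in a few lines of Python. This is only an illustration of how the codes are passed between stages, not the applicants' implementation: the encoders and the GAN generator are replaced by fixed random linear maps, and the vector sizes, the helper `make_linear`, the use of concatenation to combine the driving and audio codes, and the use of element-wise addition to fuse the canonical and motion codes are all assumptions that the abstract does not specify.

```python
import random
from typing import Callable, List

Vector = List[float]


def make_linear(in_dim: int, out_dim: int, seed: int) -> Callable[[Vector], Vector]:
    """Return a fixed random linear map (hypothetical stand-in for a trained network)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(x: Vector) -> Vector:
        if len(x) != in_dim:
            raise ValueError(f"expected {in_dim} values, got {len(x)}")
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    return apply


# Hypothetical sizes chosen only to keep the example small.
IMAGE_DIM, AUDIO_DIM, LATENT_DIM = 12, 6, 4

visual_encoder = make_linear(IMAGE_DIM, LATENT_DIM, seed=0)
audio_encoder = make_linear(AUDIO_DIM, LATENT_DIM, seed=1)
canonical_encoder = make_linear(LATENT_DIM, LATENT_DIM, seed=2)
motion_encoder = make_linear(2 * LATENT_DIM, LATENT_DIM, seed=3)
generator = make_linear(LATENT_DIM, IMAGE_DIM, seed=4)  # stand-in for the GAN


def generate_frame(source_image: Vector, driving_image: Vector, audio_window: Vector) -> Vector:
    """Produce one output frame from a source image, one driving image, and audio."""
    source_code = visual_encoder(source_image)      # source latent code
    driving_code = visual_encoder(driving_image)    # driving latent code
    audio_code = audio_encoder(audio_window)        # audio latent code
    canonical_code = canonical_encoder(source_code)
    # Assumption: "combining" is modeled as concatenation of the two codes.
    motion_code = motion_encoder(driving_code + audio_code)
    # Assumption: the fused code is the element-wise sum of the two codes.
    fused_code = [c + m for c, m in zip(canonical_code, motion_code)]
    return generator(fused_code)


def generate_sequence(source_image: Vector,
                      driving_images: List[Vector],
                      audio_windows: List[Vector]) -> List[Vector]:
    """Apply generate_frame to each (driving image, audio window) pair."""
    return [generate_frame(source_image, d, a)
            for d, a in zip(driving_images, audio_windows)]
```

Calling `generate_sequence` with one source vector and two driving/audio pairs returns two output vectors of length `IMAGE_DIM`. In a real system each stand-in would be a trained neural network and the vectors would be image tensors and audio features.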

Key Features and Innovation

  • Generation of controllable talking face images
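  • Encoding of the source and driving images into a visual space to obtain source and driving latent codes
  • Combination of the driving and audio latent codes, mapped to a multimodal motion space, to create a motion code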
  • Combination of canonical and motion codes for image generation
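One way to picture why separate canonical and motion codes make the output controllable is the sketch below. It holds a canonical code fixed and varies only the motion code, here by blending between two motion codes. The abstract does not say how the two codes are combined or that motion codes are interpolated; the element-wise addition in `fuse`, the helper `blend`, and all the numeric values are assumptions made purely for illustration.

```python
from typing import List

Vector = List[float]


def fuse(canonical_code: Vector, motion_code: Vector) -> Vector:
    """Combine the two codes; element-wise addition is an assumption."""
    if len(canonical_code) != len(motion_code):
        raise ValueError("codes must have the same length")
    return [c + m for c, m in zip(canonical_code, motion_code)]


def blend(motion_a: Vector, motion_b: Vector, t: float) -> Vector:
    """Linearly interpolate between two motion codes for t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(motion_a, motion_b)]


# Hypothetical codes: one fixed canonical code and two motion codes.
canonical = [0.5, -0.2, 0.1]
motion_closed = [0.0, 0.0, 0.0]
motion_open = [0.4, 0.2, -0.6]

# The canonical code stays fixed while the motion code varies.
fused_codes = [fuse(canonical, blend(motion_closed, motion_open, t))
               for t in (0.0, 0.5, 1.0)]
```

Each entry of `fused_codes` would then be passed to the generator. Because the canonical part is unchanged across entries, any difference between the generated frames comes from the motion code alone.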

Potential Applications

This technology can be used in video conferencing, virtual reality, the entertainment industry, and online communication platforms.

Problems Solved

This technology addresses the need for realistic and controllable talking face images for various applications.

Benefits

  • Enhanced communication experiences
  • Realistic and controllable face image generation
  • Improved user engagement in virtual environments

Commercial Applications

  • Virtual reality applications
  • Video conferencing software
  • Entertainment industry for special effects

Prior Art

Prior research on facial image generation using deep learning techniques is relevant to this technology.

Frequently Updated Research

Research on improving the realism and controllability of generated face images using advanced neural network architectures is ongoing.

Questions about Face Image Generation

How does this technology improve virtual communication experiences?

This technology enhances virtual communication by providing realistic and controllable talking face images, improving user engagement and interaction.

What are the potential applications of this face image generation method?

The potential applications include video conferencing, virtual reality, the entertainment industry, and online communication platforms.


Original Abstract Submitted

Provided are face image generation method and device for generating a controllable talking face image. The method includes: a face image generation method for generating a controllable talking face image, the method comprising: receiving a source image and a series of driving images, sampled from the same video, and input audio; acquiring a style latent code including a source latent code and a driving latent code by encoding the source image and the series of driving images into a visual space by a visual encoder; acquiring an audio feature including an audio latent code by encoding the input audio by an audio encoder; acquiring a canonical code by mapping the source latent code to a canonical space by a canonical encoder; acquiring a motion code by combining the driving latent code with the audio latent code, and mapping the combined code to a multimodal motion space by a multimodal motion encoder; acquiring a multimodal fused latent code by combining the canonical code with the motion code; and generating a talking face image by transferring the multimodal fused latent code to a generative adversarial network (GAN).