20240055015. LEARNING METHOD FOR GENERATING LIP SYNC IMAGE BASED ON MACHINE LEARNING AND LIP SYNC IMAGE GENERATION DEVICE FOR PERFORMING SAME simplified abstract (DEEPBRAIN AI INC.)

From WikiPatents


Organization Name

DEEPBRAIN AI INC.

Inventor(s)

Gyeong Su Chae of Seoul (KR)

A simplified explanation of the abstract

This abstract first appeared for US patent application 20240055015, titled "LEARNING METHOD FOR GENERATING LIP SYNC IMAGE BASED ON MACHINE LEARNING AND LIP SYNC IMAGE GENERATION DEVICE FOR PERFORMING SAME".

Simplified Explanation

The abstract describes a machine-learning-based lip sync image generation device built from two models. An image synthesis model takes a person background image and an utterance audio signal as input and generates a lip sync image. A lip sync discrimination model then evaluates the degree of match between the generated lip sync image and the input utterance audio signal.

  • An artificial neural network (the image synthesis model) generates lip sync images from input images and audio signals.
  • A second artificial neural network (the lip sync discrimination model) scores how well a generated lip sync image matches the input audio signal.
  • The two models work together: the discrimination model's match score guides the synthesis model toward accurate lip sync images.
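The two-model setup above can be sketched in code. This is a toy illustration only, not the patent's actual networks: the "models" here are single random linear layers standing in for real encoder-decoder and embedding networks, and the image/audio shapes are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_lip_sync(background_image, audio_features, weights):
    """Toy stand-in for the image synthesis model: maps a person
    background image plus utterance audio features to a lip sync image."""
    # Concatenate the flattened inputs and pass them through one linear
    # layer (placeholder for a real neural synthesis network).
    x = np.concatenate([background_image.ravel(), audio_features.ravel()])
    out = np.tanh(weights @ x)
    return out.reshape(background_image.shape)

def sync_match_score(lip_sync_image, audio_features, w_img, w_aud):
    """Toy stand-in for the lip sync discrimination model: embeds the
    image and the audio separately, then returns their cosine similarity
    as a degree-of-match score in [-1, 1]."""
    img_emb = w_img @ lip_sync_image.ravel()
    aud_emb = w_aud @ audio_features.ravel()
    denom = np.linalg.norm(img_emb) * np.linalg.norm(aud_emb) + 1e-8
    return float(img_emb @ aud_emb / denom)

# Hypothetical sizes: a 16x16 grayscale face crop, 32 audio features,
# and an 8-dimensional shared embedding space.
H = W = 16
A = 32
D = 8

background = rng.standard_normal((H, W))
audio = rng.standard_normal(A)

w_synth = rng.standard_normal((H * W, H * W + A)) * 0.01
w_img = rng.standard_normal((D, H * W)) * 0.1
w_aud = rng.standard_normal((D, A)) * 0.1

fake = synthesize_lip_sync(background, audio, w_synth)
score = sync_match_score(fake, audio, w_img, w_aud)
print(f"match score: {score:.3f}")
```

In a real system the match score would be used as a training signal: the synthesis model's weights are updated so the discrimination model rates its output as well synchronized with the audio.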

Potential applications of this technology:

  • Lip syncing for animated characters in movies, TV shows, and video games.
  • Creating realistic virtual avatars that can accurately lip sync to audio.
  • Improving the quality of dubbing and voice-over in media productions.

Problems solved by this technology:

  • Overcoming the challenge of accurately synchronizing lip movements with audio in animation and virtual avatars.
  • Reducing the time and effort required for manual lip syncing in media production.
  • Enhancing the realism and immersion of virtual characters by improving their lip sync capabilities.

Benefits of this technology:

  • Improved efficiency and cost-effectiveness in media production processes.
  • Enhanced user experience in virtual reality and augmented reality applications.
  • Enables the creation of more realistic and engaging virtual characters.


Original Abstract Submitted

a lip sync image generation device based on machine learning according to a disclosed embodiment includes an image synthesis model, which is an artificial neural network model, and which uses a person background image and an utterance audio signal as an input to generate a lip sync image, and a lip sync discrimination model, which is an artificial neural network model, and which discriminates the degree of match between the lip sync image generated by the image synthesis model and the utterance audio signal input to the image synthesis model.