18055310. GENERATING GESTURE REENACTMENT VIDEO FROM VIDEO MOTION GRAPHS USING MACHINE LEARNING simplified abstract (ADOBE INC.)

From WikiPatents

GENERATING GESTURE REENACTMENT VIDEO FROM VIDEO MOTION GRAPHS USING MACHINE LEARNING

Organization Name

ADOBE INC.

Inventor(s)

Yang Zhou of Sunnyvale CA (US)

Jimei Yang of Mountain View CA (US)

Jun Saito of Seattle WA (US)

Dingzeyu Li of Seattle WA (US)

Deepali Aneja of Seattle WA (US)

GENERATING GESTURE REENACTMENT VIDEO FROM VIDEO MOTION GRAPHS USING MACHINE LEARNING - A simplified explanation of the abstract

This abstract first appeared for US patent application 18055310, titled 'GENERATING GESTURE REENACTMENT VIDEO FROM VIDEO MOTION GRAPHS USING MACHINE LEARNING'.

Simplified Explanation

The patent application describes a method for generating a gesture reenactment video sequence that corresponds to a target audio sequence, using a trained network together with a video motion graph built from a reference speech video.

  • Simplified Explanation:
    - The technology uses a reference speech video to create a video motion graph.
    - It then generates a gesture reenactment video sequence for a target audio sequence by identifying a node path through the video motion graph.
  • Potential Applications:
    - Virtual reality applications
    - Dubbing and voiceover industries
  • Problems Solved:
    - Matching gestures with audio in videos
    - Automating the process of creating gesture reenactment videos
  • Benefits:
    - Improved synchronization between audio and video
    - Time savings in video production processes
  • Potential Commercial Applications:
    - Video editing software
    - Animation studios
  • Possible Prior Art:
    - Prior art in gesture recognition technology
    - Prior art in video editing software
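The abstract does not disclose how the video motion graph is built or searched. A toy sketch under assumed details (one node per reference frame, "jump" edges between frames with similar hypothetical pose features, and a greedy search that matches target audio features against each node's reference audio features):

```python
import numpy as np

class VideoMotionGraph:
    """Toy video motion graph: one node per reference-video frame.

    Each node stores an audio feature vector; edges connect consecutive
    frames, plus extra "jump" edges between non-adjacent frames whose
    (assumed) pose features are similar, allowing motion to loop or skip.
    """

    def __init__(self, ref_audio_features, pose_features, jump_threshold=0.1):
        self.audio = np.asarray(ref_audio_features, dtype=float)
        n = len(self.audio)
        self.edges = {i: [i + 1] for i in range(n - 1)}
        self.edges[n - 1] = []
        pose = np.asarray(pose_features, dtype=float)
        # Jump edges between visually similar, non-adjacent frames.
        for i in range(n):
            for j in range(n):
                if abs(i - j) > 1 and np.linalg.norm(pose[i] - pose[j]) < jump_threshold:
                    self.edges[i].append(j)

    def find_node_path(self, target_audio_features):
        """Greedy search: start at the globally best-matching node, then at
        each step move to the reachable node whose reference audio features
        are closest to the next target audio feature vector."""
        target = np.asarray(target_audio_features, dtype=float)
        path = [int(np.argmin(np.linalg.norm(self.audio - target[0], axis=1)))]
        for t in target[1:]:
            candidates = self.edges[path[-1]] or [path[-1]]
            dists = [np.linalg.norm(self.audio[c] - t) for c in candidates]
            path.append(candidates[int(np.argmin(dists))])
        return path
```

The greedy step is only illustrative; the patent's trained network could instead score paths globally (e.g., beam search or dynamic programming over the graph).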

Questions:

1. How does the trained network identify the node path through the video motion graph?

  - The trained network compares target audio features with reference audio features to determine the node path.

2. What are the specific features of the video motion graph that are used in generating the output media sequence?

  - The video motion graph includes nodes associated with frames of the reference video sequence and reference audio features, which are used to create the output media sequence.
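The answers above hinge on comparing target audio features with each node's reference audio features. The metric is not specified in the abstract; a common choice for such comparisons is cosine similarity over feature vectors (e.g., MFCC-like descriptors), sketched here with hypothetical helper names:

```python
import numpy as np

def match_score(target_feat, ref_feat):
    """Cosine similarity between a target audio feature vector and a
    node's reference audio feature vector; higher means a better match."""
    t = np.asarray(target_feat, dtype=float)
    r = np.asarray(ref_feat, dtype=float)
    return float(t @ r / (np.linalg.norm(t) * np.linalg.norm(r)))
```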


Original Abstract Submitted

Embodiments are disclosed for generating a gesture reenactment video sequence corresponding to a target audio sequence using a trained network based on a video motion graph generated from a reference speech video. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a first input including a reference speech video and generating a video motion graph representing the reference speech video, where each node is associated with a frame of the reference video sequence and reference audio features of the reference audio sequence. The disclosed systems and methods further comprise receiving a second input including a target audio sequence, generating target audio features, identifying a node path through the video motion graph based on the target audio features and the reference audio features, and generating an output media sequence based on the identified node path through the video motion graph paired with the target audio sequence.
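The abstract's final step pairs the frames along the identified node path with the target audio sequence to form the output media sequence. A minimal sketch of that pairing (hypothetical helper; a real system would also need to blend frames across graph jump edges):

```python
def assemble_output(node_path, ref_frames, target_audio_chunks):
    """Pair each node's reference frame with the corresponding target
    audio chunk, yielding the output media sequence as (frame, audio) pairs."""
    return [(ref_frames[node], chunk)
            for node, chunk in zip(node_path, target_audio_chunks)]
```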