US Patent Application 17796399. Photorealistic Talking Faces from Audio simplified abstract


Photorealistic Talking Faces from Audio

Organization Name

Google LLC


Inventor(s)

Vivek Kwatra of Saratoga CA (US)


Christian Frueh of Mountain View CA (US)


Avisek Lahiri of West Bengal (IN)


John Lewis of Mountain View CA (US)


Photorealistic Talking Faces from Audio - A simplified explanation of the abstract

  • This abstract appeared for US patent application number 17796399, titled 'Photorealistic Talking Faces from Audio'

Simplified Explanation

This abstract describes a framework for generating realistic 3D talking faces from audio input alone, together with methods for inserting the generated faces into existing videos or virtual environments. Faces from video are decomposed into separate components: 3D geometry, head pose, and texture. This decoupling lets the 3D face shape and the texture be predicted as separate regression problems. To keep the animation temporally smooth, an auto-regressive approach conditions the model on its previous visual state. The model also captures face illumination through an audio-independent 3D texture normalization.
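
The sketch below shows how such an audio-driven, auto-regressive prediction loop could be wired up, assuming a PyTorch-style model. The module names, layer sizes, audio feature dimensions, and the neutral-face initialization are illustrative assumptions only; the patent abstract does not disclose a concrete architecture.

```python
import torch
import torch.nn as nn

class TalkingFaceModel(nn.Module):
    """Illustrative audio-to-face regressor with separate shape and texture heads."""

    def __init__(self, audio_dim=128, n_vertices=468, atlas_size=64):
        super().__init__()
        self.audio_encoder = nn.GRU(audio_dim, 256, batch_first=True)
        # Separate regression heads: 3D face shape and 2D texture atlas.
        self.shape_head = nn.Linear(256 + n_vertices * 3, n_vertices * 3)
        self.texture_head = nn.Sequential(
            nn.Linear(256 + n_vertices * 3, 1024),
            nn.ReLU(),
            nn.Linear(1024, 3 * atlas_size * atlas_size),
        )
        self.atlas_size = atlas_size

    def forward(self, audio_feats, prev_shape):
        # audio_feats: (B, T, audio_dim) audio features for one frame's window
        # prev_shape:  (B, n_vertices * 3) previous frame's predicted geometry
        _, h = self.audio_encoder(audio_feats)           # h: (1, B, 256)
        ctx = torch.cat([h[-1], prev_shape], dim=-1)     # condition on previous visual state
        shape = self.shape_head(ctx)                     # normalized 3D vertex positions
        atlas = self.texture_head(ctx).view(-1, 3, self.atlas_size, self.atlas_size)
        return shape, atlas

# Auto-regressive rollout over an audio clip: each frame is conditioned on the
# previous frame's prediction to stabilize temporal dynamics.
model = TalkingFaceModel()
audio_windows = torch.randn(1, 30, 25, 128)   # 30 video frames, 25 audio steps per frame
prev = torch.zeros(1, 468 * 3)                # neutral face as the initial visual state
for t in range(audio_windows.shape[1]):
    prev, atlas = model(audio_windows[:, t], prev)
```

In a real system the predicted shape and texture atlas would then be re-posed and composited back into the target video or virtual environment, a stage this sketch omits.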


Original Abstract Submitted

Provided is a framework for generating photorealistic 3D talking faces conditioned only on audio input. In addition, the present disclosure provides associated methods to insert generated faces into existing videos or virtual environments. We decompose faces from video into a normalized space that decouples 3D geometry, head pose, and texture. This allows separating the prediction problem into regressions over the 3D face shape and the corresponding 2D texture atlas. To stabilize temporal dynamics, we propose an auto-regressive approach that conditions the model on its previous visual state. We also capture face illumination in our model using audio-independent 3D texture normalization.
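
For the texture side, the sketch below illustrates one plausible reading of "audio-independent 3D texture normalization": each frame's pose-normalized texture atlas is factored into a shared reference texture and a low-frequency, per-frame illumination map. The blur-based estimate, the function name, and the array shapes are assumptions made for illustration and are not the procedure the patent claims.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def normalize_illumination(atlases, sigma=8.0, eps=1e-3):
    """atlases: (T, H, W, 3) per-frame texture atlases in a pose-normalized UV space."""
    albedo = atlases.mean(axis=0)                        # audio-independent reference texture
    normalized = np.empty_like(atlases)
    illum_maps = np.empty_like(atlases)
    for t, atlas in enumerate(atlases):
        # A low-frequency ratio to the reference approximates per-frame illumination.
        ratio = (atlas + eps) / (albedo + eps)
        illum = gaussian_filter(ratio, sigma=(sigma, sigma, 0))
        illum_maps[t] = illum
        normalized[t] = atlas / np.clip(illum, eps, None)  # illumination-corrected atlas
    return normalized, illum_maps, albedo

frames = np.random.rand(10, 64, 64, 3).astype(np.float32)
norm, illum, albedo = normalize_illumination(frames)
```

Separating illumination from texture in this way keeps the audio-driven regression focused on speech-related appearance changes, while lighting can be reapplied when the face is inserted into a target video.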