18211149. AUTODECODING LATENT 3D DIFFUSION MODELS (Snap Inc.)
Organization Name
Snap Inc.
Inventor(s)
Evangelos Ntavelis of Los Angeles CA (US)
Kyle Olszewski of Los Angeles CA (US)
Aliaksandr Siarohin of Los Angeles CA (US)
Sergey Tulyakov of Santa Monica CA (US)
This abstract first appeared for US patent application 18211149, titled 'AUTODECODING LATENT 3D DIFFUSION MODELS'.
Original Abstract Submitted
Systems and methods for generating static and articulated 3D assets are provided that include a 3D autodecoder at their core. The 3D autodecoder framework embeds properties learned from the target dataset in the latent space, which can then be decoded into a volumetric representation for rendering view-consistent appearance and geometry. An appropriate intermediate volumetric latent space is then identified, and robust normalization and de-normalization operations are applied to learn a 3D diffusion model from 2D images or monocular videos of rigid or articulated objects. The methods are flexible enough to use either existing camera supervision or no camera information at all, instead efficiently learning the camera information during training. The generated results are shown to outperform state-of-the-art alternatives on various benchmark datasets and metrics, including multi-view image datasets of synthetic objects, real in-the-wild videos of moving people, and a large-scale, real video dataset of static objects.
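To make the abstract's pipeline concrete, the sketch below illustrates the general idea of an autodecoder with per-object learned latent codes, a small 3D convolutional decoder that produces a volumetric feature grid, and robust normalization/de-normalization of an intermediate latent volume before fitting a diffusion model to it. This is a minimal illustration under assumptions: the class name VolumetricAutodecoder, the helper functions normalize/denormalize, and all shapes, layer sizes, and statistics are hypothetical and do not reproduce the patented design; PyTorch is assumed as the framework.

```python
# Minimal sketch (assumptions throughout, not the patented implementation).
import torch
import torch.nn as nn


class VolumetricAutodecoder(nn.Module):
    def __init__(self, num_objects, latent_dim=512, grid_channels=16):
        super().__init__()
        # "Autodecoding": one learnable code per training object, optimized
        # jointly with the decoder instead of being produced by an encoder.
        self.codes = nn.Embedding(num_objects, latent_dim)
        # Project the code to a coarse 4x4x4 intermediate latent volume.
        self.to_volume = nn.Linear(latent_dim, 32 * 4 * 4 * 4)
        # Upsample to a denser volumetric feature grid used for rendering.
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(32, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(32, grid_channels, 4, stride=2, padding=1),
        )

    def intermediate_volume(self, obj_ids):
        z = self.codes(obj_ids)                          # (B, latent_dim)
        return self.to_volume(z).view(-1, 32, 4, 4, 4)   # (B, 32, 4, 4, 4)

    def forward(self, obj_ids):
        # (B, grid_channels, 16, 16, 16) feature grid; a volume renderer
        # (not shown) would sample it to produce view-consistent images.
        return self.decoder(self.intermediate_volume(obj_ids))


def normalize(volume, mean, std, eps=1e-6):
    # Bring the intermediate latent volume to roughly unit scale so the
    # diffusion model is trained on well-conditioned inputs.
    return (volume - mean) / (std + eps)


def denormalize(volume, mean, std, eps=1e-6):
    # Invert the normalization before decoding a sampled latent volume.
    return volume * (std + eps) + mean


if __name__ == "__main__":
    model = VolumetricAutodecoder(num_objects=1000)
    ids = torch.randint(0, 1000, (4,))
    vol = model.intermediate_volume(ids)
    # Per-channel statistics (here computed over the batch for brevity).
    mean = vol.mean(dim=(0, 2, 3, 4), keepdim=True)
    std = vol.std(dim=(0, 2, 3, 4), keepdim=True)
    x0 = normalize(vol, mean, std)      # target for diffusion training
    recon = denormalize(x0, mean, std)  # back to decoder input space
    print(model(ids).shape, x0.shape, recon.shape)
```

In this reading, the diffusion model operates on the normalized intermediate latent volume rather than on rendered images, which is one plausible way to realize "learning a 3D diffusion from 2D images or monocular videos" as described in the abstract.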