18497938. TECHNIQUES FOR TRAINING A MACHINE LEARNING MODEL TO RECONSTRUCT DIFFERENT THREE-DIMENSIONAL SCENES simplified abstract (NVIDIA Corporation)

Organization Name

NVIDIA Corporation

Inventor(s)

Yang Fu of San Diego, CA (US)

Sifei Liu of San Diego, CA (US)

Jan Kautz of Lexington, MA (US)

Xueting Li of Santa Clara, CA (US)

Shalini De Mello of San Francisco, CA (US)

Amey Kulkarni of San Jose, CA (US)

Milind Naphade of Cupertino, CA (US)

TECHNIQUES FOR TRAINING A MACHINE LEARNING MODEL TO RECONSTRUCT DIFFERENT THREE-DIMENSIONAL SCENES - A simplified explanation of the abstract

This abstract first appeared for US patent application 18497938 titled 'TECHNIQUES FOR TRAINING A MACHINE LEARNING MODEL TO RECONSTRUCT DIFFERENT THREE-DIMENSIONAL SCENES'.

Simplified Explanation

The training application described in the patent application trains a machine learning model to generate three-dimensional (3D) representations of two-dimensional (2D) images. It maps a depth image and a viewpoint to signed distance function (SDF) values at 3D query points, and maps a red, blue, and green (RGB) image to radiance values at the same query points. From the SDF and radiance values it computes a red, blue, green, and depth (RGBD) reconstruction loss, which it uses to update a pre-trained geometry encoder and decoder together with an untrained texture encoder and decoder. The result is a trained model that produces 3D representations of RGBD images. The key steps, followed by an illustrative sketch, are:

  • A training application that trains a machine learning model to generate 3D representations of 2D images
  • Mapping of a depth image and viewpoint to SDF values at 3D query points
  • Mapping of an RGB image to radiance values at the same 3D query points
  • Computation of an RGBD reconstruction loss from the SDF and radiance values
  • Updating of the pre-trained geometry encoder and decoder, together with the untrained texture encoder and decoder, based on the RGBD reconstruction loss
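The abstract does not disclose network architectures, input encodings, or layer sizes, so the following is only a minimal PyTorch sketch of how the four modules named above might be wired together. Every name and shape here (SceneReconstructor, feat_dim, the 6-dimensional viewpoint vector, the CNN and MLP layouts) is an illustrative assumption, not the patented implementation.

```python
import torch
import torch.nn as nn

class SceneReconstructor(nn.Module):
    """Hypothetical wiring of the four modules named in the abstract:
    a pre-trained geometry encoder/decoder and an untrained texture
    encoder/decoder. All shapes and layer sizes are assumptions."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Geometry path (pre-trained per the abstract): depth image -> scene feature.
        self.geometry_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Decoder: (scene feature, viewpoint, 3D query point) -> SDF value.
        self.geometry_decoder = nn.Sequential(
            nn.Linear(feat_dim + 6 + 3, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )
        # Texture path (untrained per the abstract): RGB image -> scene feature.
        self.texture_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Decoder: (scene feature, 3D query point) -> RGB radiance.
        self.texture_decoder = nn.Sequential(
            nn.Linear(feat_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 3), nn.Sigmoid(),
        )

    def forward(self, depth, viewpoint, rgb, query_points):
        # depth: (B, 1, H, W); viewpoint: (B, 6); rgb: (B, 3, H, W);
        # query_points: (B, N, 3) sampled 3D locations.
        B, N, _ = query_points.shape
        g = torch.cat([self.geometry_encoder(depth), viewpoint], dim=-1)
        g = g.unsqueeze(1).expand(B, N, -1)
        t = self.texture_encoder(rgb).unsqueeze(1).expand(B, N, -1)
        sdf = self.geometry_decoder(torch.cat([g, query_points], dim=-1))
        radiance = self.texture_decoder(torch.cat([t, query_points], dim=-1))
        return sdf, radiance  # (B, N, 1) SDF values, (B, N, 3) radiance values
```

Conditioning the geometry decoder on the viewpoint while leaving the texture decoder view-independent is one plausible reading of the abstract, which names the viewpoint only as an input to the SDF mapping.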

Potential Applications

This technology could be applied in fields such as virtual reality, augmented reality, computer graphics, and medical imaging.

Problems Solved

This technology solves the problem of efficiently generating accurate 3D representations from 2D images, which can be useful in various industries for visualization and analysis purposes.

Benefits

The benefits of this technology include improved accuracy and efficiency in generating 3D representations, which can lead to better visualization, analysis, and understanding of complex data.

Potential Commercial Applications

Potential commercial applications of this technology include software development for virtual reality experiences, medical imaging software, and tools for architects and designers.

Possible Prior Art

One possible example of prior art in this field is the use of convolutional neural networks for image-to-image translation, which has been applied in various domains to generate realistic images from input data.

Unanswered Questions

How does this technology compare to existing methods for generating 3D representations from 2D images?

The article does not provide a direct comparison to existing methods in terms of performance, accuracy, or efficiency.

What are the limitations or challenges of implementing this technology in real-world applications?

The article does not address potential limitations or challenges such as computational resources required, data availability, or scalability issues.


Original Abstract Submitted

In various embodiments, a training application trains a machine learning model to generate three-dimensional (3D) representations of two-dimensional images. The training application maps a depth image and a viewpoint to signed distance function (SDF) values associated with 3D query points. The training application maps a red, blue, and green (RGB) image to radiance values associated with the 3D query points. The training application computes a red, blue, green, and depth (RGBD) reconstruction loss based on at least the SDF values and the radiance values. The training application modifies at least one of a pre-trained geometry encoder, a pre-trained geometry decoder, an untrained texture encoder, or an untrained texture decoder based on the RGBD reconstruction loss to generate a trained machine learning model that generates 3D representations of RGBD images.
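As a hedged illustration of the last two sentences of the abstract, the sketch below (continuing the SceneReconstructor sketch above) combines an assumed L1 term on the SDF values with an assumed MSE term on the radiance values into a single RGBD reconstruction loss and updates all four modules. The actual loss terms, weights, and learning rates are not specified in the abstract, and the targets sdf_target and color_target are assumed to be supervision derived from the input RGBD image at the sampled query points.

```python
import torch
import torch.nn.functional as F

model = SceneReconstructor()  # hypothetical model from the sketch above

# The abstract states that the geometry encoder/decoder are pre-trained and the
# texture encoder/decoder start untrained; all four are modified based on the
# RGBD reconstruction loss. The per-module learning rates are an assumption.
optimizer = torch.optim.Adam([
    {"params": model.geometry_encoder.parameters(), "lr": 1e-5},
    {"params": model.geometry_decoder.parameters(), "lr": 1e-5},
    {"params": model.texture_encoder.parameters(), "lr": 1e-4},
    {"params": model.texture_decoder.parameters(), "lr": 1e-4},
])

def training_step(depth, viewpoint, rgb, query_points, sdf_target, color_target):
    """One optimization step on an assumed form of the RGBD reconstruction loss."""
    sdf_pred, radiance_pred = model(depth, viewpoint, rgb, query_points)
    sdf_loss = F.l1_loss(sdf_pred, sdf_target)            # depth/geometry term (assumed L1)
    color_loss = F.mse_loss(radiance_pred, color_target)  # RGB/texture term (assumed MSE)
    loss = sdf_loss + color_loss                          # combined RGBD reconstruction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```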