20240037788. 3D POSE ESTIMATION IN ROBOTICS simplified abstract (NVIDIA Corporation)

From WikiPatents

3D POSE ESTIMATION IN ROBOTICS

Organization Name

NVIDIA Corporation

Inventor(s)

Sravya Nimmagadda of Santa Clara CA (US)

David Weikersdorfer of Mountain View CA (US)

3D POSE ESTIMATION IN ROBOTICS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240037788 titled '3D POSE ESTIMATION IN ROBOTICS'.

Simplified Explanation

The abstract of the patent application describes a method for training an autoencoder to predict 3D pose labels using simulation data. The simulation data is extracted from a simulated environment configured to represent the environment in which a 3D pose estimator will be deployed. Assets such as 3D models, textures, and deployment parameters are used to mimic that deployment environment. The autoencoder is trained to predict a segmentation image that is invariant to occlusions and to exclude areas of the input image corresponding to object appendages. Additionally, a GAN is used to adapt the 3D pose estimator to unlabeled real-world data by predicting whether a given output of the estimator was generated from real-world or simulated data.

  • An autoencoder is trained to predict 3D pose labels using simulation data.
  • Simulation data is extracted from a simulated environment representing the deployment environment of a 3D pose estimator.
  • Assets like 3D models, textures, and deployment parameters are used to mimic the deployment environment.
  • The autoencoder predicts a segmentation image that is invariant to occlusions.
  • Areas of the input image corresponding to object appendages are excluded from the prediction.
  • A GAN is used to adapt the 3D pose estimator to unlabeled real-world data.
  • The GAN predicts whether the output of the 3D pose estimator is from real-world or simulated data.
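The training objective described in the bullets above can be illustrated with a minimal sketch. The code below is a hypothetical toy version, not the patented implementation: a linear encoder/decoder stands in for the autoencoder, random vectors stand in for images, and a boolean mask stands in for the appendage regions that are excluded from the loss. The key idea shown is the masked reconstruction objective, in which pixels flagged as object appendages contribute nothing to the training signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (all assumptions): flattened 8x8 "images", a linear
# encoder/decoder in place of a deep autoencoder.
D = 64   # flattened image size
H = 16   # latent size
W_enc = rng.normal(scale=0.1, size=(H, D))
W_dec = rng.normal(scale=0.1, size=(D, H))

def forward(x):
    """Encode then decode: predicted segmentation logits."""
    return W_dec @ (W_enc @ x)

def masked_loss(pred, target, appendage_mask):
    """MSE over pixels NOT marked as appendages, mirroring the claim
    that appendage regions are excluded from the prediction target."""
    keep = ~appendage_mask
    diff = (pred - target)[keep]
    return float(np.mean(diff ** 2))

# One synthetic training pair: input image, occlusion-invariant
# segmentation target, and a mask flagging appendage pixels.
x = rng.normal(size=D)
target = (rng.random(D) > 0.5).astype(float)
appendage_mask = rng.random(D) > 0.9

loss_before = masked_loss(forward(x), target, appendage_mask)

lr = 0.01
for _ in range(200):
    z = W_enc @ x
    pred = W_dec @ z
    keep = ~appendage_mask
    # Gradient of the masked MSE w.r.t. the prediction: zero on
    # excluded (appendage) pixels, so they never drive learning.
    grad_pred = np.where(keep, 2 * (pred - target) / keep.sum(), 0.0)
    grad_enc = W_dec.T @ grad_pred       # backprop through decoder
    W_dec -= lr * np.outer(grad_pred, z)
    W_enc -= lr * np.outer(grad_enc, x)

loss_after = masked_loss(forward(x), target, appendage_mask)
```

In a real system the simulated environment would supply the input images and segmentation targets (rendered from the 3D models and textures mentioned above), and the network would be a deep convolutional autoencoder; the masking logic, however, would play the same role.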

Potential Applications

  • Improving the accuracy and robustness of 3D pose estimation in real-world scenarios.
  • Training 3D pose estimators using simulated data before deployment in real-world environments.
  • Enhancing the performance of computer vision systems in object recognition and tracking.

Problems Solved

  • Overcoming the limitations of training 3D pose estimators solely on real-world data.
  • Addressing occlusion challenges in 3D pose estimation.
  • Improving the generalization of 3D pose estimators to different deployment scenarios.

Benefits

  • Increased accuracy and reliability of 3D pose estimation in real-world applications.
  • Cost-effective training of 3D pose estimators using simulated data.
  • Improved adaptability of 3D pose estimators to various deployment conditions.


Original Abstract Submitted

an autoencoder may be trained to predict 3d pose labels using simulation data extracted from a simulated environment, which may be configured to represent an environment in which the 3d pose estimator is to be deployed. assets may be used to mimic the deployment environment such as 3d models or textures and parameters used to define deployment scenarios and/or conditions that the 3d pose estimator will operate under in the environment. the autoencoder may be trained to predict a segmentation image from an input image that is invariant to occlusions. further, the autoencoder may be trained to exclude areas of the input image from the object that correspond to one or more appendages of the object. the 3d pose may be adapted to unlabeled real-world data using a gan, which predicts whether output of the 3d pose estimator was generated from real-world data or simulated data.
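The abstract's final claim, using a GAN to adapt the estimator to unlabeled real-world data, hinges on a discriminator that predicts whether an output came from real or simulated input. As a hedged illustration, the sketch below trains a logistic-regression discriminator (a stand-in for the GAN discriminator) on synthetic 6-DoF pose outputs; the domain gap between simulation and reality is modeled here as a simple mean shift, which is purely an assumption for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pose-estimator outputs (x, y, z, roll, pitch, yaw) on
# simulated vs. real inputs; the sim-to-real gap is faked as a shift.
sim = rng.normal(loc=0.0, size=(200, 6))
real = rng.normal(loc=0.5, size=(200, 6))

X = np.vstack([sim, real])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 0 = sim, 1 = real

# Logistic-regression discriminator trained by gradient descent on the
# binary cross-entropy loss; a GAN discriminator plays the same role.
w = np.zeros(6)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # P(real | output)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * float(np.mean(p - y))

# Fraction of outputs whose domain the discriminator identifies.
acc = float(np.mean(((X @ w + b) > 0) == (y == 1)))
```

During adaptation, the pose estimator would be updated adversarially to *reduce* this discriminator's accuracy, pulling its outputs on real-world data toward the distribution it learned on simulated data.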