Nvidia Corporation (20240112036). LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS simplified abstract

From WikiPatents

LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS

Organization Name

Nvidia Corporation

Inventor(s)

Innfarn Yoo of Fremont CA (US)

Rohit Taneja of Fremont CA (US)

LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240112036, titled 'LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS'.

Simplified Explanation

The patent application describes a method of combining 2D and 3D object detection results using deep neural networks for object classification.

  • Regions of interest (ROIs) and/or their corresponding bounding shapes are determined using one or more region proposal networks (RPNs), such as an image-based RPN and/or a depth-based RPN.
  • Each ROI is extended into a frustum in 3D world-space, and a point cloud is filtered to include only points within the frustum.
  • The remaining points are voxelated to generate a volume in 3D world space, which is then processed by a 3D DNN to generate vectors.
  • The vectors from the 3D DNN and additional vectors from a 2D DNN processing image data are combined and applied to a classifier network to classify objects.
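
The frustum filtering and voxelization steps above can be sketched as follows. This is a minimal illustration, not the method claimed in the application: it assumes a pinhole camera, points already expressed in camera coordinates (z pointing forward), an axis-aligned pixel box as the ROI, and a sparse voxel grid that stores per-voxel point counts. The names `Camera`, `Roi`, `filter_to_frustum`, and `voxelize`, and the `near`, `far`, and `voxel_size` parameters, are hypothetical and do not come from the abstract. A production pipeline would typically use an array library rather than plain Python loops.

```python
import math
from collections import Counter
from typing import Iterable, NamedTuple

# (x, y, z) in camera coordinates; z is assumed to point forward.
Point = tuple[float, float, float]


class Camera(NamedTuple):
    """Hypothetical pinhole intrinsics, in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


class Roi(NamedTuple):
    """Axis-aligned 2D region of interest in pixel coordinates."""
    u_min: float
    v_min: float
    u_max: float
    v_max: float


def filter_to_frustum(points: Iterable[Point], cam: Camera, roi: Roi,
                      near: float = 0.1, far: float = 100.0) -> list[Point]:
    """Keep only the points that fall inside the ROI's frustum.

    Extending the 2D box along the camera rays between `near` and `far`
    gives the frustum. A point is inside it when its depth is in range
    and its pinhole projection lands inside the box.
    """
    kept = []
    for x, y, z in points:
        if not (near <= z <= far):
            continue  # behind the camera or outside the assumed depth range
        u = cam.fx * x / z + cam.cx
        v = cam.fy * y / z + cam.cy
        if roi.u_min <= u <= roi.u_max and roi.v_min <= v <= roi.v_max:
            kept.append((x, y, z))
    return kept


def voxelize(points: Iterable[Point], voxel_size: float) -> Counter:
    """Map each point to an integer voxel index and count points per voxel."""
    grid = Counter()
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        grid[key] += 1
    return grid


if __name__ == "__main__":
    cam = Camera(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    roi = Roi(u_min=300.0, v_min=220.0, u_max=340.0, v_max=260.0)
    cloud = [(0.0, 0.0, 10.0), (0.1, 0.1, 10.0), (5.0, 0.0, 10.0)]
    inside = filter_to_frustum(cloud, cam, roi)
    print(len(inside), "of", len(cloud), "points kept")
    print(dict(voxelize(inside, voxel_size=0.5)))
```

Only the occupied voxels are stored here; a dense 3D array with the same indices would be the more usual input to a 3D DNN, and the choice between the two is a design decision the abstract does not address.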

Potential Applications

This technology can be applied in autonomous driving systems, robotics, augmented reality, and surveillance systems.

Problems Solved

This technology addresses the challenge of accurately classifying objects in complex environments where both 2D and 3D information is crucial for detection.

Benefits

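Fusing 2D and 3D detection results is intended to improve the accuracy and reliability of object classification. Restricting the point cloud to each ROI's frustum also means the 3D DNN processes only a subset of the points, which is consistent with the title's emphasis on computational efficiency.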

Potential Commercial Applications

Commercial applications include autonomous vehicles, security systems, industrial automation, and medical imaging technologies.

Possible Prior Art

Prior art may include research papers or patents related to object detection using deep neural networks, 2D and 3D fusion techniques, and image processing algorithms.

Unanswered Questions

How does this technology compare to existing methods of object classification using only 2D or 3D information?

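The abstract offers no direct comparison. It describes applying vectors from both a 3D DNN and a 2D DNN to a classifier network, which suggests the aim is to draw on image and depth information together, but it reports no accuracy figures against 2D-only or 3D-only methods.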

What are the computational requirements for implementing this technology in real-time systems?

The abstract does not specify the computational resources needed to deploy this technology in real-time applications, although the title frames the approach as computationally efficient.


Original Abstract Submitted

In various examples, a two-dimensional (2D) and three-dimensional (3D) deep neural network (DNN) is implemented to fuse 2D and 3D object detection results for classifying objects. For example, regions of interest (ROIs) and/or bounding shapes corresponding thereto may be determined using one or more region proposal networks (RPNs), such as an image-based RPN and/or a depth-based RPN. Each ROI may be extended into a frustum in 3D world-space, and a point cloud may be filtered to include only points from within the frustum. The remaining points may be voxelated to generate a volume in 3D world space, and the volume may be applied to a 3D DNN to generate one or more vectors. The one or more vectors, in addition to one or more additional vectors generated using a 2D DNN processing image data, may be applied to a classifier network to generate a classification for an object.
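
The final step in the abstract, applying the 3D and 2D vectors to a classifier network, might look like the sketch below. The abstract does not say how the vectors are combined or what the classifier looks like, so two assumptions are made here: the vectors are fused by simple concatenation, and a single linear layer followed by a softmax stands in for the classifier network. The function names, the weights, and the class labels are all hypothetical placeholders, not trained values or details from the application.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_fused(vec_3d: Sequence[float], vec_2d: Sequence[float],
                   weights: Sequence[Sequence[float]],
                   biases: Sequence[float],
                   labels: Sequence[str]) -> tuple[str, list[float]]:
    """Concatenate the two feature vectors and score each class.

    Assumption: fusion is plain concatenation, and the "classifier
    network" is reduced to one linear layer plus softmax.
    """
    fused = list(vec_3d) + list(vec_2d)
    if len(weights) != len(labels) or len(biases) != len(labels):
        raise ValueError("need one weight row and one bias per label")
    if any(len(row) != len(fused) for row in weights):
        raise ValueError("each weight row must match the fused vector length")
    logits = [sum(w * f for w, f in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


if __name__ == "__main__":
    # Placeholder feature vectors and weights, chosen only for illustration.
    label, probs = classify_fused(
        vec_3d=[1.0, 0.0],
        vec_2d=[0.0, 1.0],
        weights=[[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]],
        biases=[0.0, 0.0],
        labels=["car", "pedestrian"],
    )
    print(label, probs)
```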