18531374. LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS simplified abstract (NVIDIA Corporation)

From WikiPatents

LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS

Organization Name

NVIDIA Corporation

Inventor(s)

Innfarn Yoo of Fremont CA (US)

Rohit Taneja of Fremont CA (US)

LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18531374, titled 'LEVERAGING MULTIDIMENSIONAL SENSOR DATA FOR COMPUTATIONALLY EFFICIENT OBJECT DETECTION FOR AUTONOMOUS MACHINE APPLICATIONS'.

Simplified Explanation

The abstract describes a system that uses 2D and 3D deep neural networks (DNNs) to fuse 2D and 3D object detection results and classify objects.

  • The system uses one or more region proposal networks (RPNs), such as an image-based RPN and/or a depth-based RPN, to determine regions of interest (ROIs) and/or their bounding shapes.
  • Each ROI is extended into a frustum in 3D world space, and the point cloud is filtered to keep only the points inside that frustum.
  • The remaining points are voxelated to create a 3D volume, which is processed by a 3D DNN to generate vectors.
  • These vectors, along with additional vectors from a 2D DNN processing image data, are fed into a classifier network for object classification.
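
The frustum filtering and voxelation steps above can be illustrated with a minimal sketch. It is not taken from the application: it assumes a pinhole camera model, points already expressed in camera coordinates with z pointing forward, and a fixed-size occupancy grid. The function names (points_in_frustum, voxelize) and all parameters (fx, fy, cx, cy, near, far, origin, voxel_size, grid_shape) are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Point = Tuple[float, float, float]  # (x, y, z), assumed camera coordinates, z forward
Roi = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels


def points_in_frustum(points: Iterable[Point], roi: Roi,
                      fx: float, fy: float, cx: float, cy: float,
                      near: float = 0.1, far: float = 100.0) -> List[Point]:
    """Keep points whose pinhole projection lands inside the 2D ROI.

    The frustum is the set of 3D points between the near and far planes
    that project into the ROI rectangle (an assumed camera model).
    """
    u_min, v_min, u_max, v_max = roi
    kept = []
    for x, y, z in points:
        if not (near <= z <= far):
            continue  # behind the camera or outside the depth range
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


def voxelize(points: Iterable[Point], origin: Point, voxel_size: float,
             grid_shape: Tuple[int, int, int]) -> Set[Tuple[int, int, int]]:
    """Return the set of occupied voxel indices (i, j, k) inside a fixed grid."""
    occupied = set()
    for p in points:
        idx = tuple(int((c - o) // voxel_size) for c, o in zip(p, origin))
        if all(0 <= i < n for i, n in zip(idx, grid_shape)):
            occupied.add(idx)  # several points may share one voxel
    return occupied


# Illustrative data only: four points, one ROI around the image centre.
cloud = [(0.0, 0.0, 5.0), (0.25, 0.0, 5.0), (3.0, 0.0, 5.0), (0.0, 0.0, -1.0)]
kept = points_in_frustum(cloud, roi=(40, 40, 60, 60), fx=100, fy=100, cx=50, cy=50)
occupied = voxelize(kept, origin=(-1.0, -1.0, 4.0), voxel_size=0.5, grid_shape=(4, 4, 4))
```

In this example the third point projects outside the ROI and the fourth lies behind the camera, so only two points survive, and both fall into the same voxel. A real system would likely turn the occupancy set into a dense or sparse tensor before passing it to a 3D DNN; that step is omitted here.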

Potential Applications

This technology can be applied in various fields such as autonomous driving, robotics, surveillance systems, and augmented reality for accurate object detection and classification.

Problems Solved

This technology addresses the challenge of accurately classifying objects in both 2D and 3D space by combining the strengths of 2D and 3D deep neural networks.

Benefits

The system provides more robust and accurate object classification by fusing information from both 2D and 3D data sources, leading to improved performance in various applications.

Potential Commercial Applications

The technology can be commercialized in industries such as automotive, security, and entertainment for applications requiring precise object detection and classification.

Possible Prior Art

Possible prior art includes separate 2D and 3D object detection systems whose results are not fused by a deep neural network.

Unanswered Questions

How does the system handle occluded objects in the scene?

The abstract does not say how occlusion is handled. Objects that are partially or fully hidden by other objects or obstacles yield fewer image pixels and fewer point cloud points inside the frustum, which could reduce classification accuracy and reliability in real-world scenes.

What is the computational complexity of the system?

The abstract does not provide information on the computational resources required to implement this system. Understanding the computational complexity is crucial for assessing the feasibility of deploying this technology in resource-constrained environments.


Original Abstract Submitted

In various examples, a two-dimensional (2D) and three-dimensional (3D) deep neural network (DNN) is implemented to fuse 2D and 3D object detection results for classifying objects. For example, regions of interest (ROIs) and/or bounding shapes corresponding thereto may be determined using one or more region proposal networks (RPNs)—such as an image-based RPN and/or a depth-based RPN. Each ROI may be extended into a frustum in 3D world-space, and a point cloud may be filtered to include only points from within the frustum. The remaining points may be voxelated to generate a volume in 3D world space, and the volume may be applied to a 3D DNN to generate one or more vectors. The one or more vectors, in addition to one or more additional vectors generated using a 2D DNN processing image data, may be applied to a classifier network to generate a classification for an object.
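
The final fusion step in the abstract can be sketched as follows. The abstract only says that the vectors from the 3D DNN and the 2D DNN are "applied to a classifier network"; how they are combined is not stated. This sketch assumes simple concatenation followed by a single linear layer and a softmax, and the function names, weights, and biases are hypothetical placeholders rather than anything from the application.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_fused(vec_2d: Sequence[float], vec_3d: Sequence[float],
                   weights: Sequence[Sequence[float]],
                   biases: Sequence[float]) -> List[float]:
    """Concatenate the 2D and 3D feature vectors, then apply linear + softmax.

    Concatenation is an assumption; the abstract does not specify the fusion.
    """
    fused = list(vec_2d) + list(vec_3d)
    for row in weights:
        if len(row) != len(fused):
            raise ValueError("each weight row must match the fused vector length")
    logits = [sum(w * f for w, f in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Illustrative values only: two classes, 2-element feature vectors.
probs = classify_fused(
    vec_2d=[1.0, 0.0],
    vec_3d=[0.0, 2.0],
    weights=[[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]],
    biases=[0.0, 0.0],
)
```

With these placeholder weights the first class receives the higher probability. A trained classifier network would learn the weights and would typically have more than one layer.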