17884356. OBJECT IDENTIFICATION IN BIRD'S-EYE VIEW REFERENCE FRAME WITH EXPLICIT DEPTH ESTIMATION CO-TRAINING simplified abstract (WAYMO LLC)

From WikiPatents

OBJECT IDENTIFICATION IN BIRD'S-EYE VIEW REFERENCE FRAME WITH EXPLICIT DEPTH ESTIMATION CO-TRAINING

Organization Name

WAYMO LLC

Inventor(s)

Albert Zhao of Saratoga CA (US)

Vasiliy Igorevich Karasev of San Francisco CA (US)

Hang Yan of Sunnyvale CA (US)

Daniel Rudolf Maurer of Mountain View CA (US)

Alper Ayvaci of San Jose CA (US)

Yu-Han Chen of Santa Clara CA (US)

OBJECT IDENTIFICATION IN BIRD'S-EYE VIEW REFERENCE FRAME WITH EXPLICIT DEPTH ESTIMATION CO-TRAINING - A simplified explanation of the abstract

This abstract first appeared for US patent application 17884356 titled 'OBJECT IDENTIFICATION IN BIRD'S-EYE VIEW REFERENCE FRAME WITH EXPLICIT DEPTH ESTIMATION CO-TRAINING'.

Simplified Explanation

The described aspects and implementations enable efficient detection and classification of objects with machine learning models that use a bird's-eye view representation and are trained with depth ground truth data. In one implementation, the disclosed systems and techniques include obtaining images and generating, using a first neural network (NN), feature vectors (FVs) and depth distributions for pixels of the images, where the first NN is trained using training images and depth ground truth data for those training images. The techniques further include obtaining a feature tensor (FT) from the FVs and the depth distributions, and processing the obtained FTs, using a second NN, to identify one or more objects depicted in the images.

  • Efficient detection and classification of objects using machine learning models
  • Utilization of bird's-eye view representation and depth ground truth data
  • Training of neural networks using training images and depth information
  • Generation of feature vectors and depth distributions for images
  • Identification of objects in images using a second neural network
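The steps above can be sketched in code. The patent abstract does not specify how the feature tensor is formed from the FVs and depth distributions; a common approach in bird's-eye-view pipelines is an outer product that "lifts" each pixel's feature vector along the depth axis, weighted by the depth probability of each bin. The following is a minimal NumPy sketch under that assumption; all shapes and variable names are illustrative, not from the patent:

```python
import numpy as np

# Illustrative shapes (not from the patent): an H x W feature map with a
# C-dim feature vector and a D-bin depth distribution at every pixel.
H, W, C, D = 4, 6, 8, 10
rng = np.random.default_rng(0)

# Stand-ins for the first NN's outputs.
features = rng.standard_normal((H, W, C))        # per-pixel feature vectors (FVs)
depth_logits = rng.standard_normal((H, W, D))
depth_dist = np.exp(depth_logits)
depth_dist /= depth_dist.sum(axis=-1, keepdims=True)  # per-pixel depth distribution

# Outer product: each feature vector is spread across the D depth bins,
# scaled by the probability mass the network assigns to that bin.
feature_tensor = depth_dist[..., :, None] * features[..., None, :]  # (H, W, D, C)

# Sanity check: because each depth distribution sums to 1, summing the
# feature tensor over the depth axis recovers the original features.
assert np.allclose(feature_tensor.sum(axis=2), features)
print(feature_tensor.shape)  # (4, 6, 10, 8)
```

In a full pipeline, this (H, W, D, C) tensor would then be projected ("splatted") into a bird's-eye-view grid using camera geometry before the second NN identifies objects; that projection step is omitted here.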

Potential Applications

This technology can be applied in various fields such as autonomous driving, surveillance systems, robotics, and augmented reality for accurate object detection and classification.

Problems Solved

  1. Improved accuracy in object detection and classification.
  2. Efficient processing of images with complex backgrounds.

Benefits

  1. Enhanced safety in autonomous vehicles.
  2. Better security in surveillance systems.
  3. Increased efficiency in robotics applications.
  4. Enhanced user experience in augmented reality.

Potential Commercial Applications

Optimizing traffic flow in smart cities using autonomous vehicles.

Possible Prior Art

One potential piece of prior art is the use of depth information for object detection and classification tasks in computer vision applications.

What are the limitations of this technology in real-world applications?

The abstract does not mention any limitations of the technology in real-world applications. Potential limitations, however, could include the computational resources required to process large amounts of image data and the need for high-quality depth ground truth data when training the neural networks.

How does this technology compare to existing object detection and classification methods?

The abstract does not provide a direct comparison to existing object detection and classification methods. However, this technology's use of bird's-eye view representation and depth ground truth data sets it apart from traditional methods, potentially offering improved accuracy and efficiency in detecting and classifying objects.


Original Abstract Submitted

The described aspects and implementations enable efficient detection and classification of objects with machine learning models that deploy a bird's-eye view representation and are trained using depth ground truth data. In one implementation, disclosed are system and techniques that include obtaining images, generating, using a first neural network (NN), feature vectors (FVs) and depth distributions pixels of images, wherein the first NN is trained using training images and a depth ground truth data for the training images. The techniques further include obtaining a feature tensor (FT) in view of the FVs and the depth distributions, and processing the obtained FTs, using a second NN, to identify one or more objects depicted in the images.