18395198. POINT-LEVEL SUPERVISION FOR VIDEO INSTANCE SEGMENTATION simplified abstract (NVIDIA Corporation)

From WikiPatents

POINT-LEVEL SUPERVISION FOR VIDEO INSTANCE SEGMENTATION

Organization Name

NVIDIA Corporation

Inventor(s)

Zhiding Yu of Cupertino CA (US)

Shuaiyi Huang of Greenbelt MD (US)

De-An Huang of Cupertino CA (US)

Shiyi Lan of Sunnyvale CA (US)

Subhashree Radhakrishnan of Milpitas CA (US)

Jose M. Alvarez Lopez of Mountain View CA (US)

Anima Anandkumar of Pasadena CA (US)

POINT-LEVEL SUPERVISION FOR VIDEO INSTANCE SEGMENTATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 18395198, titled 'POINT-LEVEL SUPERVISION FOR VIDEO INSTANCE SEGMENTATION'.

Abstract: Video instance segmentation is a computer vision task that aims to detect, segment, and track objects continuously in videos. It can be used in numerous real-world applications, such as video editing, three-dimensional (3D) reconstruction, 3D navigation (e.g. for autonomous driving and/or robotics), and view point estimation. However, current machine learning-based processes employed for video instance segmentation are lacking, particularly because the densely annotated videos needed for supervised training of high-quality models are not readily available and are not easily generated. To address the issues in the prior art, the present disclosure provides point-level supervision for video instance segmentation in a manner that allows the resulting machine learning model to handle any object category.

Key Features and Innovation:

  • Video instance segmentation for detecting, segmenting, and tracking objects in videos.
  • Point-level supervision provided for training high-quality models.
  • Ability to handle any object category in the resulting machine learning model.
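
The abstract does not say how point-level supervision is implemented. As a purely illustrative sketch, and not the method claimed in the application, one simple way to train from point annotations is to evaluate the segmentation loss only at the annotated pixels instead of over a dense mask. The function name `point_supervised_bce`, the `(row, col, label)` point format, and the choice of binary cross-entropy are all assumptions made for this example.

```python
import math


def point_supervised_bce(prob_mask, points, eps=1e-7):
    """Mean binary cross-entropy evaluated only at annotated points.

    prob_mask: 2D list of predicted foreground probabilities in [0, 1].
    points:    list of (row, col, label) tuples, where label is 1 for a
               point on the object and 0 for a point on the background.
               (Hypothetical annotation format, assumed for illustration.)
    """
    if not points:
        raise ValueError("at least one annotated point is required")
    total = 0.0
    for row, col, label in points:
        # Clamp the prediction so log() never receives exactly 0 or 1.
        p = min(max(prob_mask[row][col], eps), 1.0 - eps)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    # Pixels without an annotation contribute nothing to the loss.
    return total / len(points)


# Toy 2x2 prediction with one object point and one background point.
pred = [[0.9, 0.2],
        [0.8, 0.1]]
clicks = [(0, 0, 1), (1, 1, 0)]
loss = point_supervised_bce(pred, clicks)
```

In this sketch only the two annotated pixels influence the loss, which is what makes sparse point labels usable in place of a dense mask; a real system would also need to associate points with object instances across frames, which is not shown here.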

Potential Applications:

  • Video editing
  • 3D reconstruction
  • 3D navigation for autonomous driving and robotics
  • View point estimation

Problems Solved:

  • Lack of densely annotated videos for supervised training
  • Difficulty of training high-quality video instance segmentation models without such annotations

Benefits:

  • Less reliance on densely annotated videos, which are not readily available and not easily generated
  • Applicability to real-world uses such as autonomous driving and robotics
  • Flexibility to handle any object category

Commercial Applications: Commercial applications of this technology could include video editing software, autonomous driving systems, robotics applications, and virtual reality development tools.

Prior Art: Researchers and developers can explore prior art related to video instance segmentation, machine learning-based processes, and object detection in videos to understand the existing technology landscape.

Frequently Updated Research: Stay updated on advancements in video instance segmentation, machine learning algorithms for object detection, and applications of computer vision in real-world scenarios.

Questions about Video Instance Segmentation:

1. How does point-level supervision improve the training of machine learning models for video instance segmentation?

Point-level supervision trains the model from point annotations rather than from densely annotated videos, which the abstract describes as not readily available and not easily generated. According to the abstract, this is done in a manner that allows the resulting machine learning model to handle any object category.

2. What are the challenges faced in obtaining densely annotated videos for supervised training in video instance segmentation?

Densely annotated videos are not readily available and are not easily generated, yet supervised training of high-quality models depends on them.


Original Abstract Submitted

Video instance segmentation is a computer vision task that aims to detect, segment, and track objects continuously in videos. It can be used in numerous real-world applications, such as video editing, three-dimensional (3D) reconstruction, 3D navigation (e.g. for autonomous driving and/or robotics), and view point estimation. However, current machine learning-based processes employed for video instance segmentation are lacking, particularly because the densely annotated videos needed for supervised training of high-quality models are not readily available and are not easily generated. To address the issues in the prior art, the present disclosure provides point-level supervision for video instance segmentation in a manner that allows the resulting machine learning model to handle any object category.