International Business Machines Corporation (20240104830). AUGMENTING DATA USED TO TRAIN COMPUTER VISION MODEL WITH IMAGES OF DIFFERENT PERSPECTIVES simplified abstract

From WikiPatents

AUGMENTING DATA USED TO TRAIN COMPUTER VISION MODEL WITH IMAGES OF DIFFERENT PERSPECTIVES

Organization Name

International Business Machines Corporation

Inventor(s)

Kun Yan Yin of Ningbo (CN)

Xue Ping Liu of Beijing (CN)

Yun Jing Zhao of Beijing (CN)

Fei Wang of Dalian (CN)

Yu Tao Wu of Changshu (CN)

Yue Liu of Ningbo (CN)

AUGMENTING DATA USED TO TRAIN COMPUTER VISION MODEL WITH IMAGES OF DIFFERENT PERSPECTIVES - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240104830, titled 'AUGMENTING DATA USED TO TRAIN COMPUTER VISION MODEL WITH IMAGES OF DIFFERENT PERSPECTIVES'.

Simplified Explanation

The abstract describes a method, system, and computer program product for improving the accuracy of a vision model. Images of an object from the training dataset are used to generate a three-dimensional model of the object. That model is then used to obtain images of the object from different perspectives, and the dataset used to train the vision model is augmented with these new images.

  • Three-dimensional model generation: Images of an object from a dataset are used to create a three-dimensional model of the object.
  • Obtaining images from different perspectives: Using the three-dimensional model, images of the object with a second set of perspectives are obtained, which may include perspectives not present in the original dataset.
  • Dataset augmentation: The dataset used to train the vision model is augmented with the new images of the object from different perspectives.
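
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the reconstructed three-dimensional model is stood in for by a hand-made point cloud, a perspective is reduced to a single azimuth angle, and the names `View`, `render`, and `augment` are hypothetical. Reconstructing the model from the dataset images is not shown.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    azimuth_deg: float  # camera angle around the vertical axis (assumed metadata)
    pixels: tuple       # rendered image as a tuple of row strings


def rotate_y(point, azimuth_deg):
    """Rotate a 3D point about the vertical (y) axis."""
    x, y, z = point
    a = math.radians(azimuth_deg)
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))


def render(points, azimuth_deg, size=8):
    """Orthographic silhouette of a point cloud in [-1, 1]^3 seen from azimuth_deg."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    for p in points:
        x, y, _ = rotate_y(p, azimuth_deg)
        col = min(size - 1, max(0, int((x + 1) / 2 * size)))
        row = min(size - 1, max(0, int((1 - y) / 2 * size)))
        grid[row][col] = "#"
    return View(azimuth_deg, tuple("".join(r) for r in grid))


def augment(dataset, model_points, new_azimuths):
    """Return the dataset plus renders for perspectives it does not already contain."""
    seen = {v.azimuth_deg for v in dataset}
    extra = [render(model_points, a) for a in new_azimuths if a not in seen]
    return list(dataset) + extra


# Hypothetical stand-in for a reconstructed model: a flat plate in the x-y plane.
plate = [(x / 4, y / 4, 0.0) for x in range(-4, 5) for y in range(-4, 5)]
dataset = [render(plate, 0.0)]                          # first set of perspectives
augmented = augment(dataset, plate, [0.0, 45.0, 90.0])  # adds 45 and 90 degrees
print([v.azimuth_deg for v in augmented])
```

The plate fills the frame when seen head-on and shrinks to a thin vertical strip at 90 degrees, which is the kind of view a dataset of frontal images alone would never contain.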

Potential Applications

This technology could be applied in various fields such as computer vision, object recognition, and augmented reality.

Problems Solved

This technology addresses the issue of limited perspectives in training datasets, which can lead to reduced accuracy in vision models.
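
One way to make "limited perspectives" concrete is to check which viewing angles a dataset lacks. The sketch below assumes each training image carries a camera azimuth in degrees, which the abstract does not state. The function name `missing_azimuth_bins` and the 45-degree bin width are hypothetical choices for illustration.

```python
def missing_azimuth_bins(azimuths_deg, bin_width_deg=45):
    """Return the starting angle of every azimuth bin that has no training image.

    Assumes bin_width_deg divides 360 evenly.
    """
    n_bins = 360 // bin_width_deg
    covered = {int(a % 360) // bin_width_deg for a in azimuths_deg}
    return [b * bin_width_deg for b in range(n_bins) if b not in covered]


# A dataset photographed almost entirely from the front.
gaps = missing_azimuth_bins([0, 10, 30, 350])
print(gaps)
```

The empty bins are candidates for the second set of perspectives to obtain from the three-dimensional model.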

Benefits

The benefits of this technology include improved accuracy of vision models, better object recognition, and enhanced performance in applications such as autonomous vehicles and robotics.

Potential Commercial Applications

Potential commercial applications of this technology include image recognition software, surveillance systems, and medical imaging technology.

Possible Prior Art

Possible prior art includes general data augmentation techniques used in machine learning to improve model performance.

Unanswered Questions

1. How does the system handle variations in lighting conditions when obtaining images from different perspectives?
2. What is the computational cost associated with generating three-dimensional models of objects from images in the dataset?


Original Abstract Submitted

A computer-implemented method, system and computer program product for improving accuracy of a vision model. Images of an object with a first set of perspectives are received from a dataset used to train the vision model. A three-dimensional model of the object is then generated using the images of the object from the dataset. Using the three-dimensional model of the object, images of the object with a second set of perspectives are obtained. For example, the second set of perspectives may include different perspectives than the perspectives of the object from the images contained in the dataset. The dataset used to train the vision model may then be augmented with such images of the object with a second set of perspectives. In this manner, the dataset used to train the vision model includes a greater number of perspectives of the object, thereby improving the accuracy of the vision model.