17952577. METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM simplified abstract (Robert Bosch GmbH)

From WikiPatents
METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM

Organization Name

Robert Bosch GmbH

Inventor(s)

Yuheng Wang of Centereach NY (US)

Haibo Ding of Fremont CA (US)

Bingqing Wang of San Jose CA (US)

Zhe Feng of Mountain View CA (US)

METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM - A simplified explanation of the abstract

This abstract first appeared for US patent application 17952577, titled 'METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM'.

Simplified Explanation

The application describes a Co-Augmentation framework that can learn new rules and new labels simultaneously from unlabeled data, starting from a small set of seed rules and a few manually labeled training examples. The augmented rules and labels are then used to train supervised neural network models. The framework has two major components: a rule augmenter and a label augmenter. The rule augmenter learns new rules, which can be applied to unlabeled data to obtain weak labels. The label augmenter learns new labels directly from unlabeled data. Training is iterative and centers on a high precision set that is generated and refined over time: at each iteration, both augmenters contribute new and more accurate labels to the high precision set, which is in turn used to train both augmenters.

  • Rule augmenter learns new rules from unlabeled data
  • Label augmenter learns new labels from unlabeled data
  • Co-Augmentation framework iteratively generates and refines a high precision set
  • High precision set is used to train both the rule augmenter and label augmenter
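
The iterative loop summarized above can be illustrated with a small Python sketch. It is not the method claimed in the application: the abstract does not say how rules are represented, how either augmenter is trained, or how labels are admitted to the high precision set. The sketch assumes keyword rules, a word-overlap scorer standing in for the label augmenter, and an admission policy in which an example enters the high precision set only when the two augmenters do not disagree. All function names, parameters, and thresholds (`augment_rules`, `augment_labels`, `co_augment`, `min_support`, `min_margin`) are hypothetical.

```python
from collections import Counter, defaultdict


def tokens(text):
    """Lowercased whitespace tokens of a text, as a set."""
    return set(text.lower().split())


def apply_rules(rules, text):
    """Weak label from keyword rules; None if no rule fires or rules conflict."""
    votes = {label for word, label in rules.items() if word in tokens(text)}
    return next(iter(votes)) if len(votes) == 1 else None


def augment_rules(rules, high_precision, min_support=2):
    """Toy rule augmenter (assumption): promote a word to a rule when it occurs
    in at least min_support high-precision examples, always with one label."""
    seen = defaultdict(Counter)
    for text, label in high_precision.items():
        for word in tokens(text):
            seen[word][label] += 1
    learned = dict(rules)
    for word, counts in seen.items():
        if len(counts) == 1 and sum(counts.values()) >= min_support:
            learned.setdefault(word, next(iter(counts)))
    return learned


def augment_labels(high_precision, text, min_margin=2):
    """Toy label augmenter (assumption): score each label by word overlap with
    high-precision examples; abstain unless the best label wins by min_margin."""
    scores = Counter()
    for known_text, label in high_precision.items():
        scores[label] += len(tokens(text) & tokens(known_text))
    ranked = scores.most_common(2) + [(None, 0), (None, 0)]
    (best, top), (_, runner_up) = ranked[0], ranked[1]
    return best if top - runner_up >= min_margin else None


def co_augment(seed_rules, seed_labels, unlabeled, iterations=3):
    """Grow the rule set and the high precision set together, iteratively."""
    rules = dict(seed_rules)
    high_precision = dict(seed_labels)
    for _ in range(iterations):
        # The high precision set retrains the rule augmenter.
        rules = augment_rules(rules, high_precision)
        for text in unlabeled:
            if text in high_precision:
                continue
            proposals = {
                label
                for label in (apply_rules(rules, text),
                              augment_labels(high_precision, text))
                if label is not None
            }
            # Hypothetical admission policy: exactly one distinct proposal.
            if len(proposals) == 1:
                high_precision[text] = proposals.pop()
    return rules, high_precision


# Usage with made-up support-ticket data.
seed_rules = {"refund": "billing"}
seed_labels = {
    "please refund my invoice payment": "billing",
    "app crashes on login screen": "bug",
}
unlabeled = [
    "refund the duplicate invoice charge",
    "login screen crashes after update",
    "invoice charge looks wrong",
]
rules, high_precision = co_augment(seed_rules, seed_labels, unlabeled)
```

In this toy run the single seed rule and the two seed labels are enough to label all three unlabeled texts in the first iteration, and the enlarged high precision set then yields new keyword rules in the second. A real system would replace both toy augmenters with trained models and would need a stricter admission criterion to keep the set high precision.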

Potential Applications

This technology can be applied in:

  • Natural language processing
  • Data mining
  • Information retrieval

Problems Solved

This technology helps in:

  • Improving the accuracy of supervised neural network models
  • Learning new rules and labels from unlabeled data
  • Enhancing the efficiency of training processes

Benefits

The benefits of this technology include:

  • Increased accuracy in classification tasks
  • Reduced manual labeling efforts
  • Enhanced performance of machine learning models

Potential Commercial Applications

This technology can be used commercially in:

  • Sentiment analysis tools
  • Recommendation systems
  • Fraud detection algorithms

Possible Prior Art

Possible prior art for this technology includes:

  • Co-training algorithms in machine learning

Unanswered Questions

How does the Co-Augmentation framework handle noisy data?

The abstract does not mention how the system deals with noisy data during the learning process. Noise in the data can affect the accuracy of the rules and labels generated.

Are there any limitations to the size of the seed rules and manually labeled training data?

The abstract does not specify any limitations on the size of the seed rules or training data. It would be important to understand if there are any constraints on the amount of initial data required for the system to function effectively.


Original Abstract Submitted

The systems and methods described herein are directed to a Co-Augmentation framework that may learn new rules and labels simultaneously from unlabeled data with a small set of seed rules and a few manually labeled training data. The augmented rules and labels are further used to train supervised neural network models. Specifically, the systems and methods described herein include two major components: a rule augmenter, and a label augmenter. The rule augmenter is directed to learning new rules, which can be used to obtain weak labels from unlabeled data. The label augmenter is directed to learning new labels from unlabeled data. The Co-Augmentation framework is an iterative learning process which generates and refines a high precision set. At each iteration, both the rule augmenter and label augmenter will contribute new and more accurate labels to the high precision set, which is in turn used to train both the rule augmenter and label augmenter.