METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM

Organization Name

Robert Bosch GmbH

Inventor(s)

METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM - A simplified explanation of the abstract

This abstract first appeared for US patent application 17952577, titled 'METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM'.

Simplified Explanation

The application describes a Co-Augmentation framework that can learn new rules and new labels at the same time from unlabeled data, starting from a small set of seed rules and a small amount of manually labeled training data. The augmented rules and labels are then used to train supervised neural network models. The framework has two main components: a rule augmenter, which learns new rules that can be applied to unlabeled data to obtain weak labels, and a label augmenter, which learns new labels from unlabeled data. Learning is iterative and centers on a high precision set that the framework generates and refines. At each iteration, both augmenters contribute new and more accurate labels to the high precision set, which is in turn used to train both augmenters.

Rule augmenter learns new rules from unlabeled data
Label augmenter learns new labels from unlabeled data
Co-Augmentation framework iteratively generates and refines a high precision set
High precision set is used to train both the rule augmenter and label augmenter
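
The iterative loop summarized above can be illustrated with a toy sketch. The abstract does not disclose how either augmenter is implemented, so everything here is an assumption made for illustration: rules are single keywords, the rule augmenter promotes words that occur only with one label, and the label augmenter is a word-overlap nearest-neighbor stand-in for the neural model. The names co_augment, augment_rules, augment_labels, apply_rules, min_support, and min_overlap are hypothetical. The sketch only grows the high precision set; it does not model the refinement step mentioned in the abstract.

```python
from collections import Counter, defaultdict


def apply_rules(rules, text):
    """Weak label from keyword rules; None if no rule fires or rules disagree."""
    tokens = text.split()
    votes = {label for keyword, label in rules.items() if keyword in tokens}
    return votes.pop() if len(votes) == 1 else None


def augment_rules(high_precision, rules, min_support=2):
    """Toy rule augmenter (assumption): promote a word to a rule when it occurs
    in at least min_support high-precision examples, all with the same label."""
    word_labels = defaultdict(Counter)
    for text, label in high_precision.items():
        for word in set(text.split()):
            word_labels[word][label] += 1
    new_rules = dict(rules)
    for word, counts in word_labels.items():
        if word in new_rules or len(counts) != 1:
            continue
        label, support = next(iter(counts.items()))
        if support >= min_support:
            new_rules[word] = label
    return new_rules


def augment_labels(high_precision, pool, min_overlap=2):
    """Toy label augmenter (assumption): label a text by its closest
    high-precision example, measured by shared words; skip weak or tied matches."""
    proposals = {}
    for text in pool:
        words = set(text.split())
        best = Counter()
        for seen, label in high_precision.items():
            best[label] = max(best[label], len(words & set(seen.split())))
        ranked = best.most_common(2)
        if not ranked or ranked[0][1] < min_overlap:
            continue
        if len(ranked) == 2 and ranked[1][1] == ranked[0][1]:
            continue  # ambiguous between two labels
        proposals[text] = ranked[0][0]
    return proposals


def co_augment(seed_rules, seed_labels, unlabeled, iterations=5):
    """Alternate both augmenters, feeding their agreed output back into the
    shared high-precision set that trains them in the next round."""
    rules = dict(seed_rules)
    high_precision = dict(seed_labels)
    pool = [t for t in unlabeled if t not in high_precision]
    for _ in range(iterations):
        rules = augment_rules(high_precision, rules)
        model_labels = augment_labels(high_precision, pool)
        added = {}
        for text in pool:
            rule_label = apply_rules(rules, text)
            model_label = model_labels.get(text)
            if rule_label and model_label and rule_label != model_label:
                continue  # the two augmenters conflict: keep the set precise
            label = rule_label or model_label
            if label:
                added[text] = label
        if not added:
            break
        high_precision.update(added)
        pool = [t for t in pool if t not in added]
    return rules, high_precision


seed_rules = {"great": "pos", "awful": "neg"}
seed_labels = {"great phone love it": "pos", "awful battery hate it": "neg"}
unlabeled = [
    "love this great screen",
    "hate this awful charger",
    "love the camera",
    "hate the speaker",
]
rules, labels = co_augment(seed_rules, seed_labels, unlabeled)
print(rules)
print(labels)
```

In this toy run, the first iteration labels the two texts that contain a seed keyword. The enlarged high precision set then lets the rule augmenter learn "love" and "hate" as new rules, which label the remaining two texts that the overlap-based label augmenter alone could not reach.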

Potential Applications

This technology can be applied in:

Natural language processing
Data mining
Information retrieval

Problems Solved

This technology helps in:

Improving the accuracy of supervised neural network models
Learning new rules and labels from unlabeled data
Enhancing the efficiency of training processes

Benefits

The benefits of this technology include:

Increased accuracy in classification tasks
Reduced manual labeling efforts
Enhanced performance of machine learning models

Potential Commercial Applications

This technology can be used commercially in:

Sentiment analysis tools
Recommendation systems
Fraud detection algorithms

Possible Prior Art

Possible prior art for this technology includes:

Co-training algorithms in machine learning

Unanswered Questions

How does the Co-Augmentation framework handle noisy data?

The abstract does not mention how the system deals with noisy data during the learning process. Noise in the data can affect the accuracy of the rules and labels generated.

Are there any limitations to the size of the seed rules and manually labeled training data?

The abstract does not specify any limitations on the size of the seed rules or training data. It would be important to understand if there are any constraints on the amount of initial data required for the system to function effectively.

Original Abstract Submitted

The systems and methods described herein are directed to a Co-Augmentation framework that may learn new rules and labels simultaneously from unlabeled data with a small set of seed rules and a few manually labeled training data. The augmented rules and labels are further used to train supervised neural network models. Specifically, the systems and methods described herein include two major components: a rule augmenter, and a label augmenter. The rule augmenter is directed to learning new rules, which can be used to obtain weak labels from unlabeled data. The label augmenter is directed to learning new labels from unlabeled data. The Co-Augmentation framework is an iterative learning process which generates and refines a high precision set. At each iteration, both the rule augmenter and label augmenter will contribute new and more accurate labels to the high precision set, which is in turn used to train both the rule augmenter and label augmenter.

17952577. METHODS AND SYSTEMS FOR AUTOMATED CREATION OF ANNOTATED DATA AND TRAINING OF A MACHINE LEARNING MODEL THEREFROM simplified abstract (Robert Bosch GmbH)
