18471669. SYSTEMS AND METHODS FOR EXPANDING DATA CLASSIFICATION USING SYNTHETIC DATA GENERATION IN MACHINE LEARNING MODELS simplified abstract (Capital One Services, LLC)

From WikiPatents

SYSTEMS AND METHODS FOR EXPANDING DATA CLASSIFICATION USING SYNTHETIC DATA GENERATION IN MACHINE LEARNING MODELS

Organization Name

Capital One Services, LLC

Inventor(s)

Austin Walters of Savoy IL (US)

Jeremy Goodsitt of Champaign IL (US)

Anh Truong of Champaign IL (US)

SYSTEMS AND METHODS FOR EXPANDING DATA CLASSIFICATION USING SYNTHETIC DATA GENERATION IN MACHINE LEARNING MODELS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18471669, titled 'SYSTEMS AND METHODS FOR EXPANDING DATA CLASSIFICATION USING SYNTHETIC DATA GENERATION IN MACHINE LEARNING MODELS'.

Simplified Explanation

The abstract describes a patent application for systems and methods for classifying data, including training a data classification model, generating synthetic data for additional classes, and retraining the model using the synthetic data.

  • Receiving training data with a class
  • Training a data classification model with the training data
  • Receiving additional data with labeled samples of an additional class
  • Creating a synthetic data generator
  • Training the synthetic data generator to generate synthetic data for the additional class
  • Generating a synthetic classified dataset with the additional class
  • Retraining the data classification model using the synthetic classified dataset
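
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the nearest-centroid classifier, the per-feature Gaussian generator, the class names "A", "B" and "C", the sample counts, and every function name are hypothetical choices made for this example. Retraining is interpreted here as refitting on the original data plus the synthetic dataset, which is one possible reading of the abstract.

```python
import random
import statistics
from collections import defaultdict


def train_classifier(samples):
    """Fit a nearest-centroid classifier: one mean vector per class label."""
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append(features)
    return {
        label: [statistics.fmean(col) for col in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(features, centroids[label])
        ),
    )


def train_generator(rows):
    """Fit an independent Gaussian per feature (assumed stand-in for the generator)."""
    return [(statistics.fmean(col), statistics.stdev(col)) for col in zip(*rows)]


def generate(generator, label, n, rng):
    """Draw n synthetic labeled samples from the fitted per-feature Gaussians."""
    return [
        ([rng.gauss(mu, sigma) for mu, sigma in generator], label)
        for _ in range(n)
    ]


rng = random.Random(0)

# Receive training data (classes "A" and "B") and train the classifier.
training_data = [([rng.gauss(0, 0.5), rng.gauss(0, 0.5)], "A") for _ in range(50)]
training_data += [([rng.gauss(5, 0.5), rng.gauss(0, 0.5)], "B") for _ in range(50)]
model = train_classifier(training_data)

# Receive labeled samples of an additional class "C" (five samples, an assumption).
additional = [([rng.gauss(0, 0.5), rng.gauss(5, 0.5)], "C") for _ in range(5)]

# Create and train the generator, then build the synthetic classified dataset.
generator = train_generator([features for features, _ in additional])
synthetic = generate(generator, "C", 50, rng)

# Retrain: refit on the original data plus the synthetic dataset (assumption).
retrained = train_classifier(training_data + synthetic)

print("classes before:", sorted(model))
print("classes after: ", sorted(retrained))
print("prediction for [0.0, 5.0]:", predict(retrained, [0.0, 5.0]))
```

The original model can only ever answer "A" or "B", so a point near the new class is forced into one of the old classes; after retraining, the same point can be assigned to "C".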

Potential Applications

The technology can be applied in various fields such as image recognition, natural language processing, and fraud detection.

Problems Solved

This technology aims to improve the accuracy and efficiency of data classification models by generating synthetic data for classes that are not present in the original training data.

Benefits

The potential benefits of this technology include enhanced classification accuracy, increased model robustness, and the ability to extend a trained model to new classes using synthetic data generated from labeled samples of those classes.

Potential Commercial Applications

Potential commercial applications of this technology include e-commerce product recommendation systems, healthcare diagnosis tools, and financial risk assessment platforms.

Possible Prior Art

Possible prior art includes data augmentation techniques used in machine learning to improve model performance.
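
For comparison, a minimal sketch of conventional data augmentation is shown here. It is an illustrative assumption, not taken from the application or from any cited prior art: the function name `augment`, the Gaussian jitter, and the sample values are all hypothetical. The point it illustrates is that augmentation of this kind perturbs samples of classes that already exist and does not add a new class.

```python
import random


def augment(samples, copies, noise, rng):
    """Return jittered copies of existing labeled samples; labels are unchanged."""
    augmented = []
    for features, label in samples:
        for _ in range(copies):
            jittered = [x + rng.gauss(0, noise) for x in features]
            augmented.append((jittered, label))
    return augmented


rng = random.Random(0)
labeled = [([1.0, 2.0], "A"), ([4.0, 0.5], "B")]  # hypothetical samples
augmented = augment(labeled, copies=3, noise=0.1, rng=rng)

# The augmented set is larger, but its label set is the same as the input's.
print(len(augmented), sorted({label for _, label in augmented}))
```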

What are the specific industries that could benefit from this technology?

Specific industries that could benefit from this technology include healthcare, finance, e-commerce, cybersecurity, and marketing.

How does this technology compare to existing methods of data classification?

According to the abstract, the approach extends an already trained classifier by generating synthetic data for new classes and retraining on it, which is intended to widen the range of data the model can classify accurately.


Original Abstract Submitted

Systems and methods for classifying data are disclosed. For example, a system may include at least one memory storing instructions and at least one processor configured to execute the instructions to perform operations. The operations may include receiving training data comprising a class. The operations may include training a data classification model using the training data to generate a trained data classification model. The operations may include receiving additional data comprising labeled samples of an additional class not contained in the training data. The operations may include creating a synthetic data generator. The operations may include training the synthetic data generator to generate synthetic data corresponding to the additional class. The operations may include generating a synthetic classified dataset comprising the additional class. The operations may include retraining the trained data classification model using the synthetic classified dataset.