18055632. SYSTEMS AND METHODS FOR GENERATING SYNTHETIC TRAINING DATASETS FOR TRAINING MACHINE LEARNING MODELS IN TRAINING DATA-SPARSE ENVIRONMENTS FOR NON-HOMOGENOUS PREDICTIONS simplified abstract (Capital One Services, LLC)

From WikiPatents

Organization Name

Capital One Services, LLC

Inventor(s)

Tyler Maiman of Melville NY (US)

Kevin Osborn of Newton Highlands MA (US)

Shabnam Kousha of Washington DC (US)

SYSTEMS AND METHODS FOR GENERATING SYNTHETIC TRAINING DATASETS FOR TRAINING MACHINE LEARNING MODELS IN TRAINING DATA-SPARSE ENVIRONMENTS FOR NON-HOMOGENOUS PREDICTIONS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18055632, titled 'SYSTEMS AND METHODS FOR GENERATING SYNTHETIC TRAINING DATASETS FOR TRAINING MACHINE LEARNING MODELS IN TRAINING DATA-SPARSE ENVIRONMENTS FOR NON-HOMOGENOUS PREDICTIONS'.

Simplified Explanation

The patent application describes a method for generating synthetic training datasets for training machine learning models in data-sparse environments for non-homogenous predictions.

  • User-specific information associated with a user is received and labeled.
  • One or more characteristics are determined from the labeled user-specific information.
  • One or more alternative actions corresponding to those characteristics are determined; these alternative actions make up the synthetic training data.
  • A machine learning model is trained on the synthetic training data so that it can generate a prediction when given an action of a first user.
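
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the rule tables `LABEL_RULES` and `ALTERNATIVES`, the count thresholds, and every function and class name are hypothetical, since the abstract does not say how labeling or characteristic detection is performed.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical rule tables; the patent application does not specify these.
LABEL_RULES = {"purchase": "spending", "login": "engagement"}
ALTERNATIVES = {
    "frequent_spending": ["offer_discount", "suggest_budget_tool"],
    "low_engagement": ["send_reminder"],
}


@dataclass(frozen=True)
class SyntheticExample:
    """One synthetic training record: a characteristic and an alternative action."""
    characteristic: str
    alternative_action: str


def label_events(events):
    """Step 1: label each raw user event (unknown event types get 'other')."""
    return [(event, LABEL_RULES.get(event, "other")) for event in events]


def determine_characteristics(labeled):
    """Step 2: derive characteristics from label counts (thresholds are assumed)."""
    counts = Counter(label for _, label in labeled)
    characteristics = []
    if counts["spending"] >= 2:
        characteristics.append("frequent_spending")
    if counts["engagement"] < 2:
        characteristics.append("low_engagement")
    return characteristics


def generate_synthetic_data(events):
    """Step 3: emit one synthetic record per (characteristic, alternative action)."""
    labeled = label_events(events)
    characteristics = determine_characteristics(labeled)
    return [
        SyntheticExample(characteristic, action)
        for characteristic in characteristics
        for action in ALTERNATIVES.get(characteristic, [])
    ]
```

Calling `generate_synthetic_data(["purchase", "purchase", "login"])` yields three records here, two for `frequent_spending` and one for `low_engagement`. A real system would presumably replace the lookup tables with learned or curated mappings.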

Potential Applications

This technology could be applied in personalized recommendation systems, targeted advertising, and user behavior prediction in various industries such as e-commerce, social media, and online services.

Problems Solved

1. Addressing data scarcity in training machine learning models for non-homogenous predictions.
2. Improving prediction accuracy by generating synthetic training data based on user-specific information.

Benefits

1. Enhanced personalization of user experiences.
2. Improved efficiency in training machine learning models.
3. Increased accuracy in predicting user behavior.

Potential Commercial Applications

Commercial uses could include optimizing marketing strategies, enhancing user engagement, and improving customer satisfaction in industries such as retail, entertainment, and digital marketing.

Possible Prior Art

Possible prior art includes data augmentation techniques in machine learning, which generate synthetic training data to improve model performance in data-sparse environments.

Unanswered Questions

How does the system handle privacy concerns related to user-specific information?

The patent application does not provide details on how user privacy is maintained while utilizing user-specific information for generating synthetic training data.

What types of machine learning models are compatible with this method?

The patent application does not specify the types of machine learning models that can be trained using the synthetic training data generated through this method.


Original Abstract Submitted

In some embodiments, generating synthetic training datasets for training machine learning models in training data-sparse environments for non-homogenous predictions may be facilitated. In some embodiments, user-specific information associated with a user may be received. The system may generate synthetic training data representing one or more alternative actions corresponding to one or more characteristics by: labeling the user-specific information, determining (e.g., based on the labeled user-specific information) one or more characteristics of the labeled user-specific information, and determining (e.g., based on the one or more characteristics) one or more alternative actions corresponding to the one or more characteristics. The system may then train a machine learning model based on the synthetic training data to generate a prediction in response to providing an action of a first user to the machine learning model.
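
The final step of the abstract, training on synthetic data and then producing a prediction for a user's action, might look like the sketch below. It rests on loud assumptions: the abstract names no model type, so a simple frequency-count lookup (`FrequencyModel`, a hypothetical class) stands in for a real machine learning model, and the `synthetic_pairs` data is invented purely for illustration.

```python
from collections import Counter, defaultdict


class FrequencyModel:
    """Toy stand-in for a trained model (assumption, not the patent's model).

    It predicts the response most often paired with an action in training.
    """

    def __init__(self):
        self._counts = defaultdict(Counter)

    def fit(self, pairs):
        """Count how often each response follows each action."""
        for action, response in pairs:
            self._counts[action][response] += 1
        return self

    def predict(self, action, default=None):
        """Return the most frequent response for the action, or `default` if unseen."""
        seen = self._counts.get(action)
        if not seen:
            return default
        return seen.most_common(1)[0][0]


# Invented synthetic (action, alternative action) pairs for illustration only.
synthetic_pairs = [
    ("large_purchase", "offer_installment_plan"),
    ("large_purchase", "offer_installment_plan"),
    ("large_purchase", "send_receipt"),
    ("missed_payment", "send_reminder"),
]

model = FrequencyModel().fit(synthetic_pairs)
prediction = model.predict("large_purchase")  # action of a first user
```

The point of the sketch is only the shape of the workflow: synthetic records go in through `fit`, and a single user action goes in through `predict`. Any supervised model with a comparable interface could take the place of the lookup.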