18430829. SYSTEMS AND METHODS FOR DATA STREAM USING SYNTHETIC DATA GENERATION simplified abstract (Capital One Services, LLC)
SYSTEMS AND METHODS FOR DATA STREAM USING SYNTHETIC DATA GENERATION
Organization Name
Capital One Services, LLC
Inventor(s)
Anh Truong of Champaign IL (US)
Jeremy Goodsitt of Champaign IL (US)
Austin Walters of Savoy IL (US)
SYSTEMS AND METHODS FOR DATA STREAM USING SYNTHETIC DATA GENERATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 18430829, titled 'SYSTEMS AND METHODS FOR DATA STREAM USING SYNTHETIC DATA GENERATION'.
Simplified Explanation
The patent application describes systems and methods for generating synthetic data with machine learning techniques. The system receives a continuous data stream from an outside source, processes it in real time, and generates synthetic data to populate a dataset. It also creates non-overlapping bins that span the data range between the determined minimum and maximum values, then uses the bin edges to determine how many samples fall in each bin.
- Receiving a continuous data stream from an outside source and processing it in real time
- Using machine learning techniques to generate synthetic data that populates a dataset
- Creating non-overlapping bins that span the data range between the determined minimum and maximum values
- Determining the number of samples in each bin based on the bin edges
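The binning and counting steps can be illustrated with a minimal Python sketch. The abstract does not specify how bin widths are chosen or which machine learning technique generates the data, so everything here is an assumption made for illustration: the equal-width bins, the function names `make_bin_edges`, `count_samples`, and `sample_synthetic`, the sample `chunk`, and above all the histogram-based sampler, which is a simple stand-in and not the method claimed in the application.

```python
import bisect
import random


def make_bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges covering [lo, hi] (equal width is an assumption)."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins)]
    edges.append(hi)  # pin the last edge to hi to avoid floating-point drift
    return edges


def count_samples(values, edges):
    """Count values per bin. Bin i is [edges[i], edges[i+1]); the last bin also includes hi."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the data range; the abstract does not say how to treat these
        i = min(bisect.bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


def sample_synthetic(edges, counts, n, rng):
    """Stand-in generator: pick a bin in proportion to its count, then a uniform point in it."""
    bins = rng.choices(range(len(counts)), weights=counts, k=n)
    return [rng.uniform(edges[i], edges[i + 1]) for i in bins]


# Hypothetical chunk of values taken from a data stream.
chunk = [1.0, 2.5, 2.7, 4.0, 7.2, 9.9, 10.0]

edges = make_bin_edges(min(chunk), max(chunk), n_bins=3)
counts = count_samples(chunk, edges)
synthetic = sample_synthetic(edges, counts, n=5, rng=random.Random(0))

print(edges)   # [1.0, 4.0, 7.0, 10.0]
print(counts)  # [3, 1, 3]
```

Half-open bins with a closed final bin keep the bins from overlapping while still covering the maximum value. The bin counts are the only statistic the stand-in sampler uses, so it reproduces the coarse shape of the data and nothing finer.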
Potential Applications
The technology can be applied in various fields such as finance, healthcare, and marketing for generating synthetic data for training machine learning models, testing algorithms, and conducting simulations.
Problems Solved
1. Generating synthetic data efficiently and in real time
2. Creating non-overlapping bins to organize data effectively
Benefits
1. Improved data processing speed
2. Enhanced accuracy in generating synthetic data
3. Efficient organization of data in bins
Potential Commercial Applications
Optimizing marketing campaigns, improving healthcare data analysis, and enhancing financial risk assessment models.
Possible Prior Art
Possible prior art includes traditional data generation techniques, which may be less efficient than the system described in the patent application or may not operate in real time.
Unanswered Questions
How does the system handle outliers in the continuous data stream?
The patent application does not mention how outliers in the continuous data stream are handled during the synthetic data generation process. This matters because the bins span the range between the minimum and maximum values, so a single extreme value widens the range the bins must cover and can significantly reduce the accuracy of the generated synthetic data.
What is the scalability of the system for handling large datasets?
The scalability of the system for processing and generating synthetic data from large datasets is not addressed in the patent application. Understanding the system's scalability is essential for determining its practical applications in real-world scenarios with extensive data requirements.
Original Abstract Submitted
Systems and methods for synthetic data generation. A system includes at least one processor and a storage medium storing instructions that, when executed by the one or more processors, cause the at least one processor to perform operations including receiving a continuous data stream from an outside source, processing the continuous data stream in real-time, and using machine learning techniques to generating synthetic data to populate the dataset. The operations also include creating a plurality of bins, wherein the plurality of bins occupy a data range between the determined minimum and maximum values without overlapping; and determining a number of samples within each of the created bin, based on a bin edges, wherein the bin edges are bounds within the data range.