18066327. UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS simplified abstract (International Business Machines Corporation)
Contents
- 1 UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS
- 1.1 Organization Name
- 1.2 Inventor(s)
- 1.3 UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS - A simplified explanation of the abstract
- 1.4 Simplified Explanation
- 1.5 Key Features and Innovation
- 1.6 Potential Applications
- 1.7 Problems Solved
- 1.8 Benefits
- 1.9 Commercial Applications
- 1.10 Prior Art
- 1.11 Frequently Updated Research
- 1.12 Questions about Feature Engineering and Synthetic Data Generation
- 1.13 Original Abstract Submitted
UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS
Organization Name
International Business Machines Corporation
Inventor(s)
Thanh Lam Hoang of Maynooth (IE)
Lam Minh Nguyen of Ossining NY (US)
Dzung Tien Phan of Pleasantville NY (US)
UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS - A simplified explanation of the abstract
This abstract first appeared for US patent application 18066327 titled 'UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS'.
Simplified Explanation
The patent application describes a method, computer program product, and system for feature engineering and synthetic data generation using a variational auto-encoder (VAE) model.
- A processor retrieves heterogeneous data tables.
- The processor trains a VAE model on the data tables.
- An input data table is received.
- A synthetic data table is generated based on the input data table and the trained VAE model.
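The four steps above can be sketched in code. This is a minimal illustration only, not the method claimed in the application: it assumes every retrieved table has already been converted to numeric rows of the same width (the abstract does not say how heterogeneous tables are aligned), it uses a tiny linear encoder and decoder trained by per-row gradient descent, and it assumes "based on the input data table" means encoding each input row and decoding a sampled latent point. The names `train_vae` and `generate_synthetic` are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def encode(p, x):
    # Linear encoder: mean and log-variance of q(z | x).
    mu = [dot(w, x) + b for w, b in zip(p["Wmu"], p["bmu"])]
    lv = [dot(w, x) + b for w, b in zip(p["Wlv"], p["blv"])]
    return mu, lv

def decode(p, z):
    # Linear decoder: reconstruct a row from a latent point.
    return [dot(w, z) + b for w, b in zip(p["Wd"], p["bd"])]

def train_vae(rows, latent_dim=2, epochs=30, lr=0.01, seed=0):
    """Fit a tiny linear VAE with per-row SGD (illustrative assumption)."""
    rng = random.Random(seed)
    d, k = len(rows[0]), latent_dim

    def init(n_rows, n_cols):
        return [[rng.gauss(0, 0.1) for _ in range(n_cols)] for _ in range(n_rows)]

    p = {"Wmu": init(k, d), "bmu": [0.0] * k,
         "Wlv": init(k, d), "blv": [0.0] * k,
         "Wd": init(d, k), "bd": [0.0] * d}
    for _ in range(epochs):
        for x in rows:
            mu, lv = encode(p, x)
            eps = [rng.gauss(0, 1) for _ in range(k)]
            # Reparameterization: z = mu + sigma * eps
            z = [mu[j] + math.exp(0.5 * lv[j]) * eps[j] for j in range(k)]
            xhat = decode(p, z)
            # Gradients of 0.5 * ||x - xhat||^2 + KL(q(z|x) || N(0, I))
            gx = [xhat[i] - x[i] for i in range(d)]
            gz = [sum(p["Wd"][i][j] * gx[i] for i in range(d)) for j in range(k)]
            gmu = [gz[j] + mu[j] for j in range(k)]
            glv = [0.5 * gz[j] * eps[j] * math.exp(0.5 * lv[j])
                   + 0.5 * (math.exp(lv[j]) - 1.0) for j in range(k)]
            for i in range(d):
                for j in range(k):
                    p["Wd"][i][j] -= lr * gx[i] * z[j]
                p["bd"][i] -= lr * gx[i]
            for j in range(k):
                for i in range(d):
                    p["Wmu"][j][i] -= lr * gmu[j] * x[i]
                    p["Wlv"][j][i] -= lr * glv[j] * x[i]
                p["bmu"][j] -= lr * gmu[j]
                p["blv"][j] -= lr * glv[j]
    return p

def generate_synthetic(p, input_rows, seed=1):
    """Encode each input row, sample a latent point, decode a synthetic row."""
    rng = random.Random(seed)
    out = []
    for x in input_rows:
        mu, lv = encode(p, x)
        z = [m + math.exp(0.5 * v) * rng.gauss(0, 1) for m, v in zip(mu, lv)]
        out.append(decode(p, z))
    return out

# Step 1: "retrieve" two toy tables (made-up data, already numeric, 3 columns).
data_rng = random.Random(42)
table_a = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
table_b = [[v, -v, 0.5 * v] for v in (data_rng.uniform(-1, 1) for _ in range(50))]
# Step 2: train on all rows.  Steps 3-4: receive an input table, generate output.
params = train_vae(table_a + table_b)
input_table = [[0.2, -0.2, 0.1], [-0.5, 0.5, -0.25]]
synthetic = generate_synthetic(params, input_table)
```

The synthetic table has one row per input row and the same number of columns. A practical system would likely use a deeper network and a library with automatic differentiation; the hand-written gradients here only keep the sketch dependency-free.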
Key Features and Innovation
- Utilizes a VAE model for synthetic data generation.
- Handles heterogeneous data tables.
- Enhances feature engineering capabilities.
- Enables the creation of synthetic data based on existing data.
Potential Applications
This technology can be applied in various fields such as:
- Data analysis
- Machine learning
- Predictive modeling
- Anomaly detection
Problems Solved
- Simplifies feature engineering processes.
- Facilitates the generation of synthetic data for training models.
- Handles diverse data formats and content effectively.
Benefits
- Improves data analysis accuracy.
- Enhances machine learning model training.
- Saves time and resources in data preprocessing.
Commercial Applications
Title: Advanced Data Generation and Feature Engineering Technology
This technology can be utilized in industries such as:
- Finance for risk assessment models
- Healthcare for predictive analytics
- E-commerce for personalized recommendations
- Marketing for customer segmentation
Prior Art
The abstract does not cite prior art. A search of existing work on VAE models and synthetic tabular data generation would be needed to identify related technologies and advancements.
Frequently Updated Research
Stay updated on advancements in VAE models, feature engineering, and synthetic data generation techniques to leverage the latest innovations in the field.
Questions about Feature Engineering and Synthetic Data Generation
How does the VAE model improve synthetic data generation processes?
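The VAE model learns a compressed latent representation of the training data together with a decoder that maps latent points back to data rows. Sampling latent points and decoding them produces new rows that resemble the original dataset without copying it, which is what makes the model usable for synthetic data generation.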
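For reference, the standard VAE training objective is shown below. The abstract does not state which objective the application uses, so this is the textbook form, given as an assumption. Here x is a data row, z is its latent code, q_φ is the encoder, p_θ is the decoder, and p(z) is a standard normal prior.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)

% With a diagonal Gaussian encoder q_\phi(z \mid x) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2))
% and p(z) = \mathcal{N}(0, I), the KL term has the closed form:
D_{\mathrm{KL}}
  = \frac{1}{2} \sum_{j} \left( \mu_j^2 + \sigma_j^2 - 1 - \log \sigma_j^2 \right)
```

Training maximizes this quantity: the first term rewards accurate reconstruction, and the KL term keeps the latent codes close to the prior so that sampled points decode to plausible rows.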
What are the potential challenges in implementing feature engineering using heterogeneous data tables?
Integrating diverse data formats and content from multiple sources may require careful preprocessing and normalization to ensure accurate feature engineering results.
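As a rough illustration of the preprocessing and normalization mentioned in this answer, the sketch below standardizes numeric columns and one-hot encodes every other column. The function name `encode_table`, the dict-per-row layout, and the example values are assumptions made for illustration; the abstract does not describe how tables are preprocessed.

```python
import statistics

def encode_table(rows):
    """Turn a list of same-keyed dicts into (feature_names, numeric_matrix)."""
    names, encoded_cols = [], []
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            # Z-score normalization; fall back to 1.0 for constant columns.
            mean = statistics.fmean(values)
            std = statistics.pstdev(values) or 1.0
            names.append(col)
            encoded_cols.append([(v - mean) / std for v in values])
        else:
            # One-hot encoding: one 0/1 column per distinct category.
            for cat in sorted({str(v) for v in values}):
                names.append(f"{col}={cat}")
                encoded_cols.append([1.0 if str(v) == cat else 0.0 for v in values])
    matrix = [list(row) for row in zip(*encoded_cols)]
    return names, matrix

# Made-up example table mixing a numeric and a categorical column.
example_rows = [
    {"age": 30, "city": "Dublin"},
    {"age": 50, "city": "Cork"},
    {"age": 40, "city": "Dublin"},
]
names, matrix = encode_table(example_rows)
```

Each table encoded this way gets its own feature names, so tables with different schemas still produce different column sets. Aligning them into one shared input space is a separate problem that this sketch does not address.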
Original Abstract Submitted
A method, computer program product and system are provided for feature engineering and synthetic data generation. A processor retrieves a plurality of data tables, where the plurality of data tables are heterogeneous in format and content. A processor trains a variational auto-encoder (VAE) model on the plurality of data tables. A processor receives an input data table. A processor generates a synthetic data table based on the input data table and the trained VAE model.