18066327. UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS simplified abstract (International Business Machines Corporation)

From WikiPatents

UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS

Organization Name

International Business Machines Corporation

Inventor(s)

Thanh Lam Hoang of Maynooth (IE)

Gabriele Picco of Dublin (IE)

Lam Minh Nguyen of Ossining NY (US)

Dzung Tien Phan of Pleasantville NY (US)

UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18066327 titled 'UNSUPERVISED LEARNING FROM PUBLIC TABULAR DATASETS'.

Simplified Explanation

The patent application describes a method, computer program product, and system for feature engineering and synthetic data generation using a variational auto-encoder (VAE) model.

  • A processor retrieves heterogeneous data tables.
  • The processor trains a VAE model on the data tables.
  • An input data table is received.
  • A synthetic data table is generated based on the input data table and the trained VAE model.
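
The abstract does not state the training objective. As an assumption, the training step can be read against the standard VAE objective, the evidence lower bound (ELBO), which is maximized over the encoder parameters \phi and decoder parameters \theta for each training row x with latent variable z:

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```

The first term rewards accurate reconstruction of the row; the second keeps the encoder's latent distribution close to the prior p(z), which is usually a standard normal. Whether the application uses this exact form is not confirmed by the abstract.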

Key Features and Innovation

  • Utilizes a VAE model for synthetic data generation.
  • Handles heterogeneous data tables.
  • Enhances feature engineering capabilities.
  • Enables the creation of synthetic data based on existing data.

Potential Applications

This technology can be applied in various fields such as:

  • Data analysis
  • Machine learning
  • Predictive modeling
  • Anomaly detection

Problems Solved

  • Simplifies feature engineering processes.
  • Facilitates the generation of synthetic data for training models.
  • Handles diverse data formats and content effectively.

Benefits

  • Improves data analysis accuracy.
  • Enhances machine learning model training.
  • Saves time and resources in data preprocessing.

Commercial Applications

Title: Advanced Data Generation and Feature Engineering Technology

This technology can be utilized in industries such as:

  • Finance for risk assessment models
  • Healthcare for predictive analytics
  • E-commerce for personalized recommendations
  • Marketing for customer segmentation

Prior Art

No specific prior art is identified here. Relevant background includes existing work on VAE models and synthetic data generation techniques.

Frequently Updated Research

Stay updated on advancements in VAE models, feature engineering, and synthetic data generation techniques to leverage the latest innovations in the field.

Questions about Feature Engineering and Synthetic Data Generation

How does the VAE model improve synthetic data generation processes?

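A VAE encodes each input into a distribution over a low-dimensional latent space and trains a decoder to reconstruct the input from samples of that distribution. Once trained, decoding latent samples yields new data points that follow the learned structure of the original dataset, which is what makes the synthetic data resemble real data.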
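
Two pieces of a typical VAE can be sketched without any deep learning library: drawing a latent sample with the reparameterization trick, and measuring how far the encoder's Gaussian is from a standard normal prior. The function names below are hypothetical, the diagonal-Gaussian encoder is an assumption, and nothing here is taken from the patent application itself.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension.

    Assumes the encoder outputs a mean and a log-variance per dimension.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Illustrative encoder outputs for a single row (made-up numbers).
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = reparameterize(mu, log_var, rng)      # latent sample passed to a decoder
kl = kl_to_standard_normal(mu, log_var)   # regularization term of the loss
print("latent sample:", z)
print("KL term:", kl)
```

Sampling through `mu` and `log_var` rather than directly from the distribution keeps the sample differentiable with respect to the encoder outputs, which is why this formulation is common in VAE training.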

What are the potential challenges in implementing feature engineering using heterogeneous data tables?

Integrating diverse data formats and content from multiple sources may require careful preprocessing and normalization to ensure accurate feature engineering results.
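
As a minimal sketch of the preprocessing mentioned above, the following standard-library Python encodes tables with different schemas into numeric rows: numeric columns are z-scored and all other columns are one-hot encoded. The function `encode_table`, the sample tables, and the choice to impute missing numeric cells with the column mean are illustrative assumptions; the abstract does not describe the preprocessing actually used.

```python
import math


def is_number(value):
    """Return True if the value parses as a finite float."""
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        return False


def encode_table(rows):
    """Encode a list of dict rows into (feature_names, numeric_rows).

    Numeric columns are z-scored; every other column is one-hot encoded.
    """
    columns = sorted({name for row in rows for name in row})
    names, encoded = [], [[] for _ in rows]
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        if present and all(is_number(v) for v in present):
            nums = [float(v) for v in present]
            mean = sum(nums) / len(nums)
            # Fall back to 1.0 so constant columns do not divide by zero.
            std = math.sqrt(sum((x - mean) ** 2 for x in nums) / len(nums)) or 1.0
            names.append(col)
            for out, v in zip(encoded, values):
                # Missing numeric cells take the column mean, i.e. 0 after scaling.
                out.append(0.0 if v is None else (float(v) - mean) / std)
        else:
            for cat in sorted({str(v) for v in present}):
                names.append(f"{col}={cat}")
                for out, v in zip(encoded, values):
                    out.append(1.0 if v is not None and str(v) == cat else 0.0)
    return names, encoded


# Two made-up tables with different columns, types, and missing values.
loans = [{"amount": "1200", "status": "paid"}, {"amount": 800, "status": "late"}]
patients = [{"age": 34, "smoker": "no", "bmi": None},
            {"age": 51, "smoker": "yes", "bmi": 27.5}]

for table in (loans, patients):
    names, X = encode_table(table)
    print(names, X)
```

Each table is encoded independently here, so the resulting rows have different widths. Training a single model across such tables would need a shared representation on top of this step, and the abstract does not say how that is achieved.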


Original Abstract Submitted

A method, computer program product and system are provided for feature engineering and synthetic data generation. A processor retrieves a plurality of data tables, where the plurality of data tables are heterogeneous in format and content. A processor trains a variational auto-encoder (VAE) model on the plurality of data tables. A processor receives an input data table. A processor generates a synthetic data table based on the input data table and the trained VAE model.