US Patent Application 17739716. SYSTEM AND METHOD FOR EFFICIENT TRANSFORMATION PREDICTION IN A DATA ANALYTICS PREDICTION MODEL PIPELINE simplified abstract
SYSTEM AND METHOD FOR EFFICIENT TRANSFORMATION PREDICTION IN A DATA ANALYTICS PREDICTION MODEL PIPELINE
Organization Name
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor(s)
SYSTEM AND METHOD FOR EFFICIENT TRANSFORMATION PREDICTION IN A DATA ANALYTICS PREDICTION MODEL PIPELINE - A simplified explanation of the abstract
This abstract first appeared for US patent application 17739716 titled 'SYSTEM AND METHOD FOR EFFICIENT TRANSFORMATION PREDICTION IN A DATA ANALYTICS PREDICTION MODEL PIPELINE'.
Simplified Explanation
The patent application describes a computer system or program that improves the selection of transformations in an ensemble machine learning model.
- The system provides all the base machine learning models in the ensemble model.
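- It identifies every derived field used in these models and performs a run prediction analysis on each one.
- For each derived field, it computes two importance weights: the Derived Field Importance Weight for Field (DFIW4F) and the Derived Field Importance Weight for Model (DFIW4M).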
- The system clusters the derived fields based on their importance weights.
- It sorts the clusters to find the best one based on the importance weights.
- Finally, it runs the base machine learning models using the derived fields in the best cluster, stopping once enough base models have been run.
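
The steps above can be illustrated with a minimal Python sketch. The abstract does not say how the two importance weights are computed, which clustering method is used, or what counts as "sufficient" models, so everything specific here is an assumption made for illustration: the weights are taken as precomputed values between 0 and 1, clustering is a simple grid binning of the two weights, clusters are ranked by the mean of the summed weights, and "running" a model is reduced to selecting its name. The names DerivedField, cluster_fields, cluster_score, select_models, and the sample field and model names are hypothetical and do not come from the patent application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivedField:
    name: str
    model: str      # base model that uses this derived field
    dfiw4f: float   # importance weight for field (assumed precomputed, 0..1)
    dfiw4m: float   # importance weight for model (assumed precomputed, 0..1)


def cluster_fields(fields, bins=2):
    """Group fields by the grid cell their (DFIW4F, DFIW4M) pair falls in.

    Grid binning is a stand-in; the abstract names no clustering method.
    """
    clusters = defaultdict(list)
    for f in fields:
        cell = (min(int(f.dfiw4f * bins), bins - 1),
                min(int(f.dfiw4m * bins), bins - 1))
        clusters[cell].append(f)
    return list(clusters.values())


def cluster_score(cluster):
    """Mean of (DFIW4F + DFIW4M) over the cluster; a hypothetical ranking rule."""
    return sum(f.dfiw4f + f.dfiw4m for f in cluster) / len(cluster)


def select_models(fields, enough=2, bins=2):
    """Walk clusters best-first and collect models until `enough` are chosen."""
    ranked = sorted(cluster_fields(fields, bins), key=cluster_score, reverse=True)
    chosen = []
    for cluster in ranked:
        by_weight = sorted(cluster, key=lambda f: f.dfiw4f + f.dfiw4m, reverse=True)
        for f in by_weight:
            if f.model not in chosen:
                chosen.append(f.model)  # a real pipeline would run the model here
            if len(chosen) >= enough:
                return chosen
    return chosen


# Made-up sample data, for illustration only.
fields = [
    DerivedField("log_income", "gbm", 0.9, 0.8),
    DerivedField("age_bucket", "logreg", 0.7, 0.9),
    DerivedField("zip_hash", "knn", 0.2, 0.1),
    DerivedField("day_of_week", "gbm", 0.3, 0.6),
]
print(select_models(fields))  # ['gbm', 'logreg']
```

With this sample data, the two highest-weighted derived fields fall into the same cluster, so both of their models are selected before any lower-ranked cluster is visited. Raising the `enough` parameter makes the loop continue into the next-best clusters.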
Original Abstract Submitted
A computer-implemented system, platform, programing product, and/or method for improving transformation selection in an ensemble machine learning (ML) model that includes: providing all base ML models of the ensemble ML model; identifying all of a plurality of Derived Fields in all the base ML models; performing a Derived Field run prediction analysis for all the Derived Fields; computing the Derived Field Importance Weight for Field (DFIW4F) and the Derived Field Importance Weight for Model (DFIW4M) for all the Derived Fields; clustering all the Derived Fields into a plurality of Derived Field clusters, wherein each Derived Field cluster is based upon the DFIW4M and the DFIW4F for the Derived Field; sorting all the Derived Field clusters by best cluster based upon DFIW4M and DFIW4F; and running the base ML models based upon the Derived Fields in the best Derived Field cluster until sufficient base ML models have been run.