Oracle International Corporation (20240126756). ONE-HOT ENCODER USING LAZY EVALUATION OF RELATIONAL STATEMENTS simplified abstract


ONE-HOT ENCODER USING LAZY EVALUATION OF RELATIONAL STATEMENTS

Organization Name

Oracle International Corporation

Inventor(s)

Felix Schmidt of Baden-Dattwil (CH)

Matteo Casserini of Zurich (CH)

Milos Vasic of Zurich (CH)

Marija Nikolic of Zurich (CH)

ONE-HOT ENCODER USING LAZY EVALUATION OF RELATIONAL STATEMENTS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240126756, titled 'ONE-HOT ENCODER USING LAZY EVALUATION OF RELATIONAL STATEMENTS'.

Simplified Explanation

The abstract describes a method and non-transitory storage media for training and implementing a one-hot encoder. During the training phase, an encoder state is computed by executing a set of relational statements that extract the unique categories from a first training data set, associate each category with a unique index, and generate a one-hot encoding for each category. The relational statements are executed by a query optimization engine. Execution is lazy: each statement is postponed until its result is needed, and the engine applies one or more optimizations when it executes them. In the encoding phase, categorical features in a second training data set are encoded based on the encoder state to form encoded categorical features.

  • During training, unique categories are extracted from a data set and associated with unique indices.
  • One-hot encodings are generated for each unique category.
  • The relational statements are executed by a query optimization engine; execution is postponed until each result is needed, and the engine applies one or more optimizations.
  • Encoded categorical features are generated based on the encoder state during the encoding phase.
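
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: SQLite (through Python's standard sqlite3 module) stands in for the query optimization engine, SQL views stand in for the lazily evaluated relational statements, and the function names define_encoder_state, fit, and encode are hypothetical. The one-hot vectors are built in Python for brevity, and mapping unseen categories to an all-zero vector is an assumption that the abstract does not address.

```python
import sqlite3


def define_encoder_state(conn, table, column):
    """Declare the relational statements as views. SQLite only stores the
    definitions; no query runs until a view is actually read."""
    # Statement 1: extract the unique categories.
    conn.execute(
        f'CREATE TEMP VIEW categories AS '
        f'SELECT DISTINCT "{column}" AS category FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL'
    )
    # Statement 2: associate each category with a unique index
    # (its rank in sorted order, starting at 0).
    conn.execute(
        "CREATE TEMP VIEW encoder_state AS "
        "SELECT c.category, "
        "(SELECT COUNT(*) FROM categories o WHERE o.category < c.category) AS idx "
        "FROM categories c"
    )


def fit(conn):
    """The result is needed now, so the engine plans and runs both
    statements together as a single query."""
    index_of = dict(conn.execute("SELECT category, idx FROM encoder_state"))
    width = len(index_of)
    # One-hot encoding per category (built in Python in this sketch).
    return {
        cat: [1 if i == idx else 0 for i in range(width)]
        for cat, idx in index_of.items()
    }


def encode(state, values):
    """Encoding phase: look up each value in the encoder state.
    Assumption: categories not seen during training become all zeros."""
    width = len(state)
    return [state.get(v, [0] * width) for v in values]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (color TEXT)")
define_encoder_state(conn, "train", "color")  # nothing is executed yet
conn.executemany(
    "INSERT INTO train VALUES (?)",
    [("red",), ("green",), ("red",), ("blue",)],
)
state = fit(conn)  # evaluation happens here and sees the inserted rows
encoded = encode(state, ["red", "blue", "purple"])
print(encoded)  # [[0, 0, 1], [1, 0, 0], [0, 0, 0]]
```

Because the views are declared before any rows are inserted, the example also shows the deferral: the encoder state reflects the data present when the result is finally requested, not when the statements were defined.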

Potential Applications

The technology can be applied in various fields such as machine learning, data analysis, and natural language processing.

Problems Solved

This technology solves the problem of efficiently encoding categorical features for machine learning models.

Benefits

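The main benefit is more efficient encoding of categorical features: because execution of the relational statements is deferred, the query optimization engine can apply optimizations when it runs them, which may shorten data preparation for machine learning tasks.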

Potential Commercial Applications

The technology can be commercially applied in industries utilizing machine learning models, such as e-commerce, finance, and healthcare.

Possible Prior Art

Prior art may include existing methods for encoding categorical features in machine learning models, such as label encoding and one-hot encoding techniques.

Unanswered Questions

How does the query optimization engine optimize the execution of relational statements?

The abstract mentions that the query optimization engine implements optimizations during the execution of relational statements. It would be interesting to know the specific optimizations used and how they contribute to the efficiency of the encoding process.

What are the specific use cases where this technology can outperform traditional encoding methods?

While the abstract highlights the general process of training and implementing a one-hot encoder, it would be valuable to understand the specific scenarios or datasets where this technology can provide significant advantages over traditional encoding methods.


Original Abstract Submitted

A method and one or more non-transitory storage media are provided to train and implement a one-hot encoder. During a training phase, computation of an encoder state is performed by executing a set of relational statements to extract unique categories in a first training data set, associate each unique category with a unique index, and generate a one-hot encoding for each unique category. The set of relational statements are executed by a query optimization engine. Execution of the set of relational statements is postponed until a result of each relational statement is needed, and the query optimization engine implements one or more optimizations when executing the set of relational statements. During an encoding phase, a set of categorical features in a second training data set are encoded based on the encoder state to form a set of encoded categorical features.