18528880. DIVERSITY-AWARE WEIGHTED MAJORITY VOTE CLASSIFIER FOR IMBALANCED DATASETS simplified abstract (NEC Corporation)


DIVERSITY-AWARE WEIGHTED MAJORITY VOTE CLASSIFIER FOR IMBALANCED DATASETS

Organization Name

NEC Corporation

Inventor(s)

Anil Goyal of Heidelberg (DE)

Jihed Khiari of Heidelberg (DE)

DIVERSITY-AWARE WEIGHTED MAJORITY VOTE CLASSIFIER FOR IMBALANCED DATASETS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18528880 titled 'DIVERSITY-AWARE WEIGHTED MAJORITY VOTE CLASSIFIER FOR IMBALANCED DATASETS'.

Simplified Explanation

The abstract describes an ensemble learning method for binary classification on imbalanced datasets. The method comprises the following steps (a minimal sketch appears after the list):

  • Generatively oversampling the imbalanced dataset to create a generated dataset.
  • Generating subsamples from the generated dataset and training base classifiers on each subsample.
  • Learning a weighted majority vote classifier by combining outputs of the base classifiers.
  • Assigning weights to the base classifiers so that their diversity on the positive samples is minimized.
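
The following Python sketch illustrates one way these steps could fit together. The use of SMOTE (from the imbalanced-learn library) as a stand-in for the generative oversampler, decision trees as base classifiers, and the simple inverse-disagreement weighting are illustrative assumptions, not details taken from the patent application.

```python
# A minimal sketch of the pipeline described above, assuming:
#   - SMOTE (imbalanced-learn) as a stand-in for the generative oversampler,
#   - decision trees as base classifiers,
#   - inverse-disagreement weighting on the positive class (label 1).
# These choices are illustrative, not taken from the patent application.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.tree import DecisionTreeClassifier

def train_ensemble(X, y, n_estimators=10, subsample_frac=0.8, seed=0):
    rng = np.random.default_rng(seed)

    # Step 1: generatively oversample the minority class.
    X_gen, y_gen = SMOTE(random_state=seed).fit_resample(X, y)

    # Step 2: draw subsamples and learn one base classifier per subsample.
    n = len(y_gen)
    classifiers = []
    for _ in range(n_estimators):
        idx = rng.choice(n, size=int(subsample_frac * n), replace=False)
        clf = DecisionTreeClassifier(max_depth=4, random_state=seed)
        classifiers.append(clf.fit(X_gen[idx], y_gen[idx]))

    # Step 3: weight each classifier by how little it disagrees with the
    # others on the positive samples (low diversity -> high weight).
    X_pos = X_gen[y_gen == 1]
    preds = np.array([clf.predict(X_pos) for clf in classifiers])
    disagreement = np.array([
        np.mean([np.mean(preds[i] != preds[j])
                 for j in range(n_estimators) if j != i])
        for i in range(n_estimators)
    ])
    weights = 1.0 / (disagreement + 1e-8)
    return classifiers, weights / weights.sum()

def predict(classifiers, weights, X):
    # Weighted majority vote over {0, 1} predictions.
    votes = np.array([clf.predict(X) for clf in classifiers])
    return (np.average(votes, axis=0, weights=weights) >= 0.5).astype(int)
```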

Potential Applications

This technology can be applied in various fields such as fraud detection, medical diagnosis, and anomaly detection where imbalanced datasets are common.

Problems Solved

1. Addressing the issue of imbalanced datasets in binary classification tasks.
2. Improving the performance of classifiers on minority class samples.

Benefits

1. Enhanced accuracy in predicting minority class samples.
2. Increased robustness and generalization of the classifier.
3. Efficient utilization of imbalanced data for training.

Potential Commercial Applications

Marketing campaign optimization, credit risk assessment, and customer churn prediction are potential commercial applications of this technology.

Possible Prior Art

One possible example of prior art is the SMOTE (Synthetic Minority Over-sampling Technique) algorithm, which also generates synthetic samples for the minority class in imbalanced datasets.
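
For reference, a minimal usage of SMOTE via the imbalanced-learn library looks like the following; the toy dataset and parameters are purely illustrative.

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Toy imbalanced dataset: roughly 95% negatives, 5% positives.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print(Counter(y))

# SMOTE synthesizes new minority-class samples by interpolating between
# existing minority samples and their nearest neighbors.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))  # classes are balanced after oversampling
```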

Unanswered Questions

How does this method compare to other ensemble learning techniques in terms of performance and computational efficiency?

This article does not provide a direct comparison with other ensemble learning methods, leaving the reader wondering about the relative advantages of this specific approach.

What are the specific parameters or criteria used to assign weights to the base classifiers in the weighted majority vote classifier?

The abstract does not delve into the details of how the weights are determined for each base classifier, leaving a gap in understanding the inner workings of the algorithm.
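
One plausible, purely hypothetical reading of "weights that minimize diversity on the positive samples" is a constrained optimization over the pairwise disagreement of the base classifiers, sketched below; the abstract itself does not confirm this formulation.

```python
import numpy as np
from scipy.optimize import minimize

def fit_vote_weights(classifiers, X_pos):
    # Pairwise disagreement matrix D[i, j] of the base classifiers on the
    # positive samples X_pos (hypothetical formulation, not from the patent).
    m = len(classifiers)
    preds = np.array([clf.predict(X_pos) for clf in classifiers])
    D = np.array([[np.mean(preds[i] != preds[j]) for j in range(m)]
                  for i in range(m)])

    # Minimize the weighted diversity w^T D w over the probability simplex.
    objective = lambda w: w @ D @ w
    constraints = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]
    bounds = [(0.0, 1.0)] * m
    result = minimize(objective, np.full(m, 1.0 / m),
                      bounds=bounds, constraints=constraints)
    return result.x
```

Note that, on its own, this objective is trivially minimized by placing all weight on a single classifier, so the actual weighting scheme presumably balances diversity against other terms such as individual classifier accuracy.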


Original Abstract Submitted

An ensemble learning based method is for a binary classification on an imbalanced dataset. The imbalanced dataset has a minority class comprising positive samples and a majority class comprising negative samples. The method includes: generatively oversampling the imbalanced dataset by synthetically generating minority class examples, thereby generating a generated dataset; using the generated dataset to generate subsamples, and learning a base classifier on each of the subsamples to determine a plurality of base classifiers; and learning a weighted majority vote classifier by combining outputs of the base classifiers. Each of the base classifiers is assigned a weight in such a way that a diversity between the base classifiers on the positive samples is minimized.