MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF

Organization Name

Capital One Services, LLC

Inventor(s)

Vijay Sahebgouda Bantanur of Gaithersburg MD (US)

MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240311354 titled 'MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF'.

Simplified Explanation

The patent application describes a system that automatically detects duplicate data entries by analyzing data entries associated with a user, considering values, times, entity identifiers, and locations.

The system identifies pairs of similar data entries by matching entity identifiers and locations.
It determines candidate duplicate data entries based on proximity in time between similar data entries.
For each candidate duplicate data entry, a feature vector is generated and submitted to a classification model to determine duplicates automatically.
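The first two steps above can be sketched as follows. This is only an illustration under stated assumptions: the `DataEntry` class, the `candidate_duplicates` function, the sample entries, and the 10-minute window are all hypothetical, since the abstract does not say how "proximity in time" is measured or what threshold is used.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import combinations


@dataclass(frozen=True)
class DataEntry:
    # The four fields named in the abstract; the types are assumptions.
    value: float
    time: datetime
    entity_id: str
    location: str


def candidate_duplicates(entries, window=timedelta(minutes=10)):
    """Return pairs that share entity and location and fall within `window`."""
    # Step 1: group entries whose entity identifier and location both match.
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry.entity_id, entry.location)].append(entry)

    # Step 2: keep only the pairs that are close together in time.
    candidates = []
    for group in groups.values():
        group.sort(key=lambda e: e.time)
        for first, second in combinations(group, 2):
            # The group is sorted, so `second` is never earlier than `first`.
            if second.time - first.time <= window:
                candidates.append((first, second))
    return candidates


# Hypothetical sample data, invented for this sketch.
entries = [
    DataEntry(12.50, datetime(2024, 1, 5, 9, 0), "cafe-42", "store-1"),
    DataEntry(12.50, datetime(2024, 1, 5, 9, 2), "cafe-42", "store-1"),
    DataEntry(12.50, datetime(2024, 1, 6, 9, 0), "cafe-42", "store-1"),
    DataEntry(30.00, datetime(2024, 1, 5, 9, 1), "shop-7", "store-9"),
]
pairs = candidate_duplicates(entries)
```

With this sample data, only the two "cafe-42" entries made two minutes apart form a candidate pair. The entry from the following day matches on entity and location but falls outside the assumed window.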

Key Features and Innovation

Automatic detection of duplicate data entries based on values, times, entity identifiers, and locations.
Utilization of a classification model trained on historical dispute entries to identify duplicates.
Generation of feature vectors for candidate duplicate data entries to aid in the classification process.
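A feature vector for one candidate pair might look like the sketch below. The abstract states only that the vector includes the entity identifier, location, value, and time. The numeric encoding shown here (CRC32 hash buckets for the strings, a value difference, a time gap in seconds) and the names `hash_bucket` and `feature_vector` are assumptions made for illustration.

```python
import zlib
from datetime import datetime


def hash_bucket(text, buckets=1000):
    """Map a string to a stable integer bucket (an assumed encoding choice)."""
    return zlib.crc32(text.encode("utf-8")) % buckets


def feature_vector(first, second):
    """Build a numeric feature vector for one candidate pair of entries."""
    gap_seconds = abs((second["time"] - first["time"]).total_seconds())
    return [
        float(hash_bucket(first["entity_id"])),  # entity identifier
        float(hash_bucket(first["location"])),   # location
        first["value"],                          # value of the first entry
        abs(first["value"] - second["value"]),   # difference between the values
        gap_seconds,                             # time gap in seconds
        float(first["time"].hour),               # hour of day of the first entry
    ]


# Hypothetical candidate pair, invented for this sketch.
first = {"value": 12.5, "time": datetime(2024, 1, 5, 9, 0),
         "entity_id": "cafe-42", "location": "store-1"}
second = {"value": 12.5, "time": datetime(2024, 1, 5, 9, 2),
          "entity_id": "cafe-42", "location": "store-1"}
vector = feature_vector(first, second)
```

A hashed bucket keeps the sketch dependency-free; a production system would more likely use a learned or one-hot encoding for identifiers and locations.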

Potential Applications

This technology can be applied in various fields such as data management, fraud detection, and quality control processes.

Problems Solved

Reduces manual effort in identifying duplicate data entries.
Improves data accuracy and integrity.
Enhances efficiency in data processing tasks.

Benefits

Saves time and resources in data management.
Minimizes errors in data entry and analysis.
Streamlines decision-making processes based on accurate data.

Commercial Applications

Data management software tools for businesses.
Fraud detection systems for financial institutions.
Quality control processes in manufacturing industries.

Questions about the Technology

How does the system determine candidate duplicate data entries?

The system determines candidate duplicate data entries based on the proximity in time between similar data entries.

What is the significance of using a classification model trained on historical dispute entries?

Historical dispute entries give the classification model examples of past entries to learn from, so it can decide automatically which candidate entries are duplicates instead of relying on manual review.
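To make the idea of training on historical dispute entries concrete, here is a minimal sketch that fits a logistic-regression classifier with plain gradient descent. The abstract does not name a model type, so logistic regression is a swapped-in stand-in. The `HISTORY` rows, the two features, and their scaling are all invented for illustration.

```python
import math

# Hypothetical historical dispute entries. Each row holds
# (value difference / 10, time gap in minutes / 10) for a pair of entries,
# labeled 1 if the dispute confirmed a duplicate and 0 otherwise.
HISTORY = [
    ((0.0, 0.1), 1),
    ((0.0, 0.2), 1),
    ((0.0, 0.3), 1),
    ((0.5, 0.2), 0),
    ((1.2, 0.7), 0),
    ((0.3, 0.9), 0),
]


def predict_proba(weights, bias, features):
    """Return the estimated probability that a pair is a duplicate."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def train(history, learning_rate=1.0, epochs=5000):
    """Fit logistic-regression weights with batch gradient descent."""
    n_features = len(history[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in history:
            error = predict_proba(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        # Step along the negative gradient of the mean log loss.
        weights = [w - learning_rate * g / len(history)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(history)
    return weights, bias


def is_duplicate(weights, bias, features, threshold=0.5):
    """Classify a candidate pair using an assumed 0.5 probability threshold."""
    return predict_proba(weights, bias, features) >= threshold


weights, bias = train(HISTORY)
predictions = [is_duplicate(weights, bias, f) for f, _ in HISTORY]
```

In this toy data set, pairs with identical values and short time gaps are the confirmed duplicates, so the fitted model learns negative weights on both features. Real dispute histories would be far larger and noisier, and would call for a held-out evaluation set.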

By combining entity, location, and time matching with a classification model trained on historical dispute entries, this technology gives businesses a more efficient and accurate way to find duplicate data entries.

Original Abstract Submitted

Systems and methods of the present disclosure enable a processor to automatically detect duplicate data entries by receiving data entries associated with a user, where each data entry includes a value, a time, an entity identifier, and a location. Pairs of similar data entries are determined by matching the entity identifier and the location pairs data entries. Candidate duplicate data entries are determined based on a proximity in time between data entries of the similar data entries. For each candidate duplicate data entry, a feature vector is generated including the entity identifier, location, value and time, and each feature vector is submitted to a duplicate classification model to automatically determine duplicate data entries from the candidate duplicate data entries, the duplicate classification model being trained according to a historical dispute entries.

Capital One Services, LLC (20240311354). MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF simplified abstract
