Capital One Services, LLC (20240311354). MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF simplified abstract
MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF
Organization Name
Capital One Services, LLC
Inventor(s)
Srinivasarao Daruna of Ashburn VA (US)
Vijay Sahebgouda Bantanur of Gaithersburg MD (US)
Marisa Lee of Washington DC (US)
MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240311354 titled 'MACHINE-LEARNING BASED DATA ENTRY DUPLICATION DETECTION AND MITIGATION AND METHODS THEREOF'.
Simplified Explanation
The patent application describes a system that automatically detects duplicate data entries among the entries associated with a user. Each entry includes a value, a time, an entity identifier, and a location.
- The system identifies pairs of similar data entries by matching entity identifiers and locations.
- It determines candidate duplicate data entries based on proximity in time between similar data entries.
- For each candidate duplicate data entry, a feature vector is generated and submitted to a classification model to determine duplicates automatically.
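The first two steps above (matching on entity identifier and location, then filtering by proximity in time) can be sketched roughly as follows. This is a minimal illustration, not the applicants' implementation: the `DataEntry` class, the `candidate_duplicates` function, the 10-minute window, and the sample entries are all assumptions made for the example. The abstract does not state a time threshold.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import combinations


@dataclass(frozen=True)
class DataEntry:
    # Field names follow the abstract; the class itself is hypothetical.
    value: float
    time: datetime
    entity_id: str
    location: str


def candidate_duplicates(entries, window=timedelta(minutes=10)):
    """Return pairs that share entity and location and lie within `window`.

    The 10-minute default is an assumed threshold for illustration only.
    """
    # Step 1: group entries whose entity identifier and location match.
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry.entity_id, entry.location)].append(entry)

    # Step 2: within each group, keep pairs that are close in time.
    candidates = []
    for group in groups.values():
        group.sort(key=lambda e: e.time)
        for earlier, later in combinations(group, 2):
            if later.time - earlier.time <= window:
                candidates.append((earlier, later))
    return candidates


# Made-up sample entries for one user.
entries = [
    DataEntry(12.50, datetime(2024, 1, 5, 9, 0), "cafe-01", "loc-A"),
    DataEntry(12.50, datetime(2024, 1, 5, 9, 2), "cafe-01", "loc-A"),
    DataEntry(12.50, datetime(2024, 1, 6, 9, 0), "cafe-01", "loc-A"),
    DataEntry(40.00, datetime(2024, 1, 5, 9, 1), "fuel-07", "loc-A"),
]
pairs = candidate_duplicates(entries)
```

In this sketch only the two `cafe-01` entries two minutes apart become a candidate pair. The entry a day later shares the entity and location but falls outside the window, and the `fuel-07` entry has no matching partner.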
Key Features and Innovation
- Automatic detection of duplicate data entries based on values, times, entity identifiers, and locations.
- Utilization of a classification model trained on historical dispute entries to identify duplicates.
- Generation of feature vectors for candidate duplicate data entries to aid in the classification process.
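One possible way to turn an entry into a numeric feature vector is sketched below. The abstract says only that the vector includes the entity identifier, location, value, and time. The encoding shown here (CRC32 bucketing for the two identifiers, a UTC epoch timestamp for the time) and the names `stable_bucket` and `feature_vector` are assumptions chosen for illustration, and a real system would likely use a richer encoding.

```python
import zlib
from datetime import datetime, timezone


def stable_bucket(text, buckets=1000):
    """Map a string to a repeatable integer bucket.

    CRC32 is used instead of Python's built-in hash(), which is salted
    per process for strings and so is not stable across runs.
    """
    return zlib.crc32(text.encode("utf-8")) % buckets


def feature_vector(entity_id, location, value, when):
    """Build [entity bucket, location bucket, value, epoch seconds].

    `when` is a naive datetime that this sketch treats as UTC.
    """
    return [
        float(stable_bucket(entity_id)),
        float(stable_bucket(location)),
        float(value),
        when.replace(tzinfo=timezone.utc).timestamp(),
    ]


vec = feature_vector("cafe-01", "loc-A", 12.50, datetime(2024, 1, 5, 9, 0))
```

Bucketing identifiers into integers is a crude choice, since distinct entities can collide and the bucket number carries no ordering meaning. It is used here only to keep the example dependency-free.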
Potential Applications
This technology can be applied in various fields such as data management, fraud detection, and quality control processes.
Problems Solved
- Reduces manual effort in identifying duplicate data entries.
- Improves data accuracy and integrity.
- Enhances efficiency in data processing tasks.
Benefits
- Saves time and resources in data management.
- Minimizes errors in data entry and analysis.
- Streamlines decision-making processes based on accurate data.
Commercial Applications
- Data management software tools for businesses.
- Fraud detection systems for financial institutions.
- Quality control processes in manufacturing industries.
Questions about the Technology
How does the system determine candidate duplicate data entries?
The system determines candidate duplicate data entries based on the proximity in time between similar data entries.
What is the significance of using a classification model trained on historical dispute entries?
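According to the abstract, the duplicate classification model is trained on historical dispute entries. Presumably these past disputes serve as labeled examples, so the model can decide which candidate entries are true duplicates without hand-written rules.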
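To make the idea of training on historical dispute entries concrete, here is a small sketch using logistic regression written in plain Python. The abstract does not name a model type, so logistic regression is a stand-in. The two pair-level features (absolute value difference and time gap in minutes) are a simplification of the feature vector the abstract describes, and the training rows are invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.01, epochs=2000):
    """Fit weights and bias with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return True when the model scores the pair as a duplicate."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= 0.5


# Hypothetical history: [abs value difference, time gap in minutes].
# Label 1 means a past dispute confirmed the pair as a duplicate.
history = [
    [0.0, 1.0], [0.0, 2.0], [0.0, 0.5],   # confirmed duplicates
    [5.0, 8.0], [3.0, 9.0], [0.0, 9.5],   # not duplicates
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(history, labels)
is_dup = predict(weights, bias, [0.0, 1.5])
is_not_dup = predict(weights, bias, [4.0, 9.0])
```

On this toy data the model learns that identical values a minute or two apart look like duplicates, while larger gaps and differing values do not. A production system would need far more data and proper evaluation.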
By combining candidate selection (matching entity identifiers and locations, then checking proximity in time) with a classification model trained on historical dispute entries, the approach aims to identify duplicate data entries automatically rather than through manual review.
Original Abstract Submitted
Systems and methods of the present disclosure enable a processor to automatically detect duplicate data entries by receiving data entries associated with a user, where each data entry includes a value, a time, an entity identifier, and a location. Pairs of similar data entries are determined by matching the entity identifier and the location pairs data entries. Candidate duplicate data entries are determined based on a proximity in time between data entries of the similar data entries. For each candidate duplicate data entry, a feature vector is generated including the entity identifier, location, value and time, and each feature vector is submitted to a duplicate classification model to automatically determine duplicate data entries from the candidate duplicate data entries, the duplicate classification model being trained according to a historical dispute entries.