18157898. MAINTAINING A DATASET BASED ON PERIODIC CLEANSING OF RAW SOURCE DATA simplified abstract (Capital One Services, LLC)

From WikiPatents

MAINTAINING A DATASET BASED ON PERIODIC CLEANSING OF RAW SOURCE DATA

Organization Name

Capital One Services, LLC

Inventor(s)

Brice Elder of Mishawaka IN (US)

Aditya Pai of San Francisco CA (US)

Julie Murakami of New York NY (US)

MAINTAINING A DATASET BASED ON PERIODIC CLEANSING OF RAW SOURCE DATA - A simplified explanation of the abstract

This abstract first appeared for US patent application 18157898, titled 'MAINTAINING A DATASET BASED ON PERIODIC CLEANSING OF RAW SOURCE DATA'.

Simplified Explanation

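The abstract describes a data cleaning platform that determines an entity key for each data record in a cleansed dataset, based on the combination of fields that uniquely identifies the entity associated with that record. It also generates a delta dataset from uncleansed records for transactions that occurred after the cleansed dataset was first generated, and performs a data join to update the cleansed dataset with those records.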

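  • The data cleaning platform determines an entity key for each data record in a cleansed dataset, using a combination of fields that uniquely identifies the entity associated with the record.
  • It generates a delta dataset from uncleansed data records for transactions that occurred after the cleansed dataset was first generated; each of these records may be given an entity key from the same combination of fields.
  • The platform performs a data join to update the cleansed dataset so that it includes the records for those later transactions.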
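The first point, deriving an identifier from a combination of fields, can be sketched as follows. This is a minimal illustration only: the abstract does not name the fields or say how the key is encoded, so the field names (`name`, `postal_code`), the normalization, and the use of a SHA-256 hash are all assumptions made for the example.

```python
import hashlib

# Hypothetical identifying fields; the abstract does not specify which fields are used.
KEY_FIELDS = ("name", "postal_code")


def entity_key(record, key_fields=KEY_FIELDS):
    """Derive a stable key from the fields that identify the record's entity."""
    # Lowercase and collapse whitespace so trivially different spellings match.
    parts = [" ".join(str(record.get(f, "")).lower().split()) for f in key_fields]
    # Hashing the joined fields is one possible encoding (an assumption here).
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Two transactions for the same entity, and one for a different entity.
a = {"name": "Acme  Coffee", "postal_code": "46544", "amount": 4.50}
b = {"name": "acme coffee", "postal_code": "46544", "amount": 12.00}
c = {"name": "Acme Coffee", "postal_code": "10001", "amount": 4.50}

same_entity = entity_key(a) == entity_key(b)
different_entity = entity_key(a) != entity_key(c)
```

Because the key depends only on the identifying fields, records `a` and `b` share a key even though their amounts differ, while `c` gets a different key because its postal code differs.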

Potential Applications

  • Data cleaning and organization in various industries such as finance, healthcare, and retail.
  • Streamlining data processing and analysis tasks.
  • Improving data accuracy and reliability.

Problems Solved

  • Eliminates the need for manual identification and matching of data records.
  • Ensures that new transaction data is properly integrated into the existing dataset.
  • Reduces errors and inconsistencies in data cleaning and organization processes.

Benefits

  • Saves time and effort by automating the data cleaning and updating process.
  • Increases the accuracy and reliability of datasets by tying both cleansed and newly arrived records to a consistent entity key.
  • Enables efficient analysis and decision-making based on up-to-date and comprehensive data.


Original Abstract Submitted

In some implementations, a data cleaning platform may determine a respective entity key for each data record in a cleansed dataset based on a combination of fields, in each data record, that contain information that uniquely identifies an entity associated with a respective data record. The data cleaning platform may generate a delta dataset based on a set of uncleansed data records related to transactions that occurred after a time when the cleansed dataset was first generated. For example, in some implementations, each uncleansed data record in the delta dataset may be associated with a corresponding entity key based on the combination of fields. The data cleaning platform may perform a data join to update the cleansed dataset to include data records related to the transactions that occurred after the time when the cleansed dataset was first generated.
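The flow in the abstract (key the records, build a delta dataset, join it into the cleansed dataset) might look roughly like the sketch below. It is not the patented implementation: the record layout, the `merchant`, `postal_code`, and `clean_name` fields, and the choice to carry cleansed attributes onto new records through a left join on the entity key are assumptions made for illustration.

```python
from datetime import date


def entity_key(record):
    # Hypothetical key: normalized merchant name plus postal code.
    return (record["merchant"].strip().lower(), record["postal_code"])


def build_delta(raw_records, cleansed_at):
    """Keep raw records dated after the cleansed dataset was generated,
    and tag each one with its entity key."""
    return [
        {**r, "entity_key": entity_key(r)}
        for r in raw_records
        if r["date"] > cleansed_at
    ]


def join_update(cleansed, delta):
    """Left-join delta records to known entities on entity key and append them."""
    # Index the cleansed attribute (here, a clean name) by entity key.
    canonical = {}
    for row in cleansed:
        canonical.setdefault(row["entity_key"], row["clean_name"])
    updated = list(cleansed)
    for r in delta:
        # clean_name is None when the entity was not in the cleansed dataset.
        updated.append({**r, "clean_name": canonical.get(r["entity_key"])})
    return updated


cleansed_at = date(2023, 1, 31)  # assumed time the cleansed dataset was generated
first = {"merchant": "ACME COFFEE", "postal_code": "46544",
         "date": date(2023, 1, 5), "amount": 4.50}
cleansed = [{**first, "entity_key": entity_key(first), "clean_name": "Acme Coffee"}]

raw = [
    first,  # already covered by the cleansed dataset, so excluded from the delta
    {"merchant": " acme coffee ", "postal_code": "46544",
     "date": date(2023, 2, 1), "amount": 7.25},
    {"merchant": "Bolt Hardware", "postal_code": "10001",
     "date": date(2023, 2, 3), "amount": 30.00},
]

delta = build_delta(raw, cleansed_at)
updated = join_update(cleansed, delta)
```

In this sketch the February record for the known merchant picks up the existing clean name through the join, while the record for the previously unseen merchant is appended with no clean name and would presumably need cleansing in a later pass.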