20240012829. MANAGING EXTRACT, TRANSFORM AND LOAD SYSTEMS simplified abstract (INTERNATIONAL BUSINESS MACHINES CORPORATION)

From WikiPatents

MANAGING EXTRACT, TRANSFORM AND LOAD SYSTEMS

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor(s)

Chengxuan Xing of Romsey (GB)

Doina Liliana Klinger of Winchester (GB)

Alexander Robert Wood of Romsey (GB)

Tom Soal of Whitehill (GB)

MANAGING EXTRACT, TRANSFORM AND LOAD SYSTEMS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240012829, titled 'MANAGING EXTRACT, TRANSFORM AND LOAD SYSTEMS'.

Simplified Explanation

The abstract describes an approach for implementing an Extract, Transform, and Load (ETL) system in which a queue holds units of data between extraction and transformation. If a unit of data hits a rate limit error during the load phase, it is requeued so that it can be resubmitted for transformation. The contents of the queue are monitored, and too many requeued units in the queue are treated as a sign of an unacceptable number of rate limit errors, which triggers active pacing management. Under pacing management, a retry schedule is defined for the requeued units, and extraction is temporarily halted so that they can be retransformed without new units queuing up behind them. After the suspension is lifted, a pacing delay is inserted between subsequent extract events so that the same load phase bottleneck does not recur.

  • The approach implements an ETL system with a queue for data management.
  • The queue is monitored, and too many requeued data units trigger active pacing management.
  • A retry schedule is defined for requeued data units.
  • Extraction is temporarily halted to allow retransformation of requeued data.
  • After extraction resumes, a pacing delay is inserted between subsequent extract events to prevent the bottleneck from recurring.
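
The steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names (Unit, PacedQueue), the threshold max_requeued, the evenly spaced retry schedule, and the rule for lifting the suspension (no requeued units left in the queue) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Unit:
    payload: str
    requeued: bool = False  # set once the load phase reported a rate limit error


@dataclass
class PacedQueue:
    """Queue between extract and transform with a simple pacing policy (illustrative)."""
    max_requeued: int = 3       # assumed threshold for "too many" requeued units
    retry_spacing: float = 0.25  # assumed gap (seconds) between scheduled retries
    pacing_delay: float = 0.5    # assumed delay (seconds) between later extract events
    items: deque = field(default_factory=deque)
    extraction_suspended: bool = False
    pacing_active: bool = False

    def put_extracted(self, payload: str) -> bool:
        """Accept a newly extracted unit unless extraction is currently halted."""
        if self.extraction_suspended:
            return False
        self.items.append(Unit(payload))
        return True

    def requeue(self, unit: Unit) -> None:
        """Called when the load phase hits a rate limit error for this unit."""
        unit.requeued = True
        self.items.append(unit)
        self._monitor()

    def _monitor(self) -> None:
        """Trigger pacing management when too many requeued units are waiting."""
        requeued = sum(1 for u in self.items if u.requeued)
        if requeued > self.max_requeued:
            self.extraction_suspended = True
            self.pacing_active = True

    def retry_offsets(self) -> list[float]:
        """Assumed retry schedule: requeued units retried at evenly spaced offsets."""
        requeued = sum(1 for u in self.items if u.requeued)
        return [i * self.retry_spacing for i in range(requeued)]

    def take(self) -> Unit:
        """Hand the next unit to the transform phase."""
        unit = self.items.popleft()
        # Assumed rule: lift the suspension once no requeued units remain queued.
        if self.extraction_suspended and not any(u.requeued for u in self.items):
            self.extraction_suspended = False
        return unit

    def extract_delay(self) -> float:
        """Delay the extractor should wait before its next extract event."""
        return self.pacing_delay if self.pacing_active else 0.0
```

In this sketch, an extractor would call extract_delay() before each extract event and put_extracted() afterwards, while the loader would call requeue() whenever the target system returns a rate limit error. Keeping the pacing delay active after the suspension ends mirrors the abstract's last step; when (or whether) the delay is later removed is not described in the abstract and is left out here.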

Potential Applications:

  • Data integration and migration processes.
  • Real-time data processing and analysis.
  • Data warehousing and business intelligence systems.

Problems Solved:

  • Avoiding rate limit errors during the load phase.
  • Managing and reprocessing data units that encounter errors.
  • Preventing bottlenecks in the load phase of the ETL process.

Benefits:

  • Improved data reliability and accuracy.
  • Efficient handling of rate limit errors.
  • Enhanced scalability and performance of ETL systems.
  • Reduced downtime and improved data processing speed.


Original Abstract Submitted

In an approach to implement an extract, transform and load system, a queue is provided for holding units of data between extraction and transformation. When units of data suffer a rate limit error in the load phase, they are requeued so they can be resubmitted for transformation. The contents of the queue are monitored and, if too many requeued units of data are detected in the queue, then this is taken as an indicator of an unacceptable number of rate limit errors and active pacing management is triggered. A retry schedule is defined for the requeued units of data. Extraction is temporarily halted to allow the requeued units of data to be retransformed without more units of data queuing up. Then, after the suspension is lifted, a pacing delay is inserted between subsequent extract events to avoid the same load phase bottleneck recurring.