International Business Machines Corporation (20240104009). GENERATING TEST DATA FOR APPLICATION PERFORMANCE simplified abstract

GENERATING TEST DATA FOR APPLICATION PERFORMANCE

Organization Name

International Business Machines Corporation

Inventor(s)

Anna Tatunashvili of Bratislava (SK)

Rupam Bhattacharjee of Karimganj (IN)

Siba Prasad Satapathy of Bangalore (IN)

George Thayyil Jacob Sushil of Bangalore (IN)

Jozef Feki� of Dublin (IE)

GENERATING TEST DATA FOR APPLICATION PERFORMANCE - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240104009, titled 'GENERATING TEST DATA FOR APPLICATION PERFORMANCE'.

Simplified Explanation

The patent application describes a method for extracting test datasets for testing and resource optimization in distributed data processing engines. Its key steps are listed below (a minimal code sketch of the flow follows the list):

  • Execute a test run on the full dataset of a job and identify bottlenecks through a run-time monitoring interface
  • Run a run-time metrics analysis, a source code analysis, and a source data impact analysis of the distributed data processing engine
  • Generate an impact scoring table of job transformations based on the source code analysis
  • Generate data extraction rules based on the impact scoring table
  • Extract a test dataset based on the data extraction rules
  • Evaluate the data extraction rules against user-defined thresholds to prepare a representative test dataset
  • Output the representative test dataset to the user through a user interface on a computing device
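
The flow can be pictured with a minimal Python sketch. Everything here (the Transformation record, the scoring formula, the stride-based sampling, and the numbers) is a hypothetical illustration of the idea, not the implementation claimed in the application.

    # Minimal sketch of the scoring-and-extraction flow above. All names and values
    # are hypothetical illustrations, not taken from the patent application.
    from dataclasses import dataclass

    @dataclass
    class Transformation:
        name: str          # transformation found by the source code analysis, e.g. a join
        runtime_ms: float  # run-time metric for the stage executing this transformation
        rows_read: int     # source data impact: how many input rows the step touches

    def score_transformations(transforms):
        """Build an impact scoring table: each step's share of total run time."""
        total = sum(t.runtime_ms for t in transforms) or 1.0
        return {t.name: t.runtime_ms / total for t in transforms}

    def generate_extraction_rules(impact_scores, min_impact=0.10):
        """Derive extraction rules: sample more source data for high-impact steps."""
        return [{"transformation": name, "sample_fraction": min(0.5, score)}
                for name, score in impact_scores.items() if score >= min_impact]

    def extract_test_dataset(full_dataset, rules):
        """Apply the rules to the full dataset to pull a smaller test slice."""
        sample = []
        for rule in rules:
            stride = max(1, int(1 / rule["sample_fraction"]))
            sample.extend(full_dataset[::stride])  # naive stride sampling, for illustration only
        return sample

    # Example: two transformations found in the job's source code (made-up run times)
    transforms = [Transformation("join_customers", 9_000.0, 1_000_000),
                  Transformation("filter_invalid", 1_000.0, 1_000_000)]
    scores = score_transformations(transforms)    # {'join_customers': 0.9, 'filter_invalid': 0.1}
    rules = generate_extraction_rules(scores)
    test_data = extract_test_dataset(list(range(1_000_000)), rules)
    print(len(test_data))                         # 600000 rows instead of 1,000,000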

Potential Applications

This technology can be applied in various industries such as data analytics, software development, and quality assurance to optimize testing processes and resource allocation.

Problems Solved

1. Identifying bottlenecks in distributed data processing jobs (see the sketch after this list)
2. Efficient extraction of test datasets for testing purposes
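
As a hedged illustration of problem 1, the sketch below flags stages whose run time dominates the job. The stage records and the 25% threshold are made-up examples standing in for whatever the engine's run-time monitoring interface actually reports.

    # Hypothetical bottleneck detection over per-stage run-time metrics.
    def find_bottlenecks(stage_metrics, share_threshold=0.25):
        """Return stages whose run time exceeds the given share of total job run time."""
        total = sum(m["runtime_ms"] for m in stage_metrics) or 1.0
        return [m["stage"] for m in stage_metrics
                if m["runtime_ms"] / total >= share_threshold]

    # Metrics as they might be read from a run-time monitoring interface (illustrative values)
    metrics = [{"stage": "read_source", "runtime_ms": 2_000},
               {"stage": "join_orders", "runtime_ms": 14_000},
               {"stage": "write_output", "runtime_ms": 1_500}]
    print(find_bottlenecks(metrics))  # ['join_orders']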

Benefits

1. Improved testing efficiency and resource optimization
2. Enhanced performance of distributed data processing engines
3. Streamlined data extraction process for testing purposes

Potential Commercial Applications

  • Optimizing testing processes in software development companies
  • Enhancing data analytics capabilities in research institutions
  • Improving quality assurance procedures in various industries

Possible Prior Art

Existing technologies already analyze and optimize distributed data processing jobs, but the specific combination described here, run-time metrics analysis, impact scoring of job transformations from source code analysis, and rule-based extraction of a representative test dataset, may be novel.

Unanswered Questions

How does this technology compare to existing methods for test dataset extraction in distributed data processing engines?

This article does not provide a direct comparison with existing methods for test dataset extraction in distributed data processing engines.

What are the potential limitations or challenges of implementing this technology in real-world scenarios?

This article does not address the potential limitations or challenges of implementing this technology in real-world scenarios.


Original Abstract Submitted

In an approach to improve the extracting test datasets for testing and resource optimization, embodiments execute a test run on a full dataset of a job, and identify existing bottlenecks in the job through a run-time monitoring interface. Additionally, embodiments execute a run-time metrics analysis, a source code analysis, and a source data impact analysis of a distributed data processing engine executing a distributed data processing job, and generate, by an analysis and impact scoring engine, an impact scoring table of job transformations based on the source code analysis. Furthermore, embodiments generate data extraction rules based on the impact scoring table, and extract a test dataset based on the data extraction rules. Moreover, embodiments evaluate the data extraction rules against user defined thresholds, and prepare a representative test dataset, and output, through a user interface on a computing device, the representative test dataset to a user.
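
For the step that evaluates the data extraction rules against user-defined thresholds, a minimal sketch might look like the following. The threshold names, bounds, and acceptance criterion are assumptions made for illustration, not the criteria defined in the application.

    # Hypothetical rule-evaluation step: accept the rules only if the sampled
    # fraction stays within user-defined bounds (names and values are illustrative).
    def evaluate_rules(rules, thresholds):
        """Accept the rule set only if the average sample fraction is within the user-defined bounds."""
        avg_fraction = sum(r["sample_fraction"] for r in rules) / max(len(rules), 1)
        return thresholds["min_fraction"] <= avg_fraction <= thresholds["max_fraction"]

    thresholds = {"min_fraction": 0.05, "max_fraction": 0.50}  # user-defined bounds (illustrative)
    rules = [{"transformation": "join_customers", "sample_fraction": 0.50},
             {"transformation": "filter_invalid", "sample_fraction": 0.10}]
    print(evaluate_rules(rules, thresholds))  # True: average fraction 0.30 is within bounds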