17492255. WORKLOAD GENERATION FOR OPTIMAL STRESS TESTING OF BIG DATA MANAGEMENT SYSTEMS simplified abstract (International Business Machines Corporation)

From WikiPatents

WORKLOAD GENERATION FOR OPTIMAL STRESS TESTING OF BIG DATA MANAGEMENT SYSTEMS

Organization Name

International Business Machines Corporation

Inventor(s)

Ilker Ender of Dublin (IE)

Austin Clifford of Glenageary (IE)

Pedro Miguel Barbas of Dunboyne (IE)

Mara Elisa De Paiva Fernandes Matias of Dublin (IE)

Hemant Asandas Bhatia of Olathe KS (US)

WORKLOAD GENERATION FOR OPTIMAL STRESS TESTING OF BIG DATA MANAGEMENT SYSTEMS - A simplified explanation of the abstract

This abstract first appeared for US patent application 17492255, titled 'WORKLOAD GENERATION FOR OPTIMAL STRESS TESTING OF BIG DATA MANAGEMENT SYSTEMS'.

Simplified Explanation

The patent application describes a computer-based method for stress testing big data management systems. Here are the key points:

  • The method generates a set of random test queries to evaluate the performance of the system.
  • The queries are compiled to determine data points for their features, such as the type of table being queried.
  • A distance measurement, such as the Mahalanobis distance, is calculated between each query's feature data points and the mean of a distribution of data points for the same feature in an extracted feature set.
  • Queries with distances exceeding a threshold are ranked based on their distance.
  • The ranked queries are then executed in order of rank.
  • Queries that result in an error, such as a system failure, are added to a log, which is then used to identify the queries to run in a stress test against the system.
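
The distance, threshold and ranking steps above can be sketched as follows. This is a minimal illustration rather than the method claimed in the application: it assumes the query features are uncorrelated, so the Mahalanobis distance reduces to the square root of the summed squared z-scores (a diagonal covariance matrix), and it assumes that "ranked" means farthest first, which the abstract does not state. The function names, the feature values and the threshold are all hypothetical.

```python
import math
import statistics


def diagonal_mahalanobis(point, reference):
    """Mahalanobis distance from `point` to the mean of `reference`,
    assuming uncorrelated features (diagonal covariance matrix)."""
    total = 0.0
    for i, value in enumerate(point):
        column = [row[i] for row in reference]
        mean = statistics.mean(column)
        variance = statistics.variance(column)  # sample variance of this feature
        if variance == 0:
            continue  # assumption of this sketch: ignore features with no spread
        total += (value - mean) ** 2 / variance
    return math.sqrt(total)


def rank_outlier_queries(query_features, reference, threshold):
    """Keep queries whose distance exceeds `threshold`, farthest first."""
    scored = [(query, diagonal_mahalanobis(point, reference))
              for query, point in query_features.items()]
    outliers = [(query, dist) for query, dist in scored if dist > threshold]
    return sorted(outliers, key=lambda pair: pair[1], reverse=True)


# Hypothetical reference feature set and per-query feature data points.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
query_features = {"q1": (1.0, 1.0), "q2": (3.0, 1.0), "q3": (5.0, 5.0)}

ranked = rank_outlier_queries(query_features, reference, threshold=1.0)
for query, dist in ranked:
    print(query, round(dist, 3))
```

Here `q1` sits exactly on the mean and is dropped, while `q3` lies farther from the mean than `q2` and is therefore ranked first. If the features are correlated, a full covariance matrix would be needed instead of per-feature variances.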

Potential applications of this technology:

  • This technology can be used by developers and system administrators to evaluate the performance and stability of big data management systems.
  • It can help identify queries that may cause system failures or errors, allowing for targeted stress testing and optimization.

Problems solved by this technology:

  • Stress testing big data management systems can be time-consuming and resource-intensive. This method automates the process by generating random test queries, keeping those whose distance exceeds the threshold, and ranking them.
  • It helps identify problematic queries that may cause system failures or errors, allowing for targeted optimization and improvement.

Benefits of this technology:

  • The method provides a systematic approach to stress testing big data management systems, saving time and resources.
  • It helps identify and prioritize problematic queries, allowing for targeted optimization and improvement.
  • By logging queries that result in errors or system failures, it provides valuable information for further analysis and stress testing.


Original Abstract Submitted

A computer-implemented method, system and computer program product for optimally performing stress testing against big data management systems. A set of random test queries is generated and compiled to determine the data points of the features (e.g., table type being queried) of the set of random test queries. A distance (e.g., Mahalanobis distance) is then measured between the data points of the features and the mean of a distribution of data points corresponding to each same feature of an extracted feature set. Each random test query whose distance exceeds a threshold distance is then ranked. The ranked random test queries are then executed in order of rank. Those executed random test queries which resulted in an error (e.g., system failure) are added to a log, which is used to identify those queries to perform a stress test against the big data management system.
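
The last two steps of the abstract, executing the ranked queries in order and logging those that fail, might look like the sketch below. It assumes that an error surfaces as a Python exception, which may not hold for a real system failure, and `fake_execute` is a hypothetical stand-in for a database client call, not part of any real API. The query strings are invented for illustration.

```python
def run_ranked_queries(ranked_queries, execute):
    """Run queries in rank order and return a log of those that raised an error."""
    failure_log = []
    for rank, query in enumerate(ranked_queries, start=1):
        try:
            execute(query)
        except Exception as exc:  # assumption: failures surface as exceptions
            failure_log.append({"rank": rank, "query": query, "error": str(exc)})
    return failure_log


def fake_execute(query):
    """Hypothetical stand-in for a real database client call."""
    if "CROSS JOIN" in query:
        raise RuntimeError("simulated system failure")


# Hypothetical queries, already sorted by rank.
ranked = [
    "SELECT * FROM a CROSS JOIN b",
    "SELECT 1",
    "SELECT * FROM c CROSS JOIN d",
]

failure_log = run_ranked_queries(ranked, fake_execute)

# The logged queries are the candidates for the actual stress test.
stress_test_queries = [entry["query"] for entry in failure_log]
```

Keeping the rank alongside each logged query preserves the execution order, which may be useful when choosing which failures to replay first.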