FILE INGESTION IN A MULTI-TENANT CLOUD ENVIRONMENT

Organization Name

International Business Machines Corporation

Inventor(s)

FILE INGESTION IN A MULTI-TENANT CLOUD ENVIRONMENT - A simplified explanation of the abstract

This abstract first appeared for US patent application 17526622, titled 'FILE INGESTION IN A MULTI-TENANT CLOUD ENVIRONMENT'.

Simplified Explanation

The patent application describes a system and method for estimating the time it takes to transform ingested files into a searchable state for content mining. This can be done in an on-premises computing environment or a cloud environment, including multi-tenant cloud environments.

The system analyzes the ingested files to determine if they can be divided into smaller elements, such as lines of data.
The ingestion time is dependent on the number of divisible elements and the amount of data per element.
A converter is used to divide the files into multiple elements and calculate the estimated ingestion time based on the number of divisions and file size for each element.
The estimated ingestion time is stored in the index for the search data, corresponding to each divisible element.
During content mining, an internal condition is added to each received search query so that only results whose estimated ingestion time is earlier than the current time are displayed.
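
The division and estimation steps above can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the cost constants, the cumulative timing model, and the names IndexedElement, split_into_elements, and convert_file are all assumptions made for illustration. The sketch treats each non-empty line as one divisible element and derives its estimated ingestion time from the number of elements processed so far and the size of each element.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical cost model (not from the patent application):
# a fixed overhead per element plus a cost per byte of element data.
MICROS_PER_ELEMENT = 2000
MICROS_PER_BYTE = 1


@dataclass
class IndexedElement:
    element_id: str
    text: str
    # Internal field stored alongside the searchable data.
    estimated_ingestion_time: datetime


def split_into_elements(content: str) -> list[str]:
    """Divide file content into independent elements (here: non-empty lines)."""
    return [line for line in content.splitlines() if line.strip()]


def convert_file(name: str, content: str, started_at: datetime) -> list[IndexedElement]:
    """Build index entries, each carrying its own estimated ingestion time."""
    entries = []
    elapsed_micros = 0
    for i, text in enumerate(split_into_elements(content)):
        # Assumed model: elements are ingested in order, so the estimate
        # for element i accumulates the cost of all elements before it.
        size = len(text.encode("utf-8"))
        elapsed_micros += MICROS_PER_ELEMENT + size * MICROS_PER_BYTE
        estimate = started_at + timedelta(microseconds=elapsed_micros)
        entries.append(IndexedElement(f"{name}#{i}", text, estimate))
    return entries


if __name__ == "__main__":
    start = datetime.now(timezone.utc)
    for entry in convert_file("sample.csv", "id,name\n1,alice\n2,bob\n", start):
        print(entry.element_id, entry.estimated_ingestion_time.isoformat())
```

A real system would presumably calibrate the cost constants against observed ingestion throughput rather than hard-code them.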

Potential Applications

This technology can be used in various industries that require content mining, such as data analytics, information retrieval, and machine learning.
It can be applied in research institutions, financial services, healthcare, and e-commerce to efficiently process and search through large volumes of data.

Problems Solved

The system solves the problem of accurately estimating the time it takes to transform ingested files into a searchable state, considering the divisibility of the files and the amount of data per element.
It addresses the challenge of efficiently indexing and searching through large volumes of data in on-premises or cloud environments.

Benefits

The system provides a more accurate estimation of ingestion time, allowing users to plan and manage their content mining tasks effectively.
By dividing files into smaller elements, the system can process and index data more efficiently, reducing the overall time required for content mining.
Adding a condition on estimated ingestion time to search queries limits displayed results to elements whose estimated ingestion time has already passed.

Original Abstract Submitted

Systems, methods, and computer programming products for estimating ingestion time of ingested files to be transformed into a searchable state for content mining by an on-premises computing environment or cloud environment, including multi-tenant cloud environments. Ingested files being indexed are analyzed for divisibility. Ingestion time varies based on the number of divisible elements (such as lines) of data within the ingested file and the amount of data per divisible element. A converter divides files into a plurality of elements treated as independent data and calculates the estimated ingestion time based on the number of divisions and file size for each divisible element. Estimated ingestion time is stored to internal fields corresponding to each divisible element in the index for the search data. During content mining, an internal condition is added to received search queries, displaying only search results where the estimated ingestion time is older than the current time.
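
The search-time condition described in the abstract can be sketched in Python as follows. This is an illustrative assumption, not the claimed implementation: the index is modelled as an in-memory list of dictionaries, and the function name search and the field name estimated_ingestion_time are hypothetical. The user supplies only a keyword; the comparison against the current time is added internally.

```python
from datetime import datetime, timedelta, timezone


def search(index: list[dict], keyword: str, now: datetime) -> list[str]:
    """Return matching texts whose estimated ingestion time is before `now`."""
    return [
        entry["text"]
        for entry in index
        # User-visible part of the query: a simple keyword match.
        if keyword in entry["text"]
        # Internal condition: hide elements not yet estimated to be ingested.
        and entry["estimated_ingestion_time"] < now
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    index = [
        {"text": "error in billing", "estimated_ingestion_time": now - timedelta(minutes=5)},
        {"text": "error in login", "estimated_ingestion_time": now + timedelta(minutes=5)},
    ]
    print(search(index, "error", now))
```

In a production search engine the same condition would more likely be expressed as a range filter on the internal field than as a Python loop.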
