Databricks, Inc. (20240256550). Dictionary Filtering and Evaluation in Columnar Databases simplified abstract

From WikiPatents

Dictionary Filtering and Evaluation in Columnar Databases

Organization Name

Databricks, Inc.

Inventor(s)

Utkarsh Agarwal of San Francisco CA (US)

Shoumik Palkar of San Francisco CA (US)

Alexander Behm of San Francisco CA (US)

Sriram Krishnamurthy of San Francisco CA (US)

Dictionary Filtering and Evaluation in Columnar Databases - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240256550, titled 'Dictionary Filtering and Evaluation in Columnar Databases'.

The abstract describes a method for evaluating a query on a columnar dataset in which at least one column is encoded with a dictionary that maps column values to identifiers. The method calculates a metric from one or more factors and performs dictionary filtering for the query when that metric falls below a threshold.

  • Simplified Explanation:

The method evaluates queries on a columnar dataset with dictionaries, deciding whether to use dictionary filtering based on specific factors.

  • Key Features and Innovation:

- Evaluation of queries on columnar datasets with dictionaries
- Determination of whether to perform dictionary filtering based on factors

  • Potential Applications:

- Data analysis
- Database management
- Cloud storage optimization

  • Problems Solved:

- Efficient query evaluation on columnar datasets
- Improved performance in data analysis

  • Benefits:

- Faster query processing
- Enhanced data retrieval accuracy
- Optimal use of cloud storage resources

  • Commercial Applications:

Optimizing cloud storage for efficient data analysis and database management

  • Questions about the Technology:

1. How does the method determine whether to perform dictionary filtering for a query?
2. What are the advantages of using dictionaries in columnar datasets?

  • Frequently Updated Research:

Stay updated on advancements in query evaluation methods for columnar datasets with dictionaries.


Original Abstract Submitted

Disclosed herein is a method, system, or non-transitory computer readable medium for evaluating a query on a columnar dataset comprising one or more dictionaries associated with columns in the dataset. The method includes receiving a request to perform a query comprising at least an operator and a request to return information about a value of interest in a columnar dataset stored on cloud storage. At least one column in the columnar dataset is based on a dictionary. The dictionary maps one or more values for a column to one or more respective identifiers. The method determines whether to perform dictionary filtering for the query by calculating a metric based on one or more factors. Responsive to the metric being below a threshold, which may be predetermined, the method performs the dictionary filtering.
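
The flow described in the abstract can be illustrated with a small Python sketch. This is not the patented implementation. The abstract does not say which factors feed the metric, so the sketch assumes a hypothetical one: the ratio of dictionary entries to rows. The names `DictionaryColumn`, `encode`, `should_use_dictionary_filtering`, `filter_rows`, and the default threshold of 0.75 are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class DictionaryColumn:
    """A dictionary-encoded column: distinct values plus one identifier per row."""
    dictionary: List[Any]  # identifier i maps to dictionary[i]
    ids: List[int]         # ids[row] is the identifier stored for that row


def encode(values: List[Any]) -> DictionaryColumn:
    """Map each distinct value to an identifier in order of first appearance."""
    index = {}
    dictionary = []
    ids = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        ids.append(index[value])
    return DictionaryColumn(dictionary, ids)


def should_use_dictionary_filtering(column: DictionaryColumn,
                                    threshold: float = 0.75) -> bool:
    """Hypothetical metric: dictionary size relative to row count.

    The abstract only says the metric is based on "one or more factors";
    this particular ratio is an assumption made for the example.
    """
    metric = len(column.dictionary) / max(len(column.ids), 1)
    return metric < threshold


def filter_rows(column: DictionaryColumn,
                predicate: Callable[[Any], bool],
                threshold: float = 0.75) -> List[int]:
    """Return the row numbers whose value satisfies the predicate."""
    if should_use_dictionary_filtering(column, threshold):
        # Evaluate the predicate once per distinct value, not once per row.
        matching_ids = {i for i, v in enumerate(column.dictionary) if predicate(v)}
        if not matching_ids:
            return []  # no dictionary entry matches, so no row can match
        return [row for row, i in enumerate(column.ids) if i in matching_ids]
    # Fallback: decode every row and evaluate the predicate on it.
    return [row for row, i in enumerate(column.ids) if predicate(column.dictionary[i])]


country = encode(["us", "de", "us", "fr", "us", "de"])
rows_de = filter_rows(country, lambda v: v == "de")
rows_jp = filter_rows(country, lambda v: v == "jp")
```

In this sketch the `country` column has three dictionary entries for six rows, so the assumed metric is 0.5 and the dictionary path is taken. A column whose values are all distinct yields a metric of 1.0 and falls back to row-by-row evaluation, since checking the dictionary first would save no work.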