18474399. CACHE-EFFICIENT TOP-K AGGREGATION OVER HIGH CARDINALITY LARGE DATASETS (Microsoft Technology Licensing, LLC)

From WikiPatents

CACHE-EFFICIENT TOP-K AGGREGATION OVER HIGH CARDINALITY LARGE DATASETS

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Tarique Ashraf Siddiqui of Redmond WA US

Vivek Ravindranath Narasayya of Redmond WA US

Marius Dumitru of Issaquah WA US

Surajit Chaudhuri of Kirkland WA US

This abstract first appeared for US patent application 18474399 titled 'CACHE-EFFICIENT TOP-K AGGREGATION OVER HIGH CARDINALITY LARGE DATASETS'.

Original Abstract Submitted

A data processing system implements a cache-conscious aggregation framework for cache-efficient top-k aggregation over high cardinality large datasets. The framework leverages skew in the distribution of data in the datasets to minimize data movements within the local caches of the cores of the multicore processors of the data processing system. The framework performs representative sampling on the dataset and utilizes these samples to identify candidate groups in the dataset for the top-k results. The system performs exact aggregations for the candidate groups and performs hashing and pruning on the non-candidate groups in the dataset to identify top-k results included in the non-candidate groups without having to calculate the exact aggregations for the non-candidate groups.
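The abstract describes the pipeline only at a high level, so the following is a minimal single-threaded sketch of one plausible reading of it, not the method claimed in the application. It assumes a top-k query by SUM over non-negative values, so that a hash bucket's total is an upper bound on the sum of every group that falls into it. The function name `top_k_sum` and the parameters `sample_rate`, `num_candidates`, and `num_buckets` are hypothetical, and the per-core cache partitioning mentioned in the abstract is not modeled.

```python
import heapq
import random
from collections import Counter, defaultdict


def top_k_sum(rows, k, sample_rate=0.01, num_candidates=None,
              num_buckets=1024, seed=0):
    """Return the k (key, sum) pairs with the largest sums.

    Assumptions (not from the source): `rows` is a re-iterable sequence of
    (key, value) pairs and every value is non-negative.
    """
    rng = random.Random(seed)
    num_candidates = num_candidates or 4 * k

    # Step 1: sample the rows and pick the heaviest sampled groups as candidates.
    sample = Counter()
    for key, value in rows:
        if rng.random() < sample_rate:
            sample[key] += value
    candidates = {key for key, _ in sample.most_common(num_candidates)}

    # Step 2: one pass over the data. Candidates get exact sums in a small
    # table; every other group is only hashed into a coarse bucket total.
    exact = dict.fromkeys(candidates, 0)
    buckets = [0] * num_buckets
    for key, value in rows:
        if key in exact:
            exact[key] += value
        else:
            buckets[hash(key) % num_buckets] += value

    # Step 3: prune. A bucket whose total is below the k-th largest candidate
    # sum cannot contain a top-k group, because values are non-negative.
    best = heapq.nlargest(k, exact.values())
    threshold = best[-1] if len(best) == k else 0
    surviving = {b for b, total in enumerate(buckets) if total >= threshold}

    # Step 4: second pass, aggregating only groups in surviving buckets.
    rescued = defaultdict(int)
    for key, value in rows:
        if key not in exact and hash(key) % num_buckets in surviving:
            rescued[key] += value
    exact.update(rescued)

    return heapq.nlargest(k, exact.items(), key=lambda kv: kv[1])
```

Calling `top_k_sum(rows, k=3)` on a list of `(key, value)` pairs returns the three heaviest groups in descending order of sum. In this sketch the sample affects only how much work the second pass does: a sample that misses a heavy group raises no error, because that group's bucket survives pruning and is aggregated exactly in step 4. On skewed data most buckets fall below the threshold, so the long tail of small groups is never aggregated individually.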