Holistic Shuffler for the Parallel Processing of SQL Window Functions
Date
2016
Authors
Fábio André Coelho
José Orlando Pereira
Ricardo Pereira Vilaça
Rui Carlos Oliveira
Abstract
Window functions are a sub-class of analytical operators that allow data to be handled in a derived view of a given relation, while taking into account each tuple's neighboring tuples. Currently, systems bypass parallelization opportunities, which become especially relevant when considering Big Data, as data is naturally partitioned. We present a shuffling technique to improve the parallel execution of window functions when data is naturally partitioned and the query holds a partitioning clause that does not match the natural partitioning of the relation. We evaluated this technique with a non-cumulative ranking function and were able to reduce data transfer among parallel workers by 85% when compared to a naive approach.
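To make the setting concrete, the following minimal Python sketch illustrates the mismatch the abstract describes: rows are already spread across workers, but the query's PARTITION BY key does not follow that placement, so a straightforward shuffle has to move rows between workers before ranking. This is only an illustration of a simple baseline, not the shuffling technique proposed in the paper. The sample rows, the `OWNER` assignment (standing in for a hash function), and all function names are hypothetical, and RANK is used merely as an example of a ranking function.

```python
from collections import defaultdict

# Hypothetical rows: (worker that naturally holds the row, department, salary).
ROWS = [
    (0, "sales", 300),
    (0, "eng", 500),
    (1, "sales", 200),
    (1, "eng", 400),
    (1, "sales", 300),
]

# Hypothetical fixed assignment of PARTITION BY keys to workers,
# standing in for a hash function so the example is deterministic.
OWNER = {"sales": 0, "eng": 1}


def simple_shuffle(rows):
    """Send every row to the worker that owns its PARTITION BY key.

    Returns the rows grouped by destination worker and the number of
    rows that had to leave the worker they were originally stored on.
    """
    placed = defaultdict(list)
    moved = 0
    for origin, dept, salary in rows:
        dest = OWNER[dept]
        if dest != origin:
            moved += 1
        placed[dest].append((dept, salary))
    return placed, moved


def rank_within_partitions(rows):
    """RANK() OVER (PARTITION BY dept ORDER BY salary DESC) on one worker."""
    by_dept = defaultdict(list)
    for dept, salary in rows:
        by_dept[dept].append(salary)
    ranked = []
    for dept, salaries in by_dept.items():
        ordered = sorted(salaries, reverse=True)
        for salary in salaries:
            # Position of the first equal value gives SQL RANK semantics:
            # ties share a rank and the following rank is skipped.
            ranked.append((dept, salary, ordered.index(salary) + 1))
    return ranked


placed, moved = simple_shuffle(ROWS)
results = {worker: rank_within_partitions(rows) for worker, rows in placed.items()}
```

In this toy input, three of the five rows change workers before any ranking can start; the data-transfer figure reported in the abstract concerns reducing this kind of movement.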