Prepared scan: efficient retrieval of structured data from HBase

Date
2017
Authors
Francisco Teixeira Neves
Ricardo Vilaça
José Orlando Pereira
Rui Carlos Oliveira
Abstract
The ability of NoSQL systems to scale better than traditional relational databases motivates a large set of applications to migrate their data to NoSQL systems, even without aiming to exploit the provided schema flexibility. However, accessing structured data is costly because of this flexibility, incurring substantial bandwidth and processing unit usage. In this paper, we analyse this cost in Apache HBase and propose a new scan operation, named Prepared Scan, that optimizes access to regularly structured data by taking advantage of a schema that is well known to the application. Using an industry-standard benchmark, we show that Prepared Scan improves throughput by up to 29% and decreases network bandwidth consumption by up to 20%. © 2017 ACM.
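To make the cost described in the abstract concrete, the sketch below compares two ways of serializing one structured row. The first repeats the full coordinates (row key, column family, qualifier, timestamp, type) for every cell, loosely following the layout of an HBase KeyValue. The second is a hypothetical compact form that assumes client and server already agree on the column order, so only the values travel. This is an illustration of the overhead only, not the wire format or algorithm used by Prepared Scan; the function names and the sample row are assumptions made for the example.

```python
import struct


def encode_self_describing(row_key, family, cells, timestamp):
    """Serialize every cell with its full coordinates.

    Loosely modeled on the HBase KeyValue layout: key length, value length,
    row length, row, family length, family, qualifier, timestamp, key type,
    then the value. Simplified for illustration.
    """
    out = bytearray()
    for qualifier, value in cells.items():
        key = (
            struct.pack(">H", len(row_key)) + row_key
            + struct.pack(">B", len(family)) + family
            + qualifier
            + struct.pack(">Q", timestamp)
            + b"\x04"  # key type byte (4 = Put)
        )
        out += struct.pack(">II", len(key), len(value)) + key + value
    return bytes(out)


def encode_with_schema(row_key, schema, cells):
    """Hypothetical compact form: the row key once, then the values in the
    agreed schema order, each prefixed by its length. Qualifiers, family,
    and per-cell metadata are omitted because both sides know the schema.
    """
    out = bytearray(struct.pack(">H", len(row_key)) + row_key)
    for qualifier in schema:
        value = cells[qualifier]
        out += struct.pack(">I", len(value)) + value
    return bytes(out)


if __name__ == "__main__":
    # Sample row (assumed data, for illustration only).
    row = {b"name": b"Alice", b"city": b"Braga", b"age": b"42"}
    schema = [b"name", b"city", b"age"]

    full = encode_self_describing(b"user0001", b"cf", row, timestamp=1)
    compact = encode_with_schema(b"user0001", schema, row)

    print(len(full), "bytes with per-cell coordinates")
    print(len(compact), "bytes with a shared schema")
```

For this small row, most of the bytes in the first encoding are repeated coordinates rather than payload, which is the kind of redundancy a schema-aware scan can avoid when the data is regularly structured.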