Towards an accurate evaluation of deduplicated storage systems

dc.contributor.author João Tiago Paulo en
dc.contributor.author Reis, P en
dc.contributor.author José Orlando Pereira en
dc.contributor.author António Luís Sousa en
dc.date.accessioned 2017-12-15T12:35:52Z
dc.date.available 2017-12-15T12:35:52Z
dc.date.issued 2013 en
dc.description.abstract Deduplication has proven to be a valuable technique for eliminating duplicate data in backup and archival systems and is now being applied to new storage environments with distinct requirements and performance trade-offs. Namely, deduplication systems are now targeting large-scale cloud computing storage infrastructures holding unprecedented data volumes with a significant share of duplicate content. It is, however, hard to assess the usefulness of deduplication in particular settings and which techniques provide the best results. In fact, existing disk I/O benchmarks follow simplistic approaches for generating data content, leading to unrealistic amounts of duplicates, and so they do not evaluate deduplication systems accurately. Moreover, deduplication systems are now targeting heterogeneous storage environments, with specific duplication ratios, that benchmarks must also simulate. We address these issues with DEDISbench, a novel micro-benchmark for evaluating the disk I/O performance of block-based deduplication systems. As the main contribution, DEDISbench generates content by following realistic duplicate content distributions extracted from real datasets. Then, as a second contribution, we analyze and extract the duplicates found on three real storage systems, showing that DEDISbench can easily simulate several workloads. The usefulness of DEDISbench is shown by comparing it with the Bonnie++ and IOzone open-source disk I/O micro-benchmarks in assessing two open-source deduplication systems, Opendedup and Lessfs, using Ext4 as a baseline. Our results lead to novel insights into the performance of these file systems. en
dc.identifier.uri http://repositorio.inesctec.pt/handle/123456789/4158
dc.language eng en
dc.relation 5638 en
dc.relation 5602 en
dc.relation 5621 en
dc.rights info:eu-repo/semantics/openAccess en
dc.title Towards an accurate evaluation of deduplicated storage systems en
dc.type article en
dc.type Publication en
Files
Original bundle
Name: P-009-83T.pdf
Size: 276.3 KB
Format: Adobe Portable Document Format