DEDIS: Distributed exact deduplication for primary storage infrastructures

Date
2013
Authors
João Tiago Paulo
José Orlando Pereira
Abstract
Deduplication is now widely accepted as an efficient technique for reducing storage costs at the expense of some processing overhead, and it is increasingly sought in primary storage systems [7, 8] and in cloud computing infrastructures holding Virtual Machine (VM) volumes [2, 1, 5]. Besides the large number of duplicates found across static VM images [3], dynamic general-purpose data from VM volumes allows space savings from 58% up to 80% if deduplicated in a cluster-wide fashion [1, 4]. However, some of these volumes persist latency-sensitive data, which limits the overhead that can be incurred in I/O operations. Therefore, this problem must be addressed by a cluster-wide distributed deduplication system for such primary storage volumes.