Toward a Practical and Timely Diagnosis of Application's I/O Behavior

No Thumbnail Available
Date
2023
Authors
Ricardo Gonçalves Macedo
Tânia Conceição Araújo
Rui Carlos Oliveira
João Tiago Paulo
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
We present DIO, a generic tool for observing inefficient and erroneous I/O interactions between applications and in-kernel storage backends that lead to performance, dependability, and correctness issues. DIO eases the analysis and enables near real-time visualization of complex I/O patterns for data-intensive applications generating millions of storage requests. This is achieved by non-intrusively intercepting system calls, enriching collected data with relevant context, and providing timely analysis and visualization for traced events. We demonstrate its usefulness by analyzing four production-level applications. Results show that DIO enables diagnosing inefficient I/O patterns that lead to poor application performance, unexpected and redundant I/O calls caused by high-level libraries, resource contention in multithreaded I/O that leads to high tail latency, and erroneous file accesses that cause data loss. Moreover, through a detailed evaluation, we show that, when comparing DIO's inline diagnosis pipeline with a similar state-of-the-art solution, our system captures up to 28x more events while keeping tracing performance overhead between 14% and 51%.
Description
Keywords
Citation