Monitoring Incremental Histogram Distribution for Change Detection in Data Streams

Date
2010
Authors
Pedro Pereira Rodrigues
Raquel Sebastião
João Gama
João Bernardes
Abstract
Histograms are a common technique for density estimation and have been widely used as a tool in exploratory data analysis. Learning histograms from static, stationary data is a well-known topic. Nevertheless, very few works discuss this problem for a continuous flow of data generated in dynamic environments. The scope of this paper is the detection of changes in high-speed, time-changing data streams. To address this problem, we construct histograms that process each example once, at the rate it arrives. The main goal of this work is to continuously maintain a histogram consistent with the current state of nature. We study strategies to detect changes in the distribution generating the examples and to adapt the histogram to the most recent data by forgetting outdated data. We use the Partition Incremental Discretization algorithm, which was designed to learn histograms from high-speed data streams. We present a method to detect whenever a change in the distribution generating e
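The general idea described in the abstract (maintain a histogram incrementally, compare recent data against a reference, and forget outdated data) can be illustrated with a minimal sketch. This is not the Partition Incremental Discretization algorithm and not the authors' detection method, neither of which is specified in the text above. The equal-width bins, the exponential `decay` factor, the total variation distance, and the `window` and `threshold` parameters are all assumptions chosen for illustration.

```python
class StreamingHistogram:
    """Equal-width histogram updated one example at a time (illustrative only)."""

    def __init__(self, low, high, n_bins, decay=1.0):
        self.low = low
        self.n_bins = n_bins
        self.decay = decay  # assumption: 1.0 keeps everything, <1.0 fades old counts
        self.width = (high - low) / n_bins
        self.counts = [0.0] * n_bins

    def update(self, x):
        # Fade existing counts so that outdated examples lose weight.
        if self.decay < 1.0:
            self.counts = [c * self.decay for c in self.counts]
        # Clamp out-of-range values into the first or last bin.
        idx = int((x - self.low) / self.width)
        idx = min(max(idx, 0), self.n_bins - 1)
        self.counts[idx] += 1.0

    def probabilities(self):
        total = sum(self.counts)
        if total == 0:
            return [0.0] * self.n_bins
        return [c / total for c in self.counts]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def detect_change(stream, low, high, n_bins=10, window=200, threshold=0.3):
    """Return the index at which a change is first flagged, or None.

    The first `window` examples form the reference histogram. Each later
    block of `window` examples is compared against it (hypothetical scheme).
    """
    reference = StreamingHistogram(low, high, n_bins)
    current = StreamingHistogram(low, high, n_bins)
    for i, x in enumerate(stream):
        if i < window:
            reference.update(x)
            continue
        current.update(x)
        if (i + 1 - window) % window == 0:  # current block is complete
            distance = total_variation(reference.probabilities(),
                                       current.probabilities())
            if distance > threshold:
                return i
            current = StreamingHistogram(low, high, n_bins)  # start a fresh block
    return None
```

In this sketch, a detector would call `detect_change(stream, low, high)` on any iterable of numbers. After a flagged change, one plausible adaptation is to discard the reference histogram and rebuild it from the most recent block, or to run a single histogram with `decay` below 1.0 so that older examples fade gradually.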