Distributed clustering of ubiquitous data streams

dc.contributor.author Pedro Pereira Rodrigues en
dc.contributor.author João Gama en
dc.date.accessioned 2017-11-20T10:40:44Z
dc.date.available 2017-11-20T10:40:44Z
dc.date.issued 2014 en
dc.description.abstract Nowadays, information is generated and gathered from distributed streaming data sources, which stresses communication and computing infrastructure and makes the data hard to transmit, process, and store. Knowledge discovery from ubiquitous data streams has become a major goal for all sorts of applications, mostly based on unsupervised techniques such as clustering. Two subproblems exist: clustering streaming data observations and clustering streaming data sources. The former searches for dense regions of the data space, identifying hot spots where data sources tend to produce data, while the latter finds groups of sources that behave similarly over time. To better assess the current status of this topic, this article presents a thorough review of distributed algorithms addressing either of the subproblems. We characterize clustering algorithms for ubiquitous data streams, discussing advantages and disadvantages of distributed procedures. Overall, distributed stream clustering methods improve communication ratios, processing speed, and resource consumption, while achieving clustering validity similar to that of their centralized counterparts. © 2013 John Wiley & Sons, Ltd. en
dc.identifier.uri http://repositorio.inesctec.pt/handle/123456789/3560
dc.identifier.uri http://dx.doi.org/10.1002/widm.1109 en
dc.language eng en
dc.relation 5237 en
dc.relation 5120 en
dc.rights info:eu-repo/semantics/openAccess en
dc.title Distributed clustering of ubiquitous data streams en
dc.type article en
dc.type Publication en