Please use this identifier to cite or link to this item: http://repositorio.inesctec.pt/handle/123456789/5339
Title: Concept Neurons - Handling Drift Issues for Real-Time Industrial Data Mining
Authors: Luís Moreira Matias
João Gama
João Mendes Moreira
Issue Date: 2016
Abstract: Learning from data streams is a challenge faced by data science professionals in multiple industries. Many of them struggle to apply traditional Machine Learning algorithms to these problems, and they turn to such algorithms because they are widely available in ready-to-use software libraries for big data technologies (e.g. SparkML). Nevertheless, most of these algorithms cannot cope with the key characteristics of this type of data, such as high arrival rates and/or non-stationary distributions. In this paper, we introduce a generic yet simple framework, named Concept Neurons, to fill this gap. It leverages a combination of continuous inspection schemas and residual-based updates over the model parameters and/or the model output. Such a framework can strengthen the resistance of most inductive learning algorithms to concept drift. Two distinct yet closely related flavors are introduced to handle different drift types. Experimental results from successful applications in different domains across the transportation industry are presented to uncover the hidden potential of this methodology.
URI: http://repositorio.inesctec.pt/handle/123456789/5339
http://dx.doi.org/10.1007/978-3-319-46131-1_18
Type: conferenceObject
Publication
Appears in Collections: LIAAD - Articles in International Conferences

Files in This Item:
File: P-00K-THA.pdf
Size: 449.7 kB
Format: Adobe PDF