Avoiding Anomalies in Data Stream Learning

dc.contributor.author João Gama en
dc.contributor.author Kosina, P en
dc.contributor.author Ezilda Duarte Almeida en
dc.date.accessioned 2018-01-03T10:39:10Z
dc.date.available 2018-01-03T10:39:10Z
dc.date.issued 2013 en
dc.description.abstract The presence of anomalies in data compromises data quality and can reduce the effectiveness of learning algorithms. Standard data mining methodologies treat data cleaning as a pre-processing step that precedes the learning task. The problem of data cleaning is exacerbated when learning in the computational model of data streams. In this paper we present a streaming algorithm for learning classification rules that is able to detect contextual anomalies in the data. Contextual anomalies are surprising attribute values in the context defined by the conditional part of a rule. For each example we compute the degree of anomaliness based on the probability of its attribute values given the conditional part of the rule covering the example. Examples with a high degree of anomaliness are signaled to the user and are not used to train the classifier. The experimental evaluation on real-world data sets shows the ability to discover anomalous examples in the data. The main advantage of the proposed method is its ability to report the context of an anomaly and explain why it occurs. en
dc.identifier.uri http://repositorio.inesctec.pt/handle/123456789/5365
dc.identifier.uri http://dx.doi.org/10.1007/978-3-642-40897-7_4 en
dc.language eng en
dc.relation 5120 en
dc.relation 5296 en
dc.rights info:eu-repo/semantics/openAccess en
dc.title Avoiding Anomalies in Data Stream Learning en
dc.type conferenceObject en
dc.type Publication en
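
The abstract describes scoring each example by how probable its attribute values are in the context of the rule that covers it, then withholding high-scoring examples from training. The sketch below illustrates that idea only; it is not the authors' algorithm. It assumes nominal attributes, Laplace-smoothed frequency estimates, and a score equal to the fraction of improbable attribute values. The names `RuleContext` and `process` and the parameters `p_min`, `threshold` and `warmup` are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


class RuleContext:
    """Statistics for the examples covered by one rule (hypothetical helper)."""

    def __init__(self, n_values, p_min=0.1):
        # n_values[attr] = number of distinct values the attribute can take
        self.n_values = n_values
        self.p_min = p_min  # assumed cut-off for an "improbable" value
        self.counts = defaultdict(Counter)
        self.n_seen = 0

    def prob(self, attr, value):
        # Laplace-smoothed estimate of P(attr = value | rule covers the example)
        return (self.counts[attr][value] + 1) / (self.n_seen + self.n_values[attr])

    def anomaly_score(self, example):
        # Score = fraction of attributes whose value is improbable in this context
        unlikely = [a for a, v in example.items() if self.prob(a, v) < self.p_min]
        return len(unlikely) / max(len(example), 1), unlikely

    def update(self, example):
        # Add one covered example to the per-attribute value counts
        for attr, value in example.items():
            self.counts[attr][value] += 1
        self.n_seen += 1


def process(ctx, example, threshold=0.5, warmup=30):
    """Score first; train only on examples that are not flagged."""
    if ctx.n_seen >= warmup:
        score, unlikely = ctx.anomaly_score(example)
        if score >= threshold:
            return unlikely  # report the surprising attributes, skip training
    ctx.update(example)
    return None
```

Returning the list of improbable attributes alongside the rule that covers the example is one way to give the user the context of the anomaly, which the abstract names as the main advantage of the method.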
Files
Original bundle
Name: P-008-HMP.pdf
Size: 398.33 KB
Format: Adobe Portable Document Format