Simulation of the ensemble generation process: The divergence between data and model similarity
Date
2014
Authors
Pinto, F
João Mendes Moreira
Carlos Manuel Soares
Rossetti, RJF
Abstract
In this paper we present a NetLogo simulation model of a Data Mining methodological process: ensemble classifier generation. The model makes it possible to study the trade-off between data characteristics and diversity, a key concept in Ensemble Learning. We studied the research hypothesis that data characteristics should also be taken into account when generating ensemble classifier models. The results of our experiments indicate that diversity is in fact a key concept in Ensemble Learning, but regarding our research hypothesis the findings are inconclusive.