An Empirical Methodology to Analyze the Behavior of Bagging

dc.contributor.author Fábio Hernâni Pinto en
dc.contributor.author Carlos Manuel Soares en
dc.contributor.author João Mendes Moreira en
dc.date.accessioned 2017-11-20T10:47:59Z
dc.date.available 2017-11-20T10:47:59Z
dc.date.issued 2014 en
dc.description.abstract In this paper we propose and apply a methodology to study the relationship between the performance of bagging and the characteristics of the bootstrap samples. The methodology consists of 1) an extensive set of experiments to estimate the empirical distribution of the performance of the population of all possible ensembles that can be created with those bootstraps and 2) a metalearning approach to analyze that distribution based on the characteristics of the bootstrap samples and their relationship with the complete training set. Because the population of all ensembles is very large, we show empirically that the methodology can be applied to a sample of it. We applied the methodology to 53 classification datasets for ensembles of 20 and 100 models. Our results show that diversity is crucial for an important bootstrap, and we show evidence of a metric that can measure diversity without any learning process involved. We also found evidence that the best bootstraps have a predictive power very similar to that of the training set when naive models are used. en
dc.identifier.uri http://repositorio.inesctec.pt/handle/123456789/3616
dc.identifier.uri http://dx.doi.org/10.1007/978-3-319-14717-8_16 en
dc.language eng en
dc.relation 5001 en
dc.relation 5450 en
dc.relation 5832 en
dc.rights info:eu-repo/semantics/openAccess en
dc.title An Empirical Methodology to Analyze the Behavior of Bagging en
dc.type conferenceObject en
dc.type Publication en
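
The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the pool size, the dataset size, the number of sampled ensembles, and the helper names (`bootstrap_indices`, `sample_ensembles`, `unique_fraction`) are all assumptions made for illustration. Only the ensemble size of 20 comes from the abstract. The sketch draws a pool of bootstrap samples, counts how many ensembles of a given size could be formed from that pool, and then draws a random sample of them, since enumerating the whole population is infeasible. The last helper computes one simple learning-free characteristic of a bootstrap (the fraction of distinct training rows it contains), which is an example of our own choosing and not necessarily the metric the paper proposes.

```python
import math
import random


def bootstrap_indices(n_rows, rng):
    """Draw one bootstrap sample: n_rows row indices, sampled with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def sample_ensembles(n_bootstraps, ensemble_size, n_ensembles, rng):
    """Draw n_ensembles distinct ensembles; each is a sorted tuple of bootstrap ids."""
    total = math.comb(n_bootstraps, ensemble_size)
    if n_ensembles > total:
        raise ValueError("requested more ensembles than the population contains")
    seen = set()
    while len(seen) < n_ensembles:
        members = rng.sample(range(n_bootstraps), ensemble_size)
        seen.add(tuple(sorted(members)))
    return sorted(seen)


def unique_fraction(bootstrap, n_rows):
    """Fraction of distinct training rows present in a bootstrap (illustrative only)."""
    return len(set(bootstrap)) / n_rows


rng = random.Random(0)   # fixed seed so the sketch is reproducible
N_ROWS = 150             # hypothetical training-set size
N_BOOTSTRAPS = 200       # hypothetical size of the bootstrap pool
ENSEMBLE_SIZE = 20       # one of the two ensemble sizes named in the abstract
N_SAMPLED = 1000         # hypothetical number of ensembles to evaluate

bootstraps = [bootstrap_indices(N_ROWS, rng) for _ in range(N_BOOTSTRAPS)]
population_size = math.comb(N_BOOTSTRAPS, ENSEMBLE_SIZE)
ensembles = sample_ensembles(N_BOOTSTRAPS, ENSEMBLE_SIZE, N_SAMPLED, rng)
fractions = [unique_fraction(b, N_ROWS) for b in bootstraps]

print(f"possible ensembles: {population_size:.3e}")
print(f"sampled ensembles:  {len(ensembles)}")
print(f"mean unique-row fraction: {sum(fractions) / len(fractions):.3f}")
```

In a full experiment, each sampled ensemble would be trained and scored, and the bootstrap characteristics would serve as meta-features for the metalearning step. Both steps are omitted here because the record does not describe them in enough detail.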