Title: An Empirical Methodology to Analyze the Behavior of Bagging
Authors: Fábio Hernâni Pinto
Carlos Manuel Soares
João Mendes Moreira
Issue Date: 2014
Abstract: In this paper we propose and apply a methodology to study the relationship between the performance of bagging and the characteristics of the bootstrap samples. The methodology consists of 1) an extensive set of experiments to estimate the empirical distribution of performance of the population of all possible ensembles that can be created with those bootstraps and 2) a metalearning approach to analyze that distribution based on characteristics of the bootstrap samples and their relationship with the complete training set. Because the population of all ensembles is very large, we show empirically that the methodology can be applied to a sample of it. We applied the methodology to 53 classification datasets for ensembles of 20 and 100 models. Our results show that diversity is crucial for a bootstrap to be important, and we present evidence of a metric that can measure diversity without any learning process involved. We also found evidence that the best bootstraps have a predictive power very similar to that of the training set using naive models.
metadata.dc.type: conferenceObject
Appears in Collections: CESE - Indexed Articles in Conferences
LIAAD - Indexed Articles in Conferences

Files in This Item:
File: P-00A-1F1.pdf
Size: 3.05 MB
Format: Adobe PDF
