Resource title

Hypothesis stability of randomized algorithms with an application to bootstrap methods

Resource image

Resource description

The authors prove non-asymptotic bounds on the predictive performance of randomized learning (regression or classification) methods using the notion of random hypothesis stability, which they define for such methods as a measure of how much changes in the training data influence the solution of the learning method. They apply these results to bootstrapping, in particular to bagging (bootstrap aggregating) of regression or classification methods. They prove non-asymptotic bounds on the predictive performance of bagging in certain cases and formally show that, for unstable learning methods, bagging leads to higher random hypothesis stability. The authors also derive bounds for bagging with small sub-samples drawn without replacement, known as subagging; these bounds are tighter and depend on the size of the sub-samples used. The results imply that when small sub-samples are used, the empirical error of subagging is close to its expected error, which provides a non-asymptotic justification for using the empirical error of subagging for model selection. They report experiments on a number of datasets using Support Vector Machines (SVM), decision trees, and neural networks that validate the theory and confirm that subagging with small random sub-samples is an appropriate technique for improving the stability of learning methods.
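
The following is a minimal, illustrative sketch of the subagging procedure described above: several predictors are trained on small random sub-samples drawn without replacement and their outputs are averaged. The base learner (ridge regression), function names, and parameter values are assumptions chosen for illustration and are not the authors' experimental setup.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Fit ridge regression weights on one sub-sample (illustrative base learner)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def subagging_predict(X_train, y_train, X_test, n_estimators=50,
                      subsample_frac=0.2, seed=None):
    """Average predictors trained on small sub-samples drawn without replacement."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    m = max(1, int(subsample_frac * n))  # size of each sub-sample
    preds = np.zeros(X_test.shape[0])
    for _ in range(n_estimators):
        # sampling without replacement is what distinguishes subagging from bagging
        idx = rng.choice(n, size=m, replace=False)
        w = fit_ridge(X_train[idx], y_train[idx])
        preds += X_test @ w
    return preds / n_estimators  # aggregate by averaging

# Example usage on synthetic regression data
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=200)
    y_hat = subagging_predict(X, y, X, n_estimators=50, subsample_frac=0.2, seed=1)
    print("empirical MSE of subagging:", float(np.mean((y - y_hat) ** 2)))
```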

Resource author

Resource publisher

Resource publish date

Resource language

en

Resource content type

application/pdf

Resource URL

http://flora.insead.edu/fichiersti_wp/inseadwp2002/2002-74.pdf

Resource license

Copyright INSEAD. All rights reserved