The problem of proving generalization bounds for the performance of learning algorithms can be formulated as a problem of bounding the bias and variance of estimators of the expected error. We show how various stability assumptions can be employed for this purpose. We provide a necessary and sufficient stability condition for bounding the bias and variance for the Empirical Risk Minimization algorithm, and various sufficient conditions for bounding bias and ...
Stability conditions can be thought of as a way of controlling the variance of the learning process. Strong stability conditions additionally imply concentration of certain quantities around their expected values. It was shown recently that stability of learning algorithms is closely related to their generalization and consistency. In this paper we examine stability conditions from this point of view.
In this paper we focus on the problem of estimating a bounded density using a finite combination of densities from a given class. We consider the Maximum Likelihood Procedure (MLE) and the greedy procedure described by Li and Barron. Approximation and estimation bounds are given for the above methods. We extend and improve upon the estimation results of Li and Barron, and in particular prove a bound on the estimation ...
Solutions of learning problems by Empirical Risk Minimization (ERM) -- and almost-ERM when the minimizer does not exist -- need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they might be used robustly. We propose a statistical form of leave-one-out stability, called CVEEE(loo) stability. Our main new results are two. We prove that for bounded ...
Intuitively, we expect that averaging - or bagging - different regressors with low correlation should smooth their behavior and be somewhat similar to regularization. In this note we make this intuition precise. Using an almost classical definition of stability, we prove that a certain form of averaging provides generalization bounds with a rate of convergence of the same order as Tikhonov regularization - similar to fashionable RKHS-based learning algorithms,
We present a two-step method to speed-up object detection systems in computer vision that use Support Vector Machines (SVMs) as classifiers. In a first step we perform feature reduction by choosing relevant image features according to a measure derived from statistical learning theory. In a second step we build a hierarchy of classifiers. On the bottom level, a simple and fast classifier analyzes the whole image and rejects large parts ...