Soft margins for AdaBoost

G. Rätsch, T. Onoda, Klaus Müller

Research output: Contribution to journal › Article

830 Citations (Scopus)

Abstract

Recently, ensemble methods like AdaBoost have been applied successfully to many problems, while seemingly defying the problem of overfitting. AdaBoost rarely overfits in the low-noise regime; however, we show that it clearly does so for higher noise levels. Central to understanding this fact is the margin distribution. AdaBoost can be viewed as a constrained gradient descent on an error function with respect to the margin. We find that AdaBoost asymptotically achieves a hard margin distribution, i.e. the algorithm concentrates its resources on a few hard-to-learn patterns that are, interestingly, very similar to support vectors. A hard margin is clearly a sub-optimal strategy in the noisy case, and regularization, in our case a 'mistrust' in the data, must be introduced into the algorithm to alleviate the distortions that single difficult patterns (e.g. outliers) can cause in the margin distribution. We propose several regularization methods and generalizations of the original AdaBoost algorithm to achieve a soft margin. In particular, we suggest (1) regularized AdaBoost_Reg, where the gradient descent is done directly with respect to the soft margin, and (2) regularized linear and quadratic programming (LP/QP-)AdaBoost, where the soft margin is attained by introducing slack variables. Extensive simulations demonstrate that the proposed regularized AdaBoost-type algorithms are useful and yield competitive results for noisy data.
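For orientation, the block below sketches in LaTeX the generic slack-variable construction behind a soft margin over a fixed pool of base hypotheses h_1, ..., h_T, as alluded to in item (2) of the abstract. This is an illustrative sketch only: the regularization constant C, the slack variables ξ_i, and the hypothesis weights α_t are notation chosen here, and the paper's exact LP/QP-AdaBoost and AdaBoost_Reg formulations may differ in detail.

% Illustrative soft-margin linear program over fixed base hypotheses h_1,...,h_T.
% C, xi_i (slack variables) and alpha_t (hypothesis weights) are notation chosen
% for this sketch, not necessarily the paper's own.
\begin{align*}
\max_{\rho,\,\alpha,\,\xi}\quad & \rho \;-\; C \sum_{i=1}^{N} \xi_i \\
\text{subject to}\quad & y_i \sum_{t=1}^{T} \alpha_t h_t(x_i) \;\ge\; \rho - \xi_i, \qquad i = 1, \dots, N, \\
& \xi_i \ge 0, \qquad \alpha_t \ge 0, \qquad \sum_{t=1}^{T} \alpha_t = 1.
\end{align*}

For large C the slacks are driven to zero and the program maximizes the hard margin ρ, the behaviour the abstract identifies as sub-optimal for noisy data; a moderate C allows a few difficult patterns (e.g. outliers) to violate the margin at linear cost, which is the 'mistrust' in the data that the proposed soft-margin variants introduce.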

Original language: English
Pages (from-to): 287-320
Number of pages: 34
Journal: Machine Learning
Volume: 42
Issue number: 3
DOI: 10.1023/A:1007618119488
Publication status: Published - 2001 Mar 1
Externally published: Yes

Fingerprint

  • Adaptive boosting
  • Quadratic programming
  • Linear programming

ASJC Scopus subject areas

  • Artificial Intelligence
  • Control and Systems Engineering

Cite this

Rätsch, G., Onoda, T., & Müller, K. (2001). Soft margins for AdaBoost. Machine Learning, 42(3), 287-320. https://doi.org/10.1023/A:1007618119488

@article{a909c8d006204e1ebd06845c24fc403a,
title = "Soft margins for AdaBoost",
author = "G. R{\"a}tsch and T. Onoda and Klaus M{\"u}ller",
year = "2001",
month = "3",
day = "1",
doi = "10.1023/A:1007618119488",
language = "English",
volume = "42",
pages = "287--320",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer Netherlands",
number = "3",
}
