Regularization of case-specific parameters for robustness and efficiency

Yoonkyung Lee, Steven N. MacEachern, Yoonsuh Jung

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Regularization methods allow one to handle a variety of inferential problems where there are more covariates than cases. This allows one to consider a potentially enormous number of covariates for a problem. We exploit the power of these techniques, supersaturating models by augmenting the "natural" covariates in the problem with an additional indicator for each case in the data set. We attach a penalty term for these case-specific indicators which is designed to produce a desired effect. For regression methods with squared error loss, an ℓ1 penalty produces a regression which is robust to outliers and high leverage cases; for quantile regression methods, an ℓ2 penalty decreases the variance of the fit enough to overcome an increase in bias. The paradigm thus allows us to robustify procedures which lack robustness and to increase the efficiency of procedures which are robust. We provide a general framework for the inclusion of case-specific parameters in regularization problems, describing the impact on the effective loss for a variety of regression and classification problems. We outline a computational strategy by which existing software can be modified to solve the augmented regularization problem, providing conditions under which such modification will converge to the optimum solution. We illustrate the benefits of including case-specific parameters in the context of mean regression and quantile regression through analysis of NHANES and linguistic data sets.
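The case-indicator idea for squared error loss can be sketched numerically. Augmenting the design matrix with one indicator per case and putting an ℓ1 penalty on those indicators amounts to alternating a least squares fit with soft-thresholding of the residuals, which clips the influence of outlying cases (the Huber-type robustness the abstract describes). The sketch below is illustrative only, assuming a simple block-coordinate scheme; the function names and data are not from the paper.

```python
import numpy as np

def soft_threshold(r, lam):
    """Elementwise soft-thresholding: the closed-form l1-penalized update."""
    return np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)

def case_specific_ls(X, y, lam, n_iter=100):
    """Least squares with l1-penalized case-specific indicators gamma.

    Minimizes 0.5 * ||y - X beta - gamma||^2 + lam * ||gamma||_1 by
    alternating (1) an ordinary least squares fit to y - gamma and
    (2) soft-thresholding of the residuals to update gamma.
    """
    n, p = X.shape
    gamma = np.zeros(n)
    beta = np.zeros(p)
    for _ in range(n_iter):
        # (1) beta-step: OLS on the gamma-adjusted response
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        # (2) gamma-step: residuals larger than lam absorb into gamma,
        # so extreme cases no longer drag the regression fit
        gamma = soft_threshold(y - X @ beta, lam)
    return beta, gamma
```

On clean data every gamma component is thresholded to zero and the fit reduces to OLS; a gross outlier instead picks up a nonzero indicator and is effectively downweighted.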

Original language: English
Pages (from-to): 350-372
Number of pages: 23
Journal: Statistical Science
Volume: 27
Issue number: 3
DOI: 10.1214/11-STS377
Publication status: Published - 2012 Dec 31
Externally published: Yes


Keywords

  • Case indicator
  • Large margin classifier
  • Lasso
  • Leverage point
  • Outlier
  • Penalized method
  • Quantile regression

ASJC Scopus subject areas

  • Statistics and Probability
  • Mathematics(all)
  • Statistics, Probability and Uncertainty

Cite this

@article{d6149ea588c844a29b7702e46ac5cfea,
title = "Regularization of case-specific parameters for robustness and efficiency",
abstract = "Regularization methods allow one to handle a variety of inferential problems where there are more covariates than cases. This allows one to consider a potentially enormous number of covariates for a problem. We exploit the power of these techniques, supersaturating models by augmenting the {"}natural{"} covariates in the problem with an additional indicator for each case in the data set. We attach a penalty term for these case-specific indicators which is designed to produce a desired effect. For regression methods with squared error loss, an ℓ1 penalty produces a regression which is robust to outliers and high leverage cases; for quantile regression methods, an ℓ2 penalty decreases the variance of the fit enough to overcome an increase in bias. The paradigm thus allows us to robustify procedures which lack robustness and to increase the efficiency of procedures which are robust. We provide a general framework for the inclusion of case-specific parameters in regularization problems, describing the impact on the effective loss for a variety of regression and classification problems. We outline a computational strategy by which existing software can be modified to solve the augmented regularization problem, providing conditions under which such modification will converge to the optimum solution. We illustrate the benefits of including case-specific parameters in the context of mean regression and quantile regression through analysis of NHANES and linguistic data sets.",
keywords = "Case indicator, Large margin classifier, Lasso, Leverage point, Outlier, Penalized method, Quantile regression",
author = "Yoonkyung Lee and MacEachern, {Steven N.} and Yoonsuh Jung",
year = "2012",
month = "12",
day = "31",
doi = "10.1214/11-STS377",
language = "English",
volume = "27",
pages = "350--372",
journal = "Statistical Science",
issn = "0883-4237",
publisher = "Institute of Mathematical Statistics",
number = "3",
}
