Trading Variance Reduction with Unbiasedness: The Regularized Subspace Information Criterion for Robust Model Selection in Kernel Regression

Masashi Sugiyama, Motoaki Kawanabe, Klaus-Robert Müller

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we stabilize unbiased generalization error estimates by regularization and thereby obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilizing effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error and propose determining the degree of regularization of SIC such that this estimator is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method effectively improves the precision of SIC, especially in high-noise-level cases. We furthermore compare the proposed method to the original SIC, cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results.
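
As a reading aid, the following is a minimal sketch of the quantities the abstract refers to, in the standard SIC setting of Sugiyama and Ogawa (2001). The symbols below ($y$, $z$, $X$, $X_u$, $\theta^\ast$, $\gamma$) are our own notation, not taken from this record, and $\|\cdot\|$ denotes the norm the paper works with (the reproducing kernel Hilbert space norm, which for kernel regression becomes a kernel-weighted Euclidean norm on the parameter vector). Given noisy outputs $y = z + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$, a linear estimator $\hat{\theta} = Xy$, and an unbiased learning matrix $X_u$ satisfying $X_u z = \theta^\ast$, the expected generalization error and SIC read

% Hedged sketch of the SIC construction, not the paper's exact derivation.
\[
J(X) = \mathbb{E}_{\epsilon}\,\|Xy - \theta^\ast\|^2
     = \|Xz - \theta^\ast\|^2 + \sigma^2 \operatorname{tr}(XX^\top),
\]
\[
\mathrm{SIC}(X) = \|Xy - X_u y\|^2
  - \sigma^2 \operatorname{tr}\bigl((X - X_u)(X - X_u)^\top\bigr)
  + \sigma^2 \operatorname{tr}(XX^\top),
\]

and $\mathbb{E}_{\epsilon}[\mathrm{SIC}(X)] = J(X)$, because $X_u z = \theta^\ast$ implies $\mathbb{E}_{\epsilon}\|Xy - X_u y\|^2 = \|Xz - \theta^\ast\|^2 + \sigma^2 \operatorname{tr}((X - X_u)(X - X_u)^\top)$. The regularized SIC of this paper replaces $X_u$ with a regularized $X_u^{(\gamma)}$, accepting a small bias in the criterion in exchange for a lower variance; the degree of regularization $\gamma$ is then chosen to minimize an unbiased estimate of the expected squared error $\mathbb{E}_{\epsilon}[(\mathrm{SIC} - J)^2]$.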

Original language: English
Pages (from-to): 1077-1104
Number of pages: 28
Journal: Neural Computation
Volume: 16
Issue number: 5
DOIs: 10.1162/089976604773135113
Publication status: Published - 2004 May 1
Externally published: Yes

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Neuroscience (all)

Cite this

Trading Variance Reduction with Unbiasedness: The Regularized Subspace Information Criterion for Robust Model Selection in Kernel Regression. / Sugiyama, Masashi; Kawanabe, Motoaki; Müller, Klaus-Robert.

In: Neural Computation, Vol. 16, No. 5, 01.05.2004, p. 1077-1104.

Research output: Contribution to journal › Article

@article{c3b72ec544224c68bae21bb7d5a9db7c,
title = "Trading Variance Reduction with Unbiasedness: The Regularized Subspace Information Criterion for Robust Model Selection in Kernel Regression",
abstract = "A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we stabilize unbiased generalization error estimates by regularization and thereby obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilizing effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error and propose determining the degree of regularization of SIC such that this estimator is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method effectively improves the precision of SIC, especially in high-noise-level cases. We furthermore compare the proposed method to the original SIC, cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results.",
author = "Masashi Sugiyama and Motoaki Kawanabe and Klaus-Robert M{\"u}ller",
year = "2004",
month = "5",
day = "1",
doi = "10.1162/089976604773135113",
language = "English",
volume = "16",
pages = "1077--1104",
journal = "Neural Computation",
issn = "0899-7667",
publisher = "MIT Press Journals",
number = "5",

}

TY - JOUR

T1 - Trading Variance Reduction with Unbiasedness

T2 - The Regularized Subspace Information Criterion for Robust Model Selection in Kernel Regression

AU - Sugiyama, Masashi

AU - Kawanabe, Motoaki

AU - Müller, Klaus-Robert

PY - 2004/5/1

Y1 - 2004/5/1

AB - A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we stabilize unbiased generalization error estimates by regularization and thereby obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilizing effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error and propose determining the degree of regularization of SIC such that this estimator is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method effectively improves the precision of SIC, especially in high-noise-level cases. We furthermore compare the proposed method to the original SIC, cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results.

UR - http://www.scopus.com/inward/record.url?scp=1842733198&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=1842733198&partnerID=8YFLogxK

U2 - 10.1162/089976604773135113

DO - 10.1162/089976604773135113

M3 - Article

C2 - 15070511

AN - SCOPUS:1842733198

VL - 16

SP - 1077

EP - 1104

JO - Neural Computation

JF - Neural Computation

SN - 0899-7667

IS - 5

ER -