A K-fold averaging cross-validation procedure

Yoonsuh Jung, Jianhua Hu

Research output: Contribution to journal › Article

39 Citations (Scopus)

Abstract

Cross-validation (CV) type of methods have been widely used to facilitate model estimation and variable selection. In this work, we suggest a new K-fold CV procedure to select a candidate ‘optimal’ model from each hold-out fold and average the K candidate ‘optimal’ models to obtain the ultimate model. Due to the averaging effect, the variance of the proposed estimates can be significantly reduced. This new procedure results in more stable and efficient parameter estimation than the classical K-fold CV procedure. In addition, we show the asymptotic equivalence between the proposed and classical CV procedures in the linear regression setting. We also demonstrate the broad applicability of the proposed procedure via two examples of parameter sparsity regularisation and quantile smoothing splines modelling. We illustrate the promise of the proposed method through simulations and a real data example.
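The procedure described above can be sketched in a few lines of code. The following is a minimal illustration only, using ridge regression over a small tuning grid as the candidate model family; the function names (`ridge_fit`, `averaging_cv`), the tuning grid, and the toy data are assumptions made for the sketch, not the authors' implementation. The key step is that each held-out fold selects its own 'optimal' candidate, and the K winning coefficient vectors are averaged at the end, rather than a single tuning value being chosen by minimising the average validation error.

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge estimate: (X'X + lam * I)^{-1} X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def averaging_cv(X, y, lambdas, K=5, seed=0):
    """K-fold averaging CV (sketch): pick the best candidate model on
    each held-out fold, then average the K winning coefficient vectors."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, K)
    winners = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        # Fit one candidate model per tuning value on the training part
        fits = [ridge_fit(X[train], y[train], lam) for lam in lambdas]
        # Candidate 'optimal' model: smallest error on the held-out fold
        errs = [np.mean((y[test] - X[test] @ b) ** 2) for b in fits]
        winners.append(fits[int(np.argmin(errs))])
    # Averaging the K candidate models reduces the variance of the estimate
    return np.mean(winners, axis=0)

# Toy data: 100 observations, 5 predictors, Gaussian noise
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
beta = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = X @ beta + rng.standard_normal(100)
b_avg = averaging_cv(X, y, lambdas=[0.01, 0.1, 1.0, 10.0])
```

By contrast, the classical K-fold CV would choose a single tuning value minimising the validation error averaged over all K folds and refit once on the full data; the averaging variant instead keeps one fitted model per fold and combines them.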

Original language: English
Pages (from-to): 167-179
Number of pages: 13
Journal: Journal of Nonparametric Statistics
Volume: 27
Issue number: 2
DOI: 10.1080/10485252.2015.1010532
Publication status: Published - 2015 Apr 3
Externally published: Yes

Keywords

  • cross-validation
  • model averaging
  • model selection

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

A K-fold averaging cross-validation procedure. / Jung, Yoonsuh; Hu, Jianhua.

In: Journal of Nonparametric Statistics, Vol. 27, No. 2, 03.04.2015, p. 167-179.


@article{4b7d1d929b1045b4bd69e4b14ffa6c49,
title = "A K-fold averaging cross-validation procedure",
abstract = "Cross-validation (CV) type of methods have been widely used to facilitate model estimation and variable selection. In this work, we suggest a new K-fold CV procedure to select a candidate ‘optimal’ model from each hold-out fold and average the K candidate ‘optimal’ models to obtain the ultimate model. Due to the averaging effect, the variance of the proposed estimates can be significantly reduced. This new procedure results in more stable and efficient parameter estimation than the classical K-fold CV procedure. In addition, we show the asymptotic equivalence between the proposed and classical CV procedures in the linear regression setting. We also demonstrate the broad applicability of the proposed procedure via two examples of parameter sparsity regularisation and quantile smoothing splines modelling. We illustrate the promise of the proposed method through simulations and a real data example.",
keywords = "cross-validation, model averaging, model selection",
author = "Yoonsuh Jung and Jianhua Hu",
year = "2015",
month = "4",
day = "3",
doi = "10.1080/10485252.2015.1010532",
language = "English",
volume = "27",
pages = "167--179",
journal = "Journal of Nonparametric Statistics",
issn = "1048-5252",
publisher = "Taylor and Francis Ltd.",
number = "2",

}

TY - JOUR

T1 - A K-fold averaging cross-validation procedure

AU - Jung, Yoonsuh

AU - Hu, Jianhua

PY - 2015/4/3

Y1 - 2015/4/3

N2 - Cross-validation (CV) type of methods have been widely used to facilitate model estimation and variable selection. In this work, we suggest a new K-fold CV procedure to select a candidate ‘optimal’ model from each hold-out fold and average the K candidate ‘optimal’ models to obtain the ultimate model. Due to the averaging effect, the variance of the proposed estimates can be significantly reduced. This new procedure results in more stable and efficient parameter estimation than the classical K-fold CV procedure. In addition, we show the asymptotic equivalence between the proposed and classical CV procedures in the linear regression setting. We also demonstrate the broad applicability of the proposed procedure via two examples of parameter sparsity regularisation and quantile smoothing splines modelling. We illustrate the promise of the proposed method through simulations and a real data example.

AB - Cross-validation (CV) type of methods have been widely used to facilitate model estimation and variable selection. In this work, we suggest a new K-fold CV procedure to select a candidate ‘optimal’ model from each hold-out fold and average the K candidate ‘optimal’ models to obtain the ultimate model. Due to the averaging effect, the variance of the proposed estimates can be significantly reduced. This new procedure results in more stable and efficient parameter estimation than the classical K-fold CV procedure. In addition, we show the asymptotic equivalence between the proposed and classical CV procedures in the linear regression setting. We also demonstrate the broad applicability of the proposed procedure via two examples of parameter sparsity regularisation and quantile smoothing splines modelling. We illustrate the promise of the proposed method through simulations and a real data example.

KW - cross-validation

KW - model averaging

KW - model selection

UR - http://www.scopus.com/inward/record.url?scp=84929268805&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84929268805&partnerID=8YFLogxK

U2 - 10.1080/10485252.2015.1010532

DO - 10.1080/10485252.2015.1010532

M3 - Article

AN - SCOPUS:84929268805

VL - 27

SP - 167

EP - 179

JO - Journal of Nonparametric Statistics

JF - Journal of Nonparametric Statistics

SN - 1048-5252

IS - 2

ER -