Asymptotic statistical theory of overtraining and cross-validation

Shun-ichi Amari, Noboru Murata, Klaus Müller, Michael Finke, Howard Hua Yang

Research output: Contribution to journal › Article

236 Citations (Scopus)

Abstract

A statistical theory for overtraining is proposed. The analysis treats general realizable stochastic neural networks trained with the Kullback-Leibler divergence in the asymptotic case of a large number of training examples. It is shown that the asymptotic gain in generalization error from early stopping is small, even if the optimal stopping time is known. Considering cross-validation stopping, we answer the question: in what ratio should the examples be divided into training and cross-validation sets in order to obtain optimum performance? Although cross-validated early stopping is useless in the asymptotic region, it does decrease the generalization error in the nonasymptotic region. Our large-scale simulations performed on a CM-5 are in good agreement with the analytical findings.
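
For readers who want a concrete picture of the procedure analyzed in the paper, the sketch below shows generic cross-validated early stopping on a simple realizable logistic model trained by gradient descent on the negative log-likelihood (the empirical counterpart of the Kullback-Leibler loss). The model, the split ratio r, and the patience rule are illustrative assumptions for this sketch, not the networks, the optimal split ratio, or the stopping criterion derived in the paper.

    # Minimal sketch of cross-validated early stopping (illustrative, not the
    # paper's exact protocol). A logistic model is fit by gradient descent on
    # the negative log-likelihood; training stops once the loss on a held-out
    # validation set stops improving.
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic realizable task: labels generated by a "true" logistic model.
    n, d = 2000, 20
    w_true = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ w_true)))

    r = 0.2                      # fraction held out for validation (illustrative choice)
    n_val = int(r * n)
    X_tr, y_tr = X[n_val:], y[n_val:]
    X_va, y_va = X[:n_val], y[:n_val]

    def nll(w, X, y):
        """Average negative log-likelihood (empirical KL loss up to a constant)."""
        z = X @ w
        return np.mean(np.log1p(np.exp(z)) - y * z)

    def grad(w, X, y):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        return X.T @ (p - y) / len(y)

    w = np.zeros(d)
    lr, patience = 0.1, 20
    best_val, best_w, wait = np.inf, w.copy(), 0

    for t in range(5000):
        w -= lr * grad(w, X_tr, y_tr)
        val = nll(w, X_va, y_va)
        if val < best_val - 1e-6:    # validation loss still improving
            best_val, best_w, wait = val, w.copy(), 0
        else:                        # stop once it has stalled for `patience` steps
            wait += 1
            if wait >= patience:
                break

    w = best_w                       # keep the weights from the best validation point
    print(f"stopped at step {t}, validation NLL = {best_val:.4f}")

Holding out a fixed fraction r of the examples and stopping when the validation loss stalls is the generic form of the procedure whose asymptotic benefit, and optimal split ratio, the paper quantifies.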

Original language: English
Pages (from-to): 985-996
Number of pages: 12
Journal: IEEE Transactions on Neural Networks
Volume: 8
Issue number: 5
DOI: https://doi.org/10.1109/72.623200
Publication status: Published - 1997 Dec 1
Externally published: Yes

Keywords

  • Asymptotic analysis
  • Cross-validation
  • Early stopping
  • Generalization
  • Overtraining
  • Stochastic neural networks

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Theoretical Computer Science
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Hardware and Architecture

Cite this

Amari, S. I., Murata, N., Müller, K., Finke, M., & Yang, H. H. (1997). Asymptotic statistical theory of overtraining and cross-validation. IEEE Transactions on Neural Networks, 8(5), 985-996. https://doi.org/10.1109/72.623200