Asymptotic Bayesian generalization error when training and test distributions are different

Keisuke Yamazaki, Motoaki Kawanabe, Sumio Watanabe, Masashi Sugiyama, Klaus-Robert Müller

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

13 Citations (Scopus)

Abstract

In supervised learning, we commonly assume that training and test data are sampled from the same distribution. However, this assumption can be violated in practice, and standard machine learning techniques then perform poorly. This paper focuses on revealing and improving the performance of Bayesian estimation when the training and test distributions are different. We formally analyze the asymptotic Bayesian generalization error and establish its upper bound under a very general setting. Our key finding is that lower-order terms, which can be ignored in the absence of the distribution change, play an important role under the distribution change. We also propose a novel variant of stochastic complexity that can be used for choosing an appropriate model and hyper-parameters under a particular distribution change.
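As a companion to the abstract, here is a minimal LaTeX sketch of the covariate-shift setting it describes. The notation (q_0, q_1, D_n, G(n), F(n), phi) is assumed here for illustration and is not quoted from the paper; the paper's proposed variant of stochastic complexity is only alluded to, not reproduced.

% Sketch of the setting described in the abstract; all notation is
% assumed for illustration, not taken verbatim from the paper.
\documentclass{article}
\usepackage{amsmath}
\usepackage{amssymb}
\begin{document}
% Covariate shift: training inputs follow q_0(x), test inputs follow
% q_1(x) with q_1 != q_0, and outputs follow the same conditional
% q(y|x) in both phases.
Let $D_n = \{(x_i, y_i)\}_{i=1}^{n}$ be drawn i.i.d.\ from
$q_0(x)\, q(y \mid x)$, while test pairs follow $q_1(x)\, q(y \mid x)$.
Writing $p(y \mid x, D_n)$ for the Bayesian predictive distribution,
the generalization error is the expected Kullback--Leibler divergence
taken under the \emph{test} input density:
\begin{equation*}
  G(n) = \mathbb{E}_{D_n}\!\left[ \iint q_1(x)\, q(y \mid x)
  \log \frac{q(y \mid x)}{p(y \mid x, D_n)} \, dy \, dx \right].
\end{equation*}
% Standard stochastic complexity (negative log marginal likelihood);
% the paper proposes a variant of this quantity adapted to the
% distribution change, whose exact form is given in the paper itself.
For a model $p(y \mid x, w)$ with prior $\varphi(w)$, the usual
stochastic complexity is
\begin{equation*}
  F(n) = -\log \int \prod_{i=1}^{n} p(y_i \mid x_i, w)\, \varphi(w)\, dw.
\end{equation*}
When $q_0 = q_1$, lower-order terms in the asymptotic expansion of
$G(n)$ are negligible; the abstract's point is that under the shift
they are not.
\end{document}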

Original language: English
Title of host publication: ACM International Conference Proceeding Series
Pages: 1079-1086
Number of pages: 8
Volume: 227
DOI: 10.1145/1273496.1273632
Publication status: Published - 2007 Aug 23
Externally published: Yes
Event: 24th International Conference on Machine Learning, ICML 2007, Corvallis, OR, United States
Duration: 2007 Jun 20 to 2007 Jun 24


Fingerprint

  • Supervised learning
  • Learning systems

ASJC Scopus subject areas

  • Human-Computer Interaction

Cite this

Yamazaki, K., Kawanabe, M., Watanabe, S., Sugiyama, M., & Müller, K.-R. (2007). Asymptotic Bayesian generalization error when training and test distributions are different. In ACM International Conference Proceeding Series (Vol. 227, pp. 1079-1086). https://doi.org/10.1145/1273496.1273632
