Asynchronous action-reward learning for nonstationary serial supply chain inventory control

Chang Ouk Kim, Ick Hyun Kwon, Jun-Geol Baek

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

Action-reward learning is a reinforcement learning method. In this machine learning approach, an agent interacts with a non-deterministic control domain. The agent selects actions at decision epochs, and the control domain yields rewards with which the performance measures of the actions are updated. The agent's objective is to select the best future actions based on the updated performance measures. In this paper, we develop an asynchronous action-reward learning model that updates the performance measures of actions faster than conventional action-reward learning. This learning model is suitable for nonstationary control domains, where the rewards for actions vary over time. Based on asynchronous action-reward learning, two situation-reactive inventory control models (centralized and decentralized) are proposed for a two-stage serial supply chain with nonstationary customer demand. A simulation-based experiment was performed to evaluate the performance of the two proposed models.
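The abstract describes a generic action-reward loop: at each decision epoch the agent picks an action, the control domain returns a reward, and the performance measure of that action is updated. A minimal sketch of that loop, under illustrative assumptions, is below. It uses a constant step size (which weights recent rewards more heavily, a common choice for nonstationary domains) and epsilon-greedy selection; it is not the paper's asynchronous update scheme, and all names (`action_reward_learner`, `drifting_reward`) are hypothetical.

```python
import random

def action_reward_learner(actions, reward_fn, epochs=1000,
                          alpha=0.2, epsilon=0.1, seed=0):
    """Generic (conventional, synchronous) action-reward loop:
    only the selected action's performance measure is updated."""
    rng = random.Random(seed)
    q = {a: 0.0 for a in actions}  # performance measure per action
    for t in range(epochs):
        # epsilon-greedy: explore occasionally, otherwise exploit
        if rng.random() < epsilon:
            a = rng.choice(actions)
        else:
            a = max(q, key=q.get)
        r = reward_fn(a, t)           # control domain emits a reward
        q[a] += alpha * (r - q[a])    # recency-weighted incremental update
    return q

# Toy nonstationary domain: the rewarding action switches at epoch 500.
noise = random.Random(1)
def drifting_reward(action, t):
    best = "A" if t < 500 else "B"
    return (1.0 if action == best else 0.2) + noise.gauss(0, 0.05)

q = action_reward_learner(["A", "B"], drifting_reward)
print(q)  # measure for "B" should dominate after the switch
```

Because the constant step size forgets old rewards exponentially, the learner tracks the drift: after the reward structure switches, the measure for the newly best action overtakes the stale one within a few dozen epochs. The paper's asynchronous variant aims to make this tracking faster still by updating performance measures more frequently than one-action-per-reward.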

Original language: English
Pages (from-to): 1-16
Number of pages: 16
Journal: Applied Intelligence
Volume: 28
Issue number: 1
DOI: 10.1007/s10489-007-0038-2
Publication status: Published - 2008 Feb 1
Externally published: Yes

Fingerprint

  • Inventory control
  • Supply chains
  • Reinforcement learning
  • Learning systems
  • Experiments

Keywords

  • Action reward learning
  • Asynchronous performance measure update
  • Machine learning
  • Nonstationary customer demand
  • Situation reactive inventory control
  • Two-stage serial supply chain

ASJC Scopus subject areas

  • Artificial Intelligence
  • Control and Systems Engineering

Cite this

Asynchronous action-reward learning for nonstationary serial supply chain inventory control. / Kim, Chang Ouk; Kwon, Ick Hyun; Baek, Jun-Geol.

In: Applied Intelligence, Vol. 28, No. 1, 01.02.2008, p. 1-16.

Research output: Contribution to journal › Article

@article{987db1f931514dc7825b5f0db6d0deb1,
title = "Asynchronous action-reward learning for nonstationary serial supply chain inventory control",
abstract = "Action-reward learning is a reinforcement learning method. In this machine learning approach, an agent interacts with a non-deterministic control domain. The agent selects actions at decision epochs, and the control domain yields rewards with which the performance measures of the actions are updated. The agent's objective is to select the best future actions based on the updated performance measures. In this paper, we develop an asynchronous action-reward learning model that updates the performance measures of actions faster than conventional action-reward learning. This learning model is suitable for nonstationary control domains, where the rewards for actions vary over time. Based on asynchronous action-reward learning, two situation-reactive inventory control models (centralized and decentralized) are proposed for a two-stage serial supply chain with nonstationary customer demand. A simulation-based experiment was performed to evaluate the performance of the two proposed models.",
keywords = "Action reward learning, Asynchronous performance measure update, Machine learning, Nonstationary customer demand, Situation reactive inventory control, Two-stage serial supply chain",
author = "Kim, {Chang Ouk} and Kwon, {Ick Hyun} and Jun-Geol Baek",
year = "2008",
month = "2",
day = "1",
doi = "10.1007/s10489-007-0038-2",
language = "English",
volume = "28",
pages = "1--16",
journal = "Applied Intelligence",
issn = "0924-669X",
publisher = "Springer Netherlands",
number = "1",

}

TY - JOUR

T1 - Asynchronous action-reward learning for nonstationary serial supply chain inventory control

AU - Kim, Chang Ouk

AU - Kwon, Ick Hyun

AU - Baek, Jun-Geol

PY - 2008/2/1

Y1 - 2008/2/1

N2 - Action-reward learning is a reinforcement learning method. In this machine learning approach, an agent interacts with a non-deterministic control domain. The agent selects actions at decision epochs, and the control domain yields rewards with which the performance measures of the actions are updated. The agent's objective is to select the best future actions based on the updated performance measures. In this paper, we develop an asynchronous action-reward learning model that updates the performance measures of actions faster than conventional action-reward learning. This learning model is suitable for nonstationary control domains, where the rewards for actions vary over time. Based on asynchronous action-reward learning, two situation-reactive inventory control models (centralized and decentralized) are proposed for a two-stage serial supply chain with nonstationary customer demand. A simulation-based experiment was performed to evaluate the performance of the two proposed models.

AB - Action-reward learning is a reinforcement learning method. In this machine learning approach, an agent interacts with a non-deterministic control domain. The agent selects actions at decision epochs, and the control domain yields rewards with which the performance measures of the actions are updated. The agent's objective is to select the best future actions based on the updated performance measures. In this paper, we develop an asynchronous action-reward learning model that updates the performance measures of actions faster than conventional action-reward learning. This learning model is suitable for nonstationary control domains, where the rewards for actions vary over time. Based on asynchronous action-reward learning, two situation-reactive inventory control models (centralized and decentralized) are proposed for a two-stage serial supply chain with nonstationary customer demand. A simulation-based experiment was performed to evaluate the performance of the two proposed models.

KW - Action reward learning

KW - Asynchronous performance measure update

KW - Machine learning

KW - Nonstationary customer demand

KW - Situation reactive inventory control

KW - Two-stage serial supply chain

UR - http://www.scopus.com/inward/record.url?scp=37649013265&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=37649013265&partnerID=8YFLogxK

U2 - 10.1007/s10489-007-0038-2

DO - 10.1007/s10489-007-0038-2

M3 - Article

VL - 28

SP - 1

EP - 16

JO - Applied Intelligence

JF - Applied Intelligence

SN - 0924-669X

IS - 1

ER -