TY - JOUR
T1 - Hierarchical End-to-end Control Policy for Multi-degree-of-freedom Manipulators
AU - Min, Cheol Hui
AU - Song, Jae Bok
N1 - Funding Information:
This research was supported by the MOTIE under the Industrial Foundation Technology Development Program supervised by the KEIT (No. 20008613).
Publisher Copyright:
© 2022, ICROS, KIEE and Springer.
PY - 2022/10
Y1 - 2022/10
N2 - In recent years, several control policies for a multi-degree-of-freedom (DOF) manipulator using deep reinforcement learning have been proposed. To avoid complexity, previous studies have applied a number of constraints on the high-dimensional state-action space, thus hindering generalized policy function learning. In this study, the control problem is addressed by introducing a hierarchical reinforcement learning method that can learn the end-to-end control policy of a multi-DOF manipulator without any constraints on the state-action space. The proposed method learns hierarchical policy using two off-policy methods. Using human demonstration data and a newly proposed data-correction method, controlling the multi-DOF manipulator in an end-to-end manner is shown to outperform the non-hierarchical deep reinforcement learning methods.
AB - In recent years, several control policies for a multi-degree-of-freedom (DOF) manipulator using deep reinforcement learning have been proposed. To avoid complexity, previous studies have applied a number of constraints on the high-dimensional state-action space, thus hindering generalized policy function learning. In this study, the control problem is addressed by introducing a hierarchical reinforcement learning method that can learn the end-to-end control policy of a multi-DOF manipulator without any constraints on the state-action space. The proposed method learns hierarchical policy using two off-policy methods. Using human demonstration data and a newly proposed data-correction method, controlling the multi-DOF manipulator in an end-to-end manner is shown to outperform the non-hierarchical deep reinforcement learning methods.
KW - Deep reinforcement learning
KW - demonstration-based learning
KW - end-to-end robot control
KW - hierarchical reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85137046547&partnerID=8YFLogxK
U2 - 10.1007/s12555-021-0511-4
DO - 10.1007/s12555-021-0511-4
M3 - Article
AN - SCOPUS:85137046547
VL - 20
SP - 3296
EP - 3311
JO - International Journal of Control, Automation and Systems
JF - International Journal of Control, Automation and Systems
SN - 1598-6446
IS - 10
ER -