TY - GEN
T1 - Hierarchical Reinforcement Learning using Gaussian Random Trajectory Generation in Autonomous Furniture Assembly
AU - Yun, Won Joon
AU - Mohaisen, David
AU - Jung, Soyi
AU - Kim, Jong Kook
AU - Kim, Joongheon
N1 - Funding Information:
This work was supported by Samsung Electronics (IO201208-07855-01) and also by MSIT, Korea, under ITRC (IITP-2022-2017-0-01637) supervised by IITP. The authors thank Mr. MyungJae Shin for his contribution to research initiation during his master's study under the guidance of Prof. Joongheon Kim. Soyi Jung, Jong-Kook Kim, and Joongheon Kim are the corresponding authors.
Publisher Copyright:
© 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - In this paper, we propose a Gaussian Random Trajectory guided Hierarchical Reinforcement Learning (GRT-HL) method for autonomous furniture assembly. The furniture assembly problem is formulated as a comprehensive human-like long-horizon manipulation task that requires long-term planning and sophisticated control. Our proposed model, GRT-HL, draws inspiration from semi-supervised adversarial autoencoders and learns latent representations of the position trajectories of the end-effector. The high-level policy generates an optimal trajectory for furniture assembly, considering the structural limitations of the robotic agents. Given the trajectory drawn from the high-level policy, the low-level policy makes a plan and controls the end-effector. We first evaluate the performance of GRT-HL compared to state-of-the-art reinforcement learning methods on furniture assembly tasks. We demonstrate that GRT-HL successfully solves the long-horizon problem with extremely sparse rewards by generating the trajectory for planning.
AB - In this paper, we propose a Gaussian Random Trajectory guided Hierarchical Reinforcement Learning (GRT-HL) method for autonomous furniture assembly. The furniture assembly problem is formulated as a comprehensive human-like long-horizon manipulation task that requires long-term planning and sophisticated control. Our proposed model, GRT-HL, draws inspiration from semi-supervised adversarial autoencoders and learns latent representations of the position trajectories of the end-effector. The high-level policy generates an optimal trajectory for furniture assembly, considering the structural limitations of the robotic agents. Given the trajectory drawn from the high-level policy, the low-level policy makes a plan and controls the end-effector. We first evaluate the performance of GRT-HL compared to state-of-the-art reinforcement learning methods on furniture assembly tasks. We demonstrate that GRT-HL successfully solves the long-horizon problem with extremely sparse rewards by generating the trajectory for planning.
KW - assembly control
KW - hierarchical reinforcement learning
KW - reinforcement learning
KW - robotics
UR - http://www.scopus.com/inward/record.url?scp=85140836688&partnerID=8YFLogxK
U2 - 10.1145/3511808.3557078
DO - 10.1145/3511808.3557078
M3 - Conference contribution
AN - SCOPUS:85140836688
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 3624
EP - 3633
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Y2 - 17 October 2022 through 21 October 2022
ER -