Abstract
This paper proposes a deep Q-network (DQN)-based method for the workload partition problem in OpenCL. The DQN, a reinforcement learning algorithm, optimizes the workload partition for each processing unit through self-training on performance data accumulated in the computing environment. Our experiments show that, in JPEG decoding, the DQN-based partition improves performance by up to 62.2% and 6.9% compared with the LuxMark-based and target-based partitions, respectively. The DQN captures low-level contention in slave devices, such as in caches and memory, as well as the communication bottleneck between devices, and reflects both in the workload partition ratio.
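
The idea of learning a partition ratio from observed run times can be illustrated with a much smaller stand-in. The sketch below is not the paper's method: it replaces the deep Q-network with a single-state, tabular Q-learning loop, and it replaces real OpenCL measurements with a synthetic cost model. The device rates, the transfer cost, and the names `run_time` and `train` are all assumptions made for illustration.

```python
import random

# Hypothetical device throughputs (work items per ms); not taken from the paper.
CPU_RATE = 1.0
GPU_RATE = 3.0
TRANSFER_COST = 0.02  # assumed per-item cost (ms) of moving data to the GPU

# Candidate actions: fraction of the workload sent to the GPU (0.0, 0.1, ..., 1.0).
RATIOS = [i / 10 for i in range(11)]


def run_time(gpu_fraction, items=1000):
    """Synthetic stand-in for a measured execution time in ms."""
    gpu_items = gpu_fraction * items
    cpu_items = items - gpu_items
    gpu_time = gpu_items / GPU_RATE + gpu_items * TRANSFER_COST
    cpu_time = cpu_items / CPU_RATE
    # Both devices work concurrently, so the slower one sets the total time.
    return max(cpu_time, gpu_time)


def train(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    """Single-state Q-learning: one Q-value per candidate partition ratio."""
    rng = random.Random(seed)
    q = [0.0] * len(RATIOS)
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(len(RATIOS))  # explore a random ratio
        else:
            action = max(range(len(RATIOS)), key=q.__getitem__)  # exploit
        reward = -run_time(RATIOS[action])  # shorter run time, higher reward
        q[action] += alpha * (reward - q[action])  # move estimate toward reward
    return q


q_values = train()
best = RATIOS[max(range(len(RATIOS)), key=q_values.__getitem__)]
print(f"learned GPU share: {best:.1f}")
```

Under these assumed rates the loop settles on sending 80% of the work to the GPU, the candidate ratio closest to the point where both devices finish together. A DQN as described in the abstract would instead take features of the computing environment as input, which is what would let it account for cache and memory contention and for communication bottlenecks that this fixed cost model ignores.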
Original language | English |
---|---|
Pages (from-to) | 4875-4893 |
Number of pages | 19 |
Journal | Journal of Supercomputing |
Volume | 75 |
Issue number | 8 |
Publication status | Published - 2019 Aug 1 |
Keywords
- DQN
- OpenCL
- Workload partition
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Information Systems
- Hardware and Architecture