Hardware-based job queue management for manycore architectures and OpenMP environments

Junghee Lee, Chrysostomos Nicopoulos, Yongjae Lee, Hyung Gyu Lee, Jongman Kim

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

4 Citations (Scopus)

Abstract

The seemingly interminable dwindle of technology feature sizes well into the nano-scale regime has afforded computer architects with an abundance of computational resources on a single chip. The Chip Multi-Processor (CMP) paradigm is now seen as the de facto architecture for years to come. However, in order to efficiently exploit the increasing number of on-chip processing cores, it is imperative to achieve and maintain efficient utilization of the resources at run time. Uneven and skewed distribution of workloads misuses the CMP resources and may even lead to such undesired effects as traffic and temperature hotspots. While existing techniques rely mostly on software for the undertaking of load balancing duties and exploit hardware mainly for synchronization, we will demonstrate that there are wider opportunities for hardware support of load balancing in CMP systems. Based on this fact, this paper proposes IsoNet, a conflict-free dynamic load distribution engine that exploits hardware aggressively to reinforce massively parallel computation in many core settings. Moreover, the proposed architecture provides extensive fault-tolerance against both CPU faults and intra-IsoNet faults. The hardware takes charge of both (1) the management of the list of jobs to be executed, and (2) the transfer of jobs between processing elements to maintain load balance. Experimental results show that, unlike the existing popular techniques of blocking and job stealing, IsoNet is scalable with as many as 1024 processing cores.
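
As a rough illustration of point (2) in the abstract, the transfer of jobs between processing elements to maintain load balance, the following is a minimal software sketch. It is not IsoNet and not the authors' method: it is a sequential toy model in which each core has a job queue and one job at a time is moved from the longest queue to the shortest. The function name `balance_queues` and the queue layout are assumptions made for this example only.

```python
from collections import deque


def balance_queues(queues):
    """Toy model (hypothetical, not IsoNet): move one job at a time from the
    longest queue to the shortest until lengths differ by at most one.

    Returns the number of job transfers performed.
    """
    transfers = 0
    while True:
        # Pick the most loaded and the least loaded core by queue length.
        longest = max(range(len(queues)), key=lambda i: len(queues[i]))
        shortest = min(range(len(queues)), key=lambda i: len(queues[i]))
        if len(queues[longest]) - len(queues[shortest]) <= 1:
            return transfers
        # Take a job from the tail of the loaded queue and hand it over.
        queues[shortest].append(queues[longest].pop())
        transfers += 1


# Four cores; all eight jobs start on core 0 (a skewed distribution).
queues = [deque(range(8)), deque(), deque(), deque()]
moved = balance_queues(queues)
lengths = [len(q) for q in queues]
print(moved, lengths)  # prints: 6 [2, 2, 2, 2]
```

In a real software scheme each transfer would need synchronization between cores; the abstract's claim is that handling the job list and the transfers in hardware avoids such conflicts.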

Original language: English
Title of host publication: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
Pages: 407-418
Number of pages: 12
DOIs: https://doi.org/10.1109/IPDPS.2011.47
Publication status: Published - 2011 Oct 3
Externally published: Yes
Event: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011 - Anchorage, AK, United States
Duration: 2011 May 16 to 2011 May 20

Publication series

Name: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011

Conference

Conference: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
Country: United States
City: Anchorage, AK
Period: 11/5/16 to 11/5/20

Fingerprint

Hardware
Resource allocation
Processing
Dynamic loads
Fault tolerance
Computer hardware
Program processors
Synchronization
Engines
Temperature

Keywords

  • fault-tolerant
  • job queue
  • manycore
  • OpenMP

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications

Cite this

Lee, J., Nicopoulos, C., Lee, Y., Lee, H. G., & Kim, J. (2011). Hardware-based job queue management for manycore architectures and OpenMP environments. In Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011 (pp. 407-418). [6012811] (Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011). https://doi.org/10.1109/IPDPS.2011.47

@inproceedings{1383995c728f48098af951c99e719020,
title = "Hardware-based job queue management for manycore architectures and {OpenMP} environments",
abstract = "The seemingly interminable dwindle of technology feature sizes well into the nano-scale regime has afforded computer architects with an abundance of computational resources on a single chip. The Chip Multi-Processor (CMP) paradigm is now seen as the de facto architecture for years to come. However, in order to efficiently exploit the increasing number of on-chip processing cores, it is imperative to achieve and maintain efficient utilization of the resources at run time. Uneven and skewed distribution of workloads misuses the CMP resources and may even lead to such undesired effects as traffic and temperature hotspots. While existing techniques rely mostly on software for the undertaking of load balancing duties and exploit hardware mainly for synchronization, we will demonstrate that there are wider opportunities for hardware support of load balancing in CMP systems. Based on this fact, this paper proposes IsoNet, a conflict-free dynamic load distribution engine that exploits hardware aggressively to reinforce massively parallel computation in many core settings. Moreover, the proposed architecture provides extensive fault-tolerance against both CPU faults and intra-IsoNet faults. The hardware takes charge of both (1) the management of the list of jobs to be executed, and (2) the transfer of jobs between processing elements to maintain load balance. Experimental results show that, unlike the existing popular techniques of blocking and job stealing, IsoNet is scalable with as many as 1024 processing cores.",
keywords = "fault-tolerant, job queue, manycore, OpenMP",
author = "Junghee Lee and Chrysostomos Nicopoulos and Yongjae Lee and Lee, {Hyung Gyu} and Jongman Kim",
year = "2011",
month = "10",
day = "3",
doi = "10.1109/IPDPS.2011.47",
language = "English",
isbn = "9780769543857",
series = "Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011",
pages = "407--418",
booktitle = "Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011",

}

TY - GEN
T1 - Hardware-based job queue management for manycore architectures and OpenMP environments
AU - Lee, Junghee
AU - Nicopoulos, Chrysostomos
AU - Lee, Yongjae
AU - Lee, Hyung Gyu
AU - Kim, Jongman
PY - 2011/10/3
Y1 - 2011/10/3
N2 - The seemingly interminable dwindle of technology feature sizes well into the nano-scale regime has afforded computer architects with an abundance of computational resources on a single chip. The Chip Multi-Processor (CMP) paradigm is now seen as the de facto architecture for years to come. However, in order to efficiently exploit the increasing number of on-chip processing cores, it is imperative to achieve and maintain efficient utilization of the resources at run time. Uneven and skewed distribution of workloads misuses the CMP resources and may even lead to such undesired effects as traffic and temperature hotspots. While existing techniques rely mostly on software for the undertaking of load balancing duties and exploit hardware mainly for synchronization, we will demonstrate that there are wider opportunities for hardware support of load balancing in CMP systems. Based on this fact, this paper proposes IsoNet, a conflict-free dynamic load distribution engine that exploits hardware aggressively to reinforce massively parallel computation in many core settings. Moreover, the proposed architecture provides extensive fault-tolerance against both CPU faults and intra-IsoNet faults. The hardware takes charge of both (1) the management of the list of jobs to be executed, and (2) the transfer of jobs between processing elements to maintain load balance. Experimental results show that, unlike the existing popular techniques of blocking and job stealing, IsoNet is scalable with as many as 1024 processing cores.

KW - fault-tolerant
KW - job queue
KW - manycore
KW - OpenMP
UR - http://www.scopus.com/inward/record.url?scp=80053233180&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80053233180&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2011.47
DO - 10.1109/IPDPS.2011.47
M3 - Conference contribution
AN - SCOPUS:80053233180
SN - 9780769543857
T3 - Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
SP - 407
EP - 418
BT - Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
ER -