Runtime parallelization of legacy code on a transactional memory system

Matthew DeVuyst, Dean M. Tullsen, Seon Wook Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

This paper proposes a new runtime parallelization technique, based on a dynamic optimization framework, to automatically parallelize single-threaded legacy programs. It heavily leverages the optimistic concurrency of transactional memory. This work addresses a number of challenges posed by this type of parallelization and quantifies the trade-offs of some of the design decisions, such as how to select good loops for parallelization, how to partition the iteration space among parallel threads, how to handle loop-carried dependencies, and how to transition from serial to parallel execution and back. The simulated implementation of runtime parallelization shows a potential speedup of 1.36 for the NAS benchmarks and a 1.34 speedup for the SPEC 2000 CPU floating point benchmarks when using two cores for parallel execution.
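One design decision the abstract names is how to partition the iteration space among parallel threads. The abstract does not say which schemes the paper evaluates, so the following is only a minimal sketch of two common textbook options, block and cyclic partitioning, and should not be read as the authors' method. The function names `block_partition` and `cyclic_partition` are hypothetical and chosen for illustration.

```python
from typing import List


def block_partition(n_iters: int, n_threads: int) -> List[range]:
    """Illustrative only: give each thread one contiguous chunk of iterations.

    When n_iters is not divisible by n_threads, the first
    (n_iters % n_threads) threads each take one extra iteration.
    """
    base, extra = divmod(n_iters, n_threads)
    chunks = []
    start = 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def cyclic_partition(n_iters: int, n_threads: int) -> List[range]:
    """Illustrative only: deal iterations out round-robin.

    Thread t receives iterations t, t + n_threads, t + 2 * n_threads, ...
    """
    return [range(t, n_iters, n_threads) for t in range(n_threads)]


if __name__ == "__main__":
    # Two threads, matching the two-core setting mentioned in the abstract.
    for name, scheme in (("block", block_partition), ("cyclic", cyclic_partition)):
        parts = scheme(10, 2)
        print(name, [list(p) for p in parts])
```

Block partitioning keeps each thread's iterations adjacent, while cyclic partitioning interleaves them. Which one suits a given loop depends on its memory access pattern and its loop-carried dependencies, the kind of trade-off the abstract says the paper quantifies.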

Original language: English
Title of host publication: HiPEAC'11 - Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Pages: 127-136
Number of pages: 10
DOIs: https://doi.org/10.1145/1944862.1944882
Publication status: Published - 2011 Mar 28
Event: 6th International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC'11 - Heraklion, Crete, Greece
Duration: 2011 Jan 24 to 2011 Jan 26

Other

Other: 6th International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC'11
Country: Greece
City: Heraklion, Crete
Period: 11/1/24 to 11/1/26

Fingerprint

Program processors
Data storage equipment

Keywords

  • Dynamic optimization
  • Parallelization
  • Transactional memory

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Electrical and Electronic Engineering

Cite this

DeVuyst, M., Tullsen, D. M., & Kim, S. W. (2011). Runtime parallelization of legacy code on a transactional memory system. In HiPEAC'11 - Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers (pp. 127-136). https://doi.org/10.1145/1944862.1944882

@inproceedings{a4400f9e6e7a49ea8c10464eb528f1e6,
title = "Runtime parallelization of legacy code on a transactional memory system",
keywords = "Dynamic optimization, Parallelization, Transactional memory",
author = "Matthew DeVuyst and Tullsen, {Dean M.} and Kim, {Seon Wook}",
year = "2011",
month = "3",
day = "28",
doi = "10.1145/1944862.1944882",
language = "English",
isbn = "9781450302418",
pages = "127--136",
booktitle = "HiPEAC'11 - Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers",

}

TY - GEN

T1 - Runtime parallelization of legacy code on a transactional memory system

AU - DeVuyst, Matthew

AU - Tullsen, Dean M.

AU - Kim, Seon Wook

PY - 2011/3/28

Y1 - 2011/3/28

AB - This paper proposes a new runtime parallelization technique, based on a dynamic optimization framework, to automatically parallelize single-threaded legacy programs. It heavily leverages the optimistic concurrency of transactional memory. This work addresses a number of challenges posed by this type of parallelization and quantifies the trade-offs of some of the design decisions, such as how to select good loops for parallelization, how to partition the iteration space among parallel threads, how to handle loop-carried dependencies, and how to transition from serial to parallel execution and back. The simulated implementation of runtime parallelization shows a potential speedup of 1.36 for the NAS benchmarks and a 1.34 speedup for the SPEC 2000 CPU floating point benchmarks when using two cores for parallel execution.

KW - Dynamic optimization

KW - Parallelization

KW - Transactional memory

UR - http://www.scopus.com/inward/record.url?scp=79952938772&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952938772&partnerID=8YFLogxK

U2 - 10.1145/1944862.1944882

DO - 10.1145/1944862.1944882

M3 - Conference contribution

AN - SCOPUS:79952938772

SN - 9781450302418

SP - 127

EP - 136

BT - HiPEAC'11 - Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers

ER -