Load-balanced parallel merge sort on distributed memory parallel computers

Minsoo Jeon, Dong Seung Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Sorting can be sped up on parallel computers by dividing the data and processing the parts in parallel. Merge sort can be parallelized; however, the conventional algorithm implemented on distributed memory computers performs poorly because the number of active (non-idling) processors is halved at each merging stage, down to one in the last stage. This paper presents a load-balanced parallel merge sort algorithm in which all processors participate in merging throughout the computation. Data are evenly distributed to all processors, and every processor is forced to work in the merging phase. A significant performance improvement has been achieved. Our analysis shows that the upper bound of the speedup of the merge time is (P - 1)/log P. We obtained a speedup of 9.6 (the upper bound is 10.5) on a 32-processor Cray T3E when sorting 4M 32-bit integers. The same idea can be applied to parallelize other sorting algorithms.
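
The contrast in processor utilization described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `active_processors_per_stage` is hypothetical, and the sketch assumes that the processor count is a power of two and that merging takes log2(P) stages, with the conventional scheme halving the number of merging processors at each stage.

```python
def active_processors_per_stage(p, load_balanced=False):
    """Count processors doing merge work at each merge stage.

    Assumptions (not taken from the paper): p is a power of two and
    merging proceeds in log2(p) stages.
    """
    if p < 1 or p & (p - 1):
        raise ValueError("p must be a positive power of two")
    stages = p.bit_length() - 1  # log2(p) for a power of two
    if load_balanced:
        # Every processor keeps merging in every stage.
        return [p] * stages
    # Conventional tree merge: half of the remaining processors drop out
    # at each stage, leaving a single processor in the last stage.
    return [p >> s for s in range(1, stages + 1)]


if __name__ == "__main__":
    print(active_processors_per_stage(32))
    print(active_processors_per_stage(32, load_balanced=True))
```

For 32 processors, the conventional schedule leaves 16, 8, 4, 2, and finally 1 processor merging, while the load-balanced schedule keeps all 32 busy in each of the five stages.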

Original language: English
Title of host publication: Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2002
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 248
Number of pages: 1
ISBN (Print): 0769515738, 9780769515731
DOI: https://doi.org/10.1109/IPDPS.2002.1016670
Publication status: Published - 2002
Event: 16th International Parallel and Distributed Processing Symposium, IPDPS 2002 - Ft. Lauderdale, United States
Duration: 2002 Apr 15 to 2002 Apr 19

Other

Other: 16th International Parallel and Distributed Processing Symposium, IPDPS 2002
Country: United States
City: Ft. Lauderdale
Period: 2002 Apr 15 to 2002 Apr 19

Fingerprint

Distributed Memory
Parallel Computers
Merging
Sort
Sorting
Data storage equipment
Speedup
Upper bound
Sorting algorithm
Enhancement
Integer
Computing

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Modelling and Simulation

Cite this

Jeon, M., & Kim, D. S. (2002). Load-balanced parallel merge sort on distributed memory parallel computers. In Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2002 (p. 248). [1016670] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IPDPS.2002.1016670
