Fast sort of floating-point data for data engineering

Changsoo Kim, Sungroh Yoon, Dong Seung Kim

Research output: Contribution to journal › Article

Abstract

This paper describes a novel external sort algorithm that speeds up the sorting of floating-point numbers. The algorithm significantly reduces computation time by applying integer arithmetic to floating-point data in the IEEE-754 standard or similar formats. In experiments with synthetic data on a 32-processor Linux cluster, the internal sort alone achieved approximately fivefold speedups when sorting gigabyte-scale data. Furthermore, the sort achieved twofold or greater improvements over a typical parallel sort method, network of workstations (NOW)-sort. Moreover, the performance of the sorting scheme is independent of the computing platform. The method can therefore be applied to binary search, data mining, numerical simulations, and graphics.
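The abstract does not spell out how integer arithmetic is applied to the floating-point data, so the following is only a minimal sketch of one widely known approach, not necessarily the authors' method. It assumes 64-bit IEEE-754 doubles and maps each value to an unsigned integer key whose integer order matches the numeric order of the original values. The function names `float_to_key`, `key_to_float`, and `sort_floats` are hypothetical and chosen for illustration.

```python
import struct

SIGN_BIT = 0x8000000000000000
ALL_BITS = 0xFFFFFFFFFFFFFFFF


def float_to_key(x: float) -> int:
    """Map a 64-bit IEEE-754 double to an order-preserving unsigned integer.

    Assumption (not from the paper): non-negative values get their sign bit
    set, negative values get all bits flipped, so integer comparison of the
    keys agrees with numeric comparison of the doubles.
    """
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    if bits & SIGN_BIT:
        return bits ^ ALL_BITS   # negative: larger magnitude -> smaller key
    return bits | SIGN_BIT       # non-negative: shift above all negatives


def key_to_float(key: int) -> float:
    """Invert float_to_key."""
    if key & SIGN_BIT:
        bits = key & ~SIGN_BIT & ALL_BITS   # was non-negative: clear sign bit
    else:
        bits = key ^ ALL_BITS               # was negative: flip all bits back
    (x,) = struct.unpack("<d", struct.pack("<Q", bits))
    return x


def sort_floats(values):
    """Sort doubles by comparing integer keys only."""
    keys = sorted(float_to_key(v) for v in values)
    return [key_to_float(k) for k in keys]


if __name__ == "__main__":
    data = [3.5, -1.25, 0.0, -7.0, 2.0, 1e300, -1e-300]
    print(sort_floats(data))
```

Once the keys are integers, any integer-only sort (for example a radix sort) can be substituted for the comparison sort used here. Under this mapping, -0.0 sorts just before +0.0, and NaNs sort by their bit patterns rather than being treated specially.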

Original language: English
Pages (from-to): 50-54
Number of pages: 5
Journal: Advances in Engineering Software
Volume: 42
Issue number: 1-2
DOI: 10.1016/j.advengsoft.2010.10.017
Publication status: Published - 2011 Jan 1

Fingerprint

Sorting
Data mining
Computer simulation
Experiments

Keywords

  • Engineering simulation
  • External sort
  • Floating-point arithmetic
  • Message passing interface
  • Parallel sort
  • Workstation cluster

ASJC Scopus subject areas

  • Software
  • Engineering(all)

Cite this

Fast sort of floating-point data for data engineering. / Kim, Changsoo; Yoon, Sungroh; Kim, Dong Seung.

In: Advances in Engineering Software, Vol. 42, No. 1-2, 01.01.2011, p. 50-54.

@article{27726757407046228bde5e17f687156b,
title = "Fast sort of floating-point data for data engineering",
abstract = "In this paper, a novel external sort algorithm that improves the speedup of the sorting of floating-point numbers has been described. Our algorithm decreases the computation time significantly by applying integer arithmetic on floating-point data in the IEEE-754 standard or similar formats. We conducted experiments with synthetic data on a 32-processor Linux cluster; in the case of the internal sort alone, the Giga-byte sorting achieved approximately fivefold speedups. Furthermore, the sorting achieved twofold or greater improvements over the typical parallel sort method, network of workstations (NOW)-sort. Moreover, the sorting scheme performance is independent of the computing platform. Thus, our sorting method can be successfully applied to binary search, data mining, numerical simulations, and graphics.",
keywords = "Engineering simulation, External sort, Floating-point arithmetic, Message passing interface, Parallel sort, Workstation cluster",
author = "Changsoo Kim and Sungroh Yoon and Kim, {Dong Seung}",
year = "2011",
month = "1",
day = "1",
doi = "10.1016/j.advengsoft.2010.10.017",
language = "English",
volume = "42",
pages = "50--54",
journal = "Advances in Engineering Software",
issn = "0965-9978",
publisher = "Elsevier Limited",
number = "1-2",

}
