Performance analysis and optimization of three-dimensional FDTD on GPU using roofline model

Ki Hwan Kim, Kyoungho Kim, Q Han Park

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

The Finite-Difference Time-Domain (FDTD) method is commonly used for electromagnetic field simulations. Recently, successful hardware acceleration using Graphics Processing Units (GPUs) has been reported for large-scale FDTD simulations. In this paper, we present a performance analysis of three-dimensional (3D) FDTD on a GPU using the roofline model. We find that the theoretical predictions of maximum performance agree well with the experimental results. We also suggest suitable optimization methods for the best performance of FDTD on a GPU. In particular, the optimized 3D FDTD program on a GPU (NVIDIA GeForce GTX 480) is shown to be 64 times faster than a naively implemented program on a CPU (Intel Core i7 2600).
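
For readers unfamiliar with the roofline model named in the abstract, the sketch below shows its basic bound: attainable performance is the smaller of the compute peak and the memory bandwidth multiplied by the kernel's arithmetic intensity. This is a minimal illustration, not the authors' analysis. The function names and all numeric values (peak rate, bandwidth, intensities) are hypothetical placeholders and are not taken from the paper or from any hardware datasheet.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    memory_roof = bandwidth_gbs * intensity_flop_per_byte
    return min(peak_gflops, memory_roof)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical machine parameters, chosen only for illustration.
    PEAK_GFLOPS = 1000.0    # assumed peak floating-point rate
    BANDWIDTH_GBS = 150.0   # assumed sustained memory bandwidth

    print("ridge point (FLOP/byte):", ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS))

    # Hypothetical kernel intensities; a stencil update such as FDTD
    # typically sits on the low (memory-bound) side of the ridge point.
    for intensity in (0.25, 0.5, 1.0, 8.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
        regime = "memory-bound" if intensity < ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS) else "compute-bound"
        print(f"intensity={intensity:5.2f} FLOP/byte -> bound={bound:7.1f} GFLOP/s ({regime})")
```

Under this model, a kernel whose intensity lies below the ridge point can only be sped up by reducing memory traffic per floating-point operation, which is why roofline analyses tend to point toward memory-access optimizations for stencil codes.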

Original language: English
Pages (from-to): 1201-1207
Number of pages: 7
Journal: Computer Physics Communications
Volume: 182
Issue number: 6
DOI: 10.1016/j.cpc.2011.01.025
Publication status: Published - 2011 Jun 1

Fingerprint

Optimization
Finite difference time domain method
Electromagnetic fields
Program processors
Hardware
Simulation
Graphics processing unit
Predictions

Keywords

  • CUDA
  • FDTD
  • GPU
  • Roofline

ASJC Scopus subject areas

  • Hardware and Architecture
  • Physics and Astronomy(all)

Cite this

Performance analysis and optimization of three-dimensional FDTD on GPU using roofline model. / Kim, Ki Hwan; Kim, Kyoungho; Park, Q Han.

In: Computer Physics Communications, Vol. 182, No. 6, 01.06.2011, p. 1201-1207.


@article{e0bb70c13c134e1ca64c8139c0b86005,
title = "Performance analysis and optimization of three-dimensional FDTD on GPU using roofline model",
abstract = "The Finite-Difference Time-Domain (FDTD) method is commonly used for electromagnetic field simulations. Recently, successful hardware acceleration using Graphics Processing Units (GPUs) has been reported for large-scale FDTD simulations. In this paper, we present a performance analysis of three-dimensional (3D) FDTD on a GPU using the roofline model. We find that the theoretical predictions of maximum performance agree well with the experimental results. We also suggest suitable optimization methods for the best performance of FDTD on a GPU. In particular, the optimized 3D FDTD program on a GPU (NVIDIA GeForce GTX 480) is shown to be 64 times faster than a naively implemented program on a CPU (Intel Core i7 2600).",
keywords = "CUDA, FDTD, GPU, Roofline",
author = "Kim, {Ki Hwan} and Kim, Kyoungho and Park, {Q Han}",
year = "2011",
month = "6",
day = "1",
doi = "10.1016/j.cpc.2011.01.025",
language = "English",
volume = "182",
pages = "1201--1207",
journal = "Computer Physics Communications",
issn = "0010-4655",
publisher = "Elsevier",
number = "6",
}

TY  - JOUR
T1  - Performance analysis and optimization of three-dimensional FDTD on GPU using roofline model
AU  - Kim, Ki Hwan
AU  - Kim, Kyoungho
AU  - Park, Q Han
PY  - 2011/6/1
Y1  - 2011/6/1
N2  - The Finite-Difference Time-Domain (FDTD) method is commonly used for electromagnetic field simulations. Recently, successful hardware acceleration using Graphics Processing Units (GPUs) has been reported for large-scale FDTD simulations. In this paper, we present a performance analysis of three-dimensional (3D) FDTD on a GPU using the roofline model. We find that the theoretical predictions of maximum performance agree well with the experimental results. We also suggest suitable optimization methods for the best performance of FDTD on a GPU. In particular, the optimized 3D FDTD program on a GPU (NVIDIA GeForce GTX 480) is shown to be 64 times faster than a naively implemented program on a CPU (Intel Core i7 2600).
AB  - The Finite-Difference Time-Domain (FDTD) method is commonly used for electromagnetic field simulations. Recently, successful hardware acceleration using Graphics Processing Units (GPUs) has been reported for large-scale FDTD simulations. In this paper, we present a performance analysis of three-dimensional (3D) FDTD on a GPU using the roofline model. We find that the theoretical predictions of maximum performance agree well with the experimental results. We also suggest suitable optimization methods for the best performance of FDTD on a GPU. In particular, the optimized 3D FDTD program on a GPU (NVIDIA GeForce GTX 480) is shown to be 64 times faster than a naively implemented program on a CPU (Intel Core i7 2600).
KW  - CUDA
KW  - FDTD
KW  - GPU
KW  - Roofline
UR  - http://www.scopus.com/inward/record.url?scp=79953676563&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=79953676563&partnerID=8YFLogxK
U2  - 10.1016/j.cpc.2011.01.025
DO  - 10.1016/j.cpc.2011.01.025
M3  - Article
VL  - 182
SP  - 1201
EP  - 1207
JO  - Computer Physics Communications
JF  - Computer Physics Communications
SN  - 0010-4655
IS  - 6
ER  - 