Revealing critical loads and hidden data locality in GPGPU applications

Gunjae Koo, Hyeran Jeon, Murali Annavaram

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Citations (Scopus)

Abstract

In graphics processing units (GPUs), memory access latency is one of the most critical performance hurdles. Several warp schedulers and memory prefetching algorithms have been proposed to avoid long memory access latency. Prior application characterization studies shed light on the interaction between applications, GPU microarchitecture, and memory subsystem behavior. Most of these studies, however, present only aggregate statistics on how the memory system behaves over the entire application run. In particular, they do not consider how individual load instructions in a program contribute to the aggregate memory system behavior. The analysis presented in this paper shows that there are two distinct classes of load instructions, categorized as deterministic and non-deterministic loads. Using a combination of profiling data from a real GPU card and cycle-accurate simulation data, we show that there is a significant disparity in performance impact when executing these two types of loads. We discuss and suggest several approaches to treating these two load categories differently within the GPU microarchitecture to optimize memory system performance.

Original language: English
Title of host publication: Proceedings - 2015 IEEE International Symposium on Workload Characterization, IISWC 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 120-129
Number of pages: 10
ISBN (Electronic): 9781509000883
Publication status: Published - 2015 Oct 30
Externally published: Yes
Event: IEEE International Symposium on Workload Characterization, IISWC 2015 - Atlanta, United States
Duration: 2015 Oct 4 to 2015 Oct 6

Publication series

Name: Proceedings - 2015 IEEE International Symposium on Workload Characterization, IISWC 2015

Conference

Conference: IEEE International Symposium on Workload Characterization, IISWC 2015
Country/Territory: United States
City: Atlanta
Period: 15/10/4 to 15/10/6

ASJC Scopus subject areas

  • Hardware and Architecture
