Compiler and hardware support for cache coherence in large-scale multiprocessors: design considerations and performance study

Lynn Choi, Pen Chung Yew

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

In this paper, we study a hardware-supported, compiler-directed (HSCD) cache coherence scheme, which can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors, such as the Cray T3D. It can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system-related issues, including critical sections, inter-thread communication, and task migration, have also been addressed. The cost of the required hardware support is small and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data-flow analysis, have been implemented in the Polaris compiler [17]. From our simulation study using the Perfect Club benchmarks, we found that, in spite of the conservative analysis made by the compiler, the performance of the proposed HSCD scheme can be comparable to that of a full-map hardware directory scheme. With its comparable performance and reduced hardware cost, the scheme can be a viable alternative for large-scale multiprocessors, such as the Cray T3D, that rely on users to maintain data coherence.

Original language: English
Title of host publication: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
Publisher: IEEE
Pages: 283-294
Number of pages: 12
Publication status: Published - 1996
Externally published: Yes
Event: Proceedings of the 1996 23rd Annual International Symposium on Computer Architecture - Philadelphia, PA, USA
Duration: 1996 May 22 to 1996 May 24

Other

Other: Proceedings of the 1996 23rd Annual International Symposium on Computer Architecture
City: Philadelphia, PA, USA
Period: 96/5/22 to 96/5/24

Fingerprint

Hardware
Data flow analysis
Microprocessor chips
Costs
Communication

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

Choi, L., & Yew, P. C. (1996). Compiler and hardware support for cache coherence in large-scale multiprocessors: design considerations and performance study. In Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA (pp. 283-294). IEEE.

Compiler and hardware support for cache coherence in large-scale multiprocessors: design considerations and performance study. / Choi, Lynn; Yew, Pen Chung. Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. IEEE, 1996. p. 283-294.

Choi, L & Yew, PC 1996, Compiler and hardware support for cache coherence in large-scale multiprocessors: design considerations and performance study. in Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. IEEE, pp. 283-294, Proceedings of the 1996 23rd Annual International Symposium on Computer Architecture, Philadelphia, PA, USA, 96/5/22.
Choi L, Yew PC. Compiler and hardware support for cache coherence in large-scale multiprocessors: design considerations and performance study. In Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. IEEE. 1996. p. 283-294.
Choi, Lynn; Yew, Pen Chung. / Compiler and hardware support for cache coherence in large-scale multiprocessors: design considerations and performance study. Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. IEEE, 1996. pp. 283-294.
@inproceedings{70d333d395454d4984eeed9bb433741e,
title = "Compiler and hardware support for cache coherence in large-scale multiprocessors: design considerations and performance study",
abstract = "In this paper, we study a hardware-supported, compiler-directed (HSCD) cache coherence scheme, which can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors, such as the Cray T3D. It can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system related issues, including critical sections, inter-thread communication, and task migration have also been addressed. The cost of the required hardware support is small and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data-flow analysis, have been implemented on the Polaris compiler [17]. From our simulation study using the Perfect Club benchmarks, we found that, in spite of the conservative analysis made by the compiler, the performance of the proposed HSCD scheme can be comparable to that of a full-map hardware directory scheme. With its comparable performance and reduced hardware cost, the scheme can be a viable alternative for large-scale multiprocessors, such as the Cray T3D, that rely on users to maintain data coherence.",
author = "Choi, Lynn and Yew, {Pen Chung}",
year = "1996",
language = "English",
pages = "283--294",
booktitle = "Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA",
publisher = "IEEE",

}

TY - GEN

T1 - Compiler and hardware support for cache coherence in large-scale multiprocessors

T2 - design considerations and performance study

AU - Choi, Lynn

AU - Yew, Pen Chung

PY - 1996

Y1 - 1996

AB - In this paper, we study a hardware-supported, compiler-directed (HSCD) cache coherence scheme, which can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors, such as the Cray T3D. It can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system related issues, including critical sections, inter-thread communication, and task migration have also been addressed. The cost of the required hardware support is small and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data-flow analysis, have been implemented on the Polaris compiler [17]. From our simulation study using the Perfect Club benchmarks, we found that, in spite of the conservative analysis made by the compiler, the performance of the proposed HSCD scheme can be comparable to that of a full-map hardware directory scheme. With its comparable performance and reduced hardware cost, the scheme can be a viable alternative for large-scale multiprocessors, such as the Cray T3D, that rely on users to maintain data coherence.

UR - http://www.scopus.com/inward/record.url?scp=0029666633&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0029666633&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0029666633

SP - 283

EP - 294

BT - Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA

PB - IEEE

ER -