Coherence and Replacement Protocol of DICE - A Bus-Based COMA Multiprocessor

Sangyeun Cho, Jinseok Kong, Kyung Ho Lee

Research output: Contribution to journal (article)

4 Citations (Scopus)

Abstract

As microprocessors become faster and demand more bandwidth, the already limited scalability of a shared bus decreases even further. DICE, a shared-bus multiprocessor, utilizes cache only memory architecture (COMA) to effectively decrease the speed gap between modern high-performance microprocessors and the bus. DICE tries to optimize COMA for a shared-bus medium, in particular to reduce the detrimental effects of cache coherence and the "last memory block" problem on replacement. In this paper, we present the coherence and replacement protocol of the DICE multiprocessor and its design trade-offs. We describe a four-state write-invalidate coherence protocol in detail. Replacement, which poses a unique overhead problem of COMA, requires that a victim block with ownership be relocated to a remote node in order not to discard the last cached memory block. We show that the relocation process can be efficiently implemented by using a temporary storage called relocation buffer and a priority-based selection algorithm. We present performance results that show a drastic reduction in global bus traffic compared to a traditional shared-bus multiprocessor architecture.
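
The relocation step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the four coherence states or define the priority order, so the state names, the `Priority` levels, and the helpers `needs_relocation`, `bid`, and `select_target` below are all hypothetical. They only show the general shape of a priority-based selection and should not be read as the DICE protocol as published.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Dict, Optional, Sequence


class State(Enum):
    """Hypothetical four-state encoding; the abstract does not name the states."""
    INVALID = "I"
    SHARED = "S"            # replica without ownership, safe to discard
    OWNED_SHARED = "OS"     # owner, other replicas may exist
    OWNED_EXCLUSIVE = "OE"  # owner, the only cached copy


OWNER_STATES = {State.OWNED_SHARED, State.OWNED_EXCLUSIVE}


def needs_relocation(state: State) -> bool:
    """A victim with ownership must be moved rather than dropped,
    so that the last cached copy of the block is never discarded."""
    return state in OWNER_STATES


@dataclass
class Slot:
    tag: Optional[int]  # block address tag, None when the slot is empty
    state: State


class Priority(IntEnum):
    """Assumed ordering of how well a remote node can accept the victim."""
    CANNOT_ACCEPT = 0
    HAS_SHARED_SLOT = 1   # would overwrite a discardable replica
    HAS_INVALID_SLOT = 2  # has a free slot in the matching set
    HAS_COPY = 3          # already holds a replica, only ownership moves


def bid(node_set: Sequence[Slot], victim_tag: int) -> Priority:
    """Priority that one remote node would report for the victim's set."""
    if any(s.tag == victim_tag and s.state is State.SHARED for s in node_set):
        return Priority.HAS_COPY
    if any(s.state is State.INVALID for s in node_set):
        return Priority.HAS_INVALID_SLOT
    if any(s.state is State.SHARED for s in node_set):
        return Priority.HAS_SHARED_SLOT
    return Priority.CANNOT_ACCEPT


def select_target(bids: Dict[int, Priority]) -> Optional[int]:
    """Pick the node with the highest priority; ties go to the lowest node id.

    Returning None models the case where no node can accept the block,
    in which case (by assumption) it would stay in the relocation buffer.
    """
    if not bids:
        return None
    best = max(bids, key=lambda node: (bids[node], -node))
    return best if bids[best] > Priority.CANNOT_ACCEPT else None
```

In this sketch, each remote node computes `bid` for the victim's cache set, and `select_target` chooses among the reported priorities. A node holding only owned blocks in that set reports `CANNOT_ACCEPT` and is never chosen.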

Original language: English
Pages (from-to): 14-32
Number of pages: 19
Journal: Journal of Parallel and Distributed Computing
Volume: 57
Issue number: 1
DOIs: 10.1006/jpdc.1998.1524
Publication status: Published - 1999 Apr 1
Externally published: Yes

Keywords

  • Distributed shared memory (DSM)
  • Shared bus
  • Symmetric multiprocessor (SMP)

ASJC Scopus subject areas

  • Computer Science Applications
  • Hardware and Architecture
  • Control and Systems Engineering

Cite this

Coherence and Replacement Protocol of DICE - A Bus-Based COMA Multiprocessor. / Cho, Sangyeun; Kong, Jinseok; Lee, Kyung Ho.

In: Journal of Parallel and Distributed Computing, Vol. 57, No. 1, 01.04.1999, p. 14-32.

@article{776d1694a40942ebb91214ccef9935ce,
title = "Coherence and Replacement Protocol of DICE - A Bus-Based COMA Multiprocessor",
abstract = "As microprocessors become faster and demand more bandwidth, the already limited scalability of a shared bus decreases even further. DICE, a shared-bus multiprocessor, utilizes cache only memory architecture (COMA) to effectively decrease the speed gap between modern high-performance microprocessors and the bus. DICE tries to optimize COMA for a shared-bus medium, in particular to reduce the detrimental effects of cache coherence and the {"}last memory block{"} problem on replacement. In this paper, we present the coherence and replacement protocol of the DICE multiprocessor and its design trade-offs. We describe a four-state write-invalidate coherence protocol in detail. Replacement, which poses a unique overhead problem of COMA, requires that a victim block with ownership be relocated to a remote node in order not to discard the last cached memory block. We show that the relocation process can be efficiently implemented by using a temporary storage called relocation buffer and a priority-based selection algorithm. We present performance results that show a drastic reduction in global bus traffic compared to a traditional shared-bus multiprocessor architecture.",
keywords = "Distributed shared memory (DSM), Shared bus, Symmetric multiprocessor (SMP)",
author = "Sangyeun Cho and Jinseok Kong and Lee, {Kyung Ho}",
year = "1999",
month = "4",
day = "1",
doi = "10.1006/jpdc.1998.1524",
language = "English",
volume = "57",
pages = "14--32",
journal = "Journal of Parallel and Distributed Computing",
issn = "0743-7315",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR
T1 - Coherence and Replacement Protocol of DICE - A Bus-Based COMA Multiprocessor
AU - Cho, Sangyeun
AU - Kong, Jinseok
AU - Lee, Kyung Ho
PY - 1999/4/1
Y1 - 1999/4/1

N2 - As microprocessors become faster and demand more bandwidth, the already limited scalability of a shared bus decreases even further. DICE, a shared-bus multiprocessor, utilizes cache only memory architecture (COMA) to effectively decrease the speed gap between modern high-performance microprocessors and the bus. DICE tries to optimize COMA for a shared-bus medium, in particular to reduce the detrimental effects of cache coherence and the "last memory block" problem on replacement. In this paper, we present the coherence and replacement protocol of the DICE multiprocessor and its design trade-offs. We describe a four-state write-invalidate coherence protocol in detail. Replacement, which poses a unique overhead problem of COMA, requires that a victim block with ownership be relocated to a remote node in order not to discard the last cached memory block. We show that the relocation process can be efficiently implemented by using a temporary storage called relocation buffer and a priority-based selection algorithm. We present performance results that show a drastic reduction in global bus traffic compared to a traditional shared-bus multiprocessor architecture.

AB - As microprocessors become faster and demand more bandwidth, the already limited scalability of a shared bus decreases even further. DICE, a shared-bus multiprocessor, utilizes cache only memory architecture (COMA) to effectively decrease the speed gap between modern high-performance microprocessors and the bus. DICE tries to optimize COMA for a shared-bus medium, in particular to reduce the detrimental effects of cache coherence and the "last memory block" problem on replacement. In this paper, we present the coherence and replacement protocol of the DICE multiprocessor and its design trade-offs. We describe a four-state write-invalidate coherence protocol in detail. Replacement, which poses a unique overhead problem of COMA, requires that a victim block with ownership be relocated to a remote node in order not to discard the last cached memory block. We show that the relocation process can be efficiently implemented by using a temporary storage called relocation buffer and a priority-based selection algorithm. We present performance results that show a drastic reduction in global bus traffic compared to a traditional shared-bus multiprocessor architecture.

KW - Distributed shared memory (DSM)
KW - Shared bus
KW - Symmetric multiprocessor (SMP)
UR - http://www.scopus.com/inward/record.url?scp=0043231129&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0043231129&partnerID=8YFLogxK
U2 - 10.1006/jpdc.1998.1524
DO - 10.1006/jpdc.1998.1524
M3 - Article
AN - SCOPUS:0043231129
VL - 57
SP - 14
EP - 32
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
SN - 0743-7315
IS - 1
ER -