An FPGA approach to quantifying coherence traffic efficiency on multiprocessor systems

Taeweon Suh, Shih Lien Lu, Hsien Hsin S Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Recently, there has been a surge of interest in using FPGAs for computer architecture research, with applications ranging from emulating and analyzing new platforms to accelerating microarchitectural simulation for design space exploration. This paper proposes and demonstrates a novel use of FPGAs for measuring the efficiency of coherence traffic in an actual computer system. Our approach employs an FPGA acting as a bus agent, interacting with a real CPU in a dual-processor system to measure the intrinsic delay of coherence traffic. This technique eliminates non-deterministic factors in the measurement, such as arbitration delay and stalls in the pipelined bus. It completely isolates the impact of pure coherence traffic delay on system performance while executing workloads natively. Our experiments show that the overall execution time of the benchmark programs on a system with coherence traffic actually increased compared with one without coherence traffic. This indicates that cache-to-cache transfers are less efficient in an Intel-based server system, and that there is room for further improvement, such as the inclusion of the O state and cache line buffers in the memory controller.
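
The remark about the O state can be illustrated with a simplified textbook model of a snooped read, comparing MESI with MOESI. This is only a sketch under stated assumptions: it is not the measurement setup from the paper and not the exact Intel bus protocol, and the function name `snoop_read` and the memory-write count it returns are hypothetical, introduced here for illustration.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    O = "Owned"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def snoop_read(owner_state, protocol):
    """Model another cache reading a line that one cache holds in owner_state.

    Returns (new_owner_state, requester_state, memory_writes).
    Illustrative assumption: a simplified two-cache textbook model.
    """
    if protocol not in ("MESI", "MOESI"):
        raise ValueError("protocol must be 'MESI' or 'MOESI'")
    if owner_state is State.M:
        if protocol == "MOESI":
            # The owner keeps the dirty line in O and supplies the data;
            # main memory is not updated.
            return State.O, State.S, 0
        # MESI has no dirty-shared state, so the dirty line is also
        # written back to memory before both caches hold it in S.
        return State.S, State.S, 1
    if owner_state is State.O:
        if protocol != "MOESI":
            raise ValueError("the O state exists only in MOESI")
        return State.O, State.S, 0
    if owner_state in (State.E, State.S):
        # Clean line: memory is already up to date.
        return State.S, State.S, 0
    # Invalid: no cache-to-cache transfer; the requester reads from memory.
    return State.I, State.E, 0


if __name__ == "__main__":
    for protocol in ("MESI", "MOESI"):
        print(protocol, snoop_read(State.M, protocol))
```

In this model the only difference between the two protocols is the memory write triggered when a Modified line is read by another cache, which is the kind of cost that an O state would avoid.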

Original language: English
Title of host publication: Proceedings - 2007 International Conference on Field Programmable Logic and Applications, FPL
Pages: 47-53
Number of pages: 7
DOIs: https://doi.org/10.1109/FPL.2007.4380624
Publication status: Published - 2007 Dec 1
Externally published: Yes
Event: 2007 International Conference on Field Programmable Logic and Applications, FPL - Amsterdam, Netherlands
Duration: 2007 Aug 27 to 2007 Aug 29

Other

Other: 2007 International Conference on Field Programmable Logic and Applications, FPL
Country: Netherlands
City: Amsterdam
Period: 07/8/27 to 07/8/29

Fingerprint

Field programmable gate arrays (FPGA)
Program processors
Computer systems
Computer architecture
Servers
Data storage equipment
Controllers
Experiments

ASJC Scopus subject areas

  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

Suh, T., Lu, S. L., & Lee, H. H. S. (2007). An FPGA approach to quantifying coherence traffic efficiency on multiprocessor systems. In Proceedings - 2007 International Conference on Field Programmable Logic and Applications, FPL (pp. 47-53). [4380624] https://doi.org/10.1109/FPL.2007.4380624

@inproceedings{754323d96ac84975a69573976b3f77f8,
title = "An FPGA approach to quantifying coherence traffic efficiency on multiprocessor systems",
author = "Taeweon Suh and Lu, {Shih Lien} and Lee, {Hsien Hsin S}",
year = "2007",
month = "12",
day = "1",
doi = "10.1109/FPL.2007.4380624",
language = "English",
isbn = "1424410606",
pages = "47--53",
booktitle = "Proceedings - 2007 International Conference on Field Programmable Logic and Applications, FPL",

}

TY - GEN

T1 - An FPGA approach to quantifying coherence traffic efficiency on multiprocessor systems

AU - Suh, Taeweon

AU - Lu, Shih Lien

AU - Lee, Hsien Hsin S

PY - 2007/12/1

Y1 - 2007/12/1

UR - http://www.scopus.com/inward/record.url?scp=48149102869&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=48149102869&partnerID=8YFLogxK

U2 - 10.1109/FPL.2007.4380624

DO - 10.1109/FPL.2007.4380624

M3 - Conference contribution

AN - SCOPUS:48149102869

SN - 1424410606

SN - 9781424410606

SP - 47

EP - 53

BT - Proceedings - 2007 International Conference on Field Programmable Logic and Applications, FPL

ER -