Replicated process allocation for load distribution in fault-tolerant multicomputers

Jong Kim, Heejo Lee, Sunggu Lee

Research output: Contribution to journal (Article)

14 Citations (Scopus)

Abstract

In this paper, we consider a load-balancing process allocation method for fault-tolerant multicomputer systems that balances the load both before and after faults start to degrade the performance of the system. To tolerate a single fault, each process (the primary process) is duplicated, i.e., it has a backup process. The backup process executes on a different processor from the primary, checkpointing the primary process and recovering the process if the primary fails. We formalize the problem of load-balancing process allocation, propose a new process allocation method, and analyze its performance. Simulations are used to compare the proposed method with a process allocation method that does not take into account the different load characteristics of the primary and backup processes. While both methods perform well before a fault occurs, only the proposed method maintains a balanced load after a fault.
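
The effect described in the abstract can be illustrated with a small sketch. This is not the allocation method proposed in the paper; it only computes per-processor load for a given placement of primaries and backups, before and after one processor fails. The function names, the load values (1.0 for a primary, 0.25 for a backup), and the recovery rule (a backup that takes over carries the full primary load, and lost backups are not recreated) are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (primary_processor, backup_processor).
Placement = Tuple[int, int]

PRIMARY_LOAD = 1.0   # assumed load of a running primary process
BACKUP_LOAD = 0.25   # assumed (smaller) load of a checkpointing backup


def loads_before_fault(placements: List[Placement], n: int) -> List[float]:
    """Load on each of n processors while all processors are healthy."""
    loads = [0.0] * n
    for primary, backup in placements:
        loads[primary] += PRIMARY_LOAD
        loads[backup] += BACKUP_LOAD
    return loads


def loads_after_fault(placements: List[Placement], n: int, failed: int) -> List[float]:
    """Load on each processor after processor `failed` goes down.

    Assumption: a backup whose primary was lost takes over at full primary
    load, and a primary whose backup was lost keeps running without one.
    """
    loads = [0.0] * n
    for primary, backup in placements:
        if primary == failed:
            loads[backup] += PRIMARY_LOAD      # backup takes over
        elif backup == failed:
            loads[primary] += PRIMARY_LOAD     # primary continues alone
        else:
            loads[primary] += PRIMARY_LOAD
            loads[backup] += BACKUP_LOAD
    return loads


# Two placements of six processes on three processors.
# Both give every processor two primaries and two backups.
clustered = [(0, 1), (0, 1), (1, 2), (1, 2), (2, 0), (2, 0)]
spread = [(0, 1), (0, 2), (1, 2), (1, 0), (2, 0), (2, 1)]

for name, placement in (("clustered", clustered), ("spread", spread)):
    print(name, loads_before_fault(placement, 3), loads_after_fault(placement, 3, failed=0))
```

Under these assumptions both placements are balanced before a fault (2.5 on every processor). After processor 0 fails, the clustered placement leaves the survivors at 4.0 and 2.5, while the spread placement leaves them at 3.25 each, because the backups of the failed processor's primaries were distributed across the remaining processors.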

Original language: English
Pages (from-to): 499-505
Number of pages: 7
Journal: IEEE Transactions on Computers
Volume: 46
Issue number: 4
Publication status: Published - 1997

Keywords

  • Backup process
  • Checkpointing
  • Fault-tolerant multicomputer
  • Load balancing
  • Process allocation

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
