Skew-tolerant key distribution for load balancing in MapReduce

Jihoon Son, Hyunsik Choi, Yon Dohn Chung

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)


MapReduce is a parallel processing framework for large-scale data. In the reduce phase, MapReduce uses a hash scheme to distribute data across cluster nodes so that records sharing the same key go to the same node. However, this approach is not robust to skewed data distributions. In this paper, we propose a skew-tolerant key distribution method for MapReduce. The proposed method assigns keys to cluster nodes so as to balance their workloads. We implemented the proposed method on Hadoop. Through experiments, we evaluate its performance in comparison with the conventional method.
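The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not specify. It assumes per-key record counts are known in advance (for example, from sampling) and swaps in a generic greedy heuristic: heaviest key first, each placed on the currently least-loaded reducer. The function names `hash_assign`, `greedy_assign`, and `reducer_loads` are hypothetical.

```python
import heapq
from zlib import crc32


def hash_assign(key_counts, num_reducers):
    """Conventional scheme: a key goes to hash(key) mod num_reducers."""
    # crc32 is used as a deterministic stand-in for the framework's hash function.
    return {key: crc32(key.encode("utf-8")) % num_reducers for key in key_counts}


def greedy_assign(key_counts, num_reducers):
    """Illustrative heuristic (an assumption, not the authors' method):
    place keys in descending order of record count, each on the reducer
    with the smallest load so far."""
    heap = [(0, r) for r in range(num_reducers)]  # (current load, reducer id)
    heapq.heapify(heap)
    assignment = {}
    # Sort by count descending, then by key so ties are resolved deterministically.
    for key, count in sorted(key_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + count, reducer))
    return assignment


def reducer_loads(key_counts, assignment, num_reducers):
    """Total number of records each reducer receives under an assignment."""
    loads = [0] * num_reducers
    for key, reducer in assignment.items():
        loads[reducer] += key_counts[key]
    return loads


if __name__ == "__main__":
    # Hypothetical skewed input: one key dominates the record count.
    counts = {"a": 100, "b": 50, "c": 30, "d": 20, "e": 10, "f": 10}
    print("hash:  ", reducer_loads(counts, hash_assign(counts, 2), 2))
    print("greedy:", reducer_loads(counts, greedy_assign(counts, 2), 2))
```

Hash assignment ignores how many records each key carries, so a few heavy keys can land on the same reducer. The greedy variant uses the counts directly, at the cost of needing them before the shuffle begins.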

Original language: English
Pages (from-to): 677-680
Number of pages: 4
Journal: IEICE Transactions on Information and Systems
Issue number: 2
Publication status: Published - 2012 Feb


Keywords

  • Key distribution
  • Load balance
  • MapReduce
  • Skew-tolerance

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence

