Cloud-based mapreduce workflow execution platform

In Yong Jung, Byong John Han, Chang Sung Jeong, Seungmin Rho

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

With the increasing demand for data-intensive applications, mapreduce technologies have become useful tools for developing large-scale applications efficiently by integrating various existing mapreduce jobs. However, there is little existing research on workflow systems that can integrate mapreduce jobs with on-demand cloud resource provisioning. In this paper, we present a new cloud-based mapreduce workflow execution platform named DIVE-CWM (Distributed-parallel Virtual Environment on Cloud computing for Workflow for launching Mapreduce jobs), which integrates multiple mapreduce jobs and legacy programs into a single workflow. It provides transparent and selective job scheduling by estimating in advance the time a workflow needs to execute all its jobs. It also supports an automatic resource provisioning scheme that offers on-demand VM resources to launch a workflow onto the cloud. Furthermore, it provides agent-based resource management for automatic job deployment and execution of workflows on mapreduce clusters. Additionally, a service-oriented architecture based on a web service API and a graphical user interface offers high accessibility and convenience to users and other systems. We show experimental results that compare the different scheduling schemes for various workflows.
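The abstract does not say how DIVE-CWM estimates execution time, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a workflow is a DAG of jobs, that a per-job run-time estimate is already available for each candidate cluster, and that independent jobs can run in parallel. The function names (`estimate_makespan`, `pick_cluster`), the job names, and all durations are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def estimate_makespan(durations, deps):
    """Estimate when the last job of a workflow DAG finishes.

    durations: job name -> estimated run time in seconds (assumed given)
    deps:      job name -> set of jobs that must finish first
    Assumes enough slots for all ready jobs to run in parallel.
    """
    graph = {job: deps.get(job, ()) for job in durations}
    finish = {}
    # static_order() yields each job after all of its predecessors.
    for job in TopologicalSorter(graph).static_order():
        start = max((finish[d] for d in deps.get(job, ())), default=0.0)
        finish[job] = start + durations[job]
    return max(finish.values(), default=0.0)


def pick_cluster(durations_by_cluster, deps):
    """Return the cluster with the smallest estimated makespan."""
    estimates = {
        name: estimate_makespan(durations, deps)
        for name, durations in durations_by_cluster.items()
    }
    best = min(estimates, key=estimates.get)
    return best, estimates


# Hypothetical workflow: two mapreduce jobs plus a legacy report step.
deps = {
    "wordcount": {"extract"},
    "index": {"extract"},
    "legacy_report": {"wordcount", "index"},
}
durations_by_cluster = {
    "small": {"extract": 40, "wordcount": 120, "index": 90, "legacy_report": 30},
    "large": {"extract": 25, "wordcount": 60, "index": 50, "legacy_report": 30},
}
best, estimates = pick_cluster(durations_by_cluster, deps)
```

Under these made-up numbers the estimate is the length of the longest dependency chain, so a scheduler of this kind would choose whichever cluster minimizes that chain. A real system would also have to account for limited slots, data transfer, and VM start-up time.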

Original language: English
Pages (from-to): 1059-1067
Number of pages: 9
Journal: Journal of Internet Technology
Volume: 15
Issue number: 6
Publication status: Published - 2014

Keywords

  • Cloud computing
  • Job scheduling
  • Mapreduce workflow
  • PaaS

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
