Byte-index chunking algorithm for data deduplication system

Ider Lkhagvasuren, Jung Min So, Jeong Gun Lee, Hyuck Yoo, Young Woong Ko

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

This paper presents an algorithm and data structure for a deduplication method that efficiently eliminates identical data between files residing on different machines, achieving a high deduplication rate in a short time. The algorithm first quickly predicts which parts of the source and destination files are identical, then verifies those parts and transfers only the blocks that prove to be unique. The key to its speed and scalability in detecting duplicates is that data are represented as fixed-size block chunks, which are distributed into an "Index-table" according to the boundary byte values at both ends of each chunk. The "Index-table" is a fixed-size table structure in which a chunk's boundary byte values serve as the row and column numbers of its cell. Experimental results show that the proposed solution improves deduplication performance and substantially reduces the required storage capacity.
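
The abstract gives only the outline of the scheme, so the following Python fragment is a rough sketch of the idea rather than the authors' implementation. It assumes a 256 x 256 table indexed by a chunk's first and last byte, a list of entries per cell, a SHA-1 digest for the verification step, and a byte-by-byte sliding scan on the destination side; the names `CHUNK_SIZE`, `build_index_table` and `find_duplicates` are hypothetical, and the tiny chunk size is for illustration only.

```python
import hashlib

CHUNK_SIZE = 8  # assumed, illustrative value; a real system would use larger blocks


def build_index_table(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Place each fixed-size chunk in the cell addressed by its boundary bytes."""
    # Row = value of the chunk's first byte, column = value of its last byte.
    table = [[[] for _ in range(256)] for _ in range(256)]
    for offset in range(0, len(data) - chunk_size + 1, chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha1(chunk).digest()
        table[chunk[0]][chunk[-1]].append((offset, digest))
    return table


def find_duplicates(table, data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (offset_in_data, offset_in_indexed_file) pairs for verified duplicates."""
    matches = []
    pos = 0
    while pos + chunk_size <= len(data):
        # Prediction: a cheap lookup using only the two boundary bytes.
        candidates = table[data[pos]][data[pos + chunk_size - 1]]
        hit = None
        if candidates:
            # Verification: confirm the predicted match with a full digest.
            digest = hashlib.sha1(data[pos:pos + chunk_size]).digest()
            for src_offset, src_digest in candidates:
                if src_digest == digest:
                    hit = src_offset
                    break
        if hit is not None:
            matches.append((pos, hit))
            pos += chunk_size  # skip past the duplicate chunk
        else:
            pos += 1  # no match here; slide the window by one byte
    return matches


source = b"ABCDEFGH" + b"12345678"
modified = b"xx" + b"12345678" + b"ABCDEFGH"
table = build_index_table(source)
matches = find_duplicates(table, modified)
```

In this toy run, only the two leading bytes of `modified` are not covered by a verified match, so they are the only data that would need to be transferred. Most window positions are rejected by the table lookup alone, which is where the claimed speed would come from.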

Original language: English
Pages (from-to): 415-424
Number of pages: 10
Journal: International Journal of Security and its Applications
Volume: 7
Issue number: 5
DOI: 10.14257/ijsia.2013.7.5.38
Publication status: Published - 2013 Oct 31

Fingerprint

Scalability
Data storage equipment
Experiments

Keywords

  • Anchor byte
  • Byte-index table
  • Chunk
  • Deduplication
  • Index-table

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Byte-index chunking algorithm for data deduplication system. / Lkhagvasuren, Ider; So, Jung Min; Lee, Jeong Gun; Yoo, Hyuck; Ko, Young Woong.

In: International Journal of Security and its Applications, Vol. 7, No. 5, 31.10.2013, p. 415-424.

Lkhagvasuren, Ider ; So, Jung Min ; Lee, Jeong Gun ; Yoo, Hyuck ; Ko, Young Woong. / Byte-index chunking algorithm for data deduplication system. In: International Journal of Security and its Applications. 2013 ; Vol. 7, No. 5. pp. 415-424.
@article{ad2f810b9f7b402faaf0278fb4f1326a,
title = "Byte-index chunking algorithm for data deduplication system",
keywords = "Anchor byte, Byte-index table, Chunk, Deduplication, Index-table",
author = "Ider Lkhagvasuren and So, {Jung Min} and Lee, {Jeong Gun} and Hyuck Yoo and Ko, {Young Woong}",
year = "2013",
month = "10",
day = "31",
doi = "10.14257/ijsia.2013.7.5.38",
language = "English",
volume = "7",
pages = "415--424",
journal = "International Journal of Security and its Applications",
issn = "1738-9976",
publisher = "Science and Engineering Research Support Society",
number = "5",

}

TY - JOUR

T1 - Byte-index chunking algorithm for data deduplication system

AU - Lkhagvasuren, Ider

AU - So, Jung Min

AU - Lee, Jeong Gun

AU - Yoo, Hyuck

AU - Ko, Young Woong

PY - 2013/10/31

Y1 - 2013/10/31

KW - Anchor byte

KW - Byte-index table

KW - Chunk

KW - Deduplication

KW - Index-table

UR - http://www.scopus.com/inward/record.url?scp=84886577332&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84886577332&partnerID=8YFLogxK

U2 - 10.14257/ijsia.2013.7.5.38

DO - 10.14257/ijsia.2013.7.5.38

M3 - Article

VL - 7

SP - 415

EP - 424

JO - International Journal of Security and its Applications

JF - International Journal of Security and its Applications

SN - 1738-9976

IS - 5

ER -