Privacy preserving large scale DNA read-mapping in MapReduce framework using FPGAs

Lei Xu, Hanyee Kim, Xi Wang, Weidong Shi, Taeweon Suh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Read-mapping, i.e., finding certain patterns in a long DNA sequence, is an important operation in molecular biology. It is widely used in a variety of biological analyses, including SNP discovery, genotyping, and personal genomics. As next-generation DNA sequencing machines generate enormous amounts of sequence data, implementing the read-mapping algorithm in the MapReduce framework and outsourcing the computation to the cloud is an attractive option. Data privacy becomes a major concern in this setting because DNA sequences are highly sensitive. Encryption may be used to protect the data, but it is very difficult for the cloud to process ciphertexts. In the MapReduce framework, even if values (the data to be processed) are protected by encryption, keys cannot be encrypted with semantically secure encryption schemes because doing so would interfere with the MapReduce scheduling mechanism. If the keys are left unprotected, however, attackers may extract useful information from them. We propose a solution that securely outsources read-mapping computations in the MapReduce framework by leveraging the inherent tamper-resistant properties of FPGAs. We also provide a method to protect the keys generated in this process. We implement our solution using FPGAs and apply it to some data sets. The security evaluation and experimental results show that with this method DNA sequence privacy is well protected and the extra cost is acceptable.
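The scheduling problem mentioned in the abstract can be illustrated with a small sketch. It is not the authors' scheme; it only shows why randomized (semantically secure) protection of MapReduce keys prevents the shuffle from grouping equal keys, while a deterministic keyed tag keeps grouping intact. All names here (`randomized_tag`, `deterministic_tag`, `shuffle`, `SECRET`, `RECORDS`) are hypothetical, and `randomized_tag` is a stand-in for real encryption that models only its "fresh randomness per call" property.

```python
import hashlib
import hmac
import os
from collections import defaultdict


def randomized_tag(secret: bytes, key: bytes) -> bytes:
    """Stand-in for a semantically secure scheme (assumption, not real encryption).

    A fresh random nonce is mixed in on every call, so the same key
    produces a different output each time.
    """
    nonce = os.urandom(16)
    return nonce + hmac.new(secret, nonce + key, hashlib.sha256).digest()


def deterministic_tag(secret: bytes, key: bytes) -> bytes:
    """Keyed deterministic tag: equal keys always map to equal outputs."""
    return hmac.new(secret, key, hashlib.sha256).digest()


def shuffle(pairs):
    """Toy MapReduce shuffle: group values by exact equality of their keys."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return dict(groups)


def group_count(tag_fn, secret, records):
    """Number of reducer groups formed after protecting every key with tag_fn."""
    return len(shuffle([(tag_fn(secret, k), v) for k, v in records]))


SECRET = b"demo-secret"  # hypothetical shared secret, for illustration only
RECORDS = [(b"ACGT", 1), (b"ACGT", 2), (b"TTGA", 3)]  # two distinct keys

if __name__ == "__main__":
    print("deterministic groups:", group_count(deterministic_tag, SECRET, RECORDS))
    print("randomized groups:", group_count(randomized_tag, SECRET, RECORDS))
```

With deterministic tags the two `ACGT` records land in the same group, as the framework expects. With randomized tags each record gets its own group, so the reducer never sees the values for one key together. Note that a deterministic tag still reveals which records share a key, which is one kind of leakage a key-protection method has to address.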

Original language: English
Title of host publication: Conference Digest - 24th International Conference on Field Programmable Logic and Applications, FPL 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Print): 9783000446450
DOIs: https://doi.org/10.1109/FPL.2014.6927414
Publication status: Published - 2014 Jan 1
Event: 24th International Conference on Field Programmable Logic and Applications, FPL 2014 - Munich, Germany
Duration: 2014 Sep 1 → 2014 Sep 5

Other

Other: 24th International Conference on Field Programmable Logic and Applications, FPL 2014
Country: Germany
City: Munich
Period: 14/9/1 → 14/9/5

Fingerprint

DNA sequences
Cryptography
Field programmable gate arrays (FPGA)
DNA
Data privacy
Molecular biology
Scheduling
Costs

ASJC Scopus subject areas

  • Computer Science Applications
  • Hardware and Architecture

Cite this

Xu, L., Kim, H., Wang, X., Shi, W., & Suh, T. (2014). Privacy preserving large scale DNA read-mapping in MapReduce framework using FPGAs. In Conference Digest - 24th International Conference on Field Programmable Logic and Applications, FPL 2014 [6927414]. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/FPL.2014.6927414
