Parallelly Running k-Nearest Neighbor Classification over Semantically Secure Encrypted Data in Outsourced Environments

Jeongsu Park, Dong Hoon Lee

Research output: Contribution to journal › Article

Abstract

Cloud services with powerful resources are widely used to manage exponentially increasing data and to carry out data mining on that data. However, data mining that involves queries can cause privacy problems by disclosing both the data and the query. Classification, one task in data mining, is used in a wide range of applications, and in this study we focus on k-nearest neighbor (kNN) to realize classification. Although several studies have already attempted to address the privacy problems associated with kNN computation in a cloud environment, the resulting schemes are still inefficient. In this paper, we propose a very efficient, privacy-preserving kNN classification (PkNC) over encrypted data. While the computation (encryptions/decryptions and exponentiations) and communication of the most efficient kNN classification proposed in prior studies is bounded by O(kln), that of the proposed PkNC is bounded by O(ln), where l is the domain size of the data and n is the number of data records. In experiments with the same dataset, the prior kNN classification took 12.02 to 55.5 minutes, whereas PkNC took 4.16 minutes. Furthermore, because PkNC can be carried out in parallel for each data record, its performance can improve substantially on a machine that supports more threads. PkNC protects the privacy of the dataset and of the input query, including the kNN result, and does not disclose any data access patterns. We propose several protocols that serve as building blocks for PkNC and formally prove their security. In particular, we propose efficient protocols that privately find the k largest or smallest elements in an array.
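
For orientation only, the plaintext computation that a scheme such as PkNC must reproduce over ciphertexts can be sketched as below. This is not the authors' protocol: it uses no encryption, hides no access patterns, and every name in it (squared_distance, knn_classify, the toy records) is a hypothetical choice made for illustration. It shows only the two steps the abstract mentions, selecting the k smallest distances and taking a majority vote over their labels.

```python
import heapq
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(records, labels, query, k):
    """Plaintext kNN classification (illustrative; no privacy protection).

    records: list of numeric tuples, labels: one class label per record.
    Returns the most common label among the k records closest to query.
    """
    if not 1 <= k <= len(records):
        raise ValueError("k must be between 1 and the number of records")
    # Step 1: find the indices of the k smallest distances.
    # Each distance depends on a single record, so this part can be
    # computed independently (and hence in parallel) per record.
    nearest = heapq.nsmallest(
        k,
        range(len(records)),
        key=lambda i: squared_distance(records[i], query),
    )
    # Step 2: majority vote over the labels of those k records.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for this example.
records = [(1, 1), (1, 2), (8, 8), (9, 8), (2, 1)]
labels = ["a", "a", "b", "b", "a"]
result_near_origin = knn_classify(records, labels, (2, 2), 3)
result_far_corner = knn_classify(records, labels, (8, 9), 3)
```

In a privacy-preserving setting, both the distance computation and the selection of the k smallest values would have to run on encrypted inputs without revealing which records were selected, which is where the cost bounds quoted in the abstract come from.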

Original language: English
Article number: 9051788
Pages (from-to): 64617-64633
Number of pages: 17
Journal: IEEE Access
Volume: 8
Publication status: Published - 2020 Jan 1

Keywords

  • cloud computing
  • k-nearest neighbor classification
  • privacy-preserving data mining

ASJC Scopus subject areas

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)
