Model Selection for Data Analysis in Encrypted Domain: Application to Simple Linear Regression

Mi Yeon Hong, Ji Won Yoon

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In the big data era, data scientists apply machine learning methods to observed data for prediction or classification. To be effective, machine learning requires access to raw data, which are often privacy-sensitive. In addition, whatever data and fitting procedures are employed, a crucial step is to select the most appropriate model for the given dataset. Model selection is a key ingredient in data analysis for reliable and reproducible statistical inference or prediction. To address this issue, we develop new techniques for running model selection over encrypted data. Our approach provides the best approximation of the relationship between the dependent and independent variables through cross-validation. Performing 4-fold cross-validation yields four estimates of the model's error; we then use the bias and variance extracted from these errors to find the best model. We perform an experiment on a dataset obtained from Kaggle and show that our approach can perform regression homomorphically on encrypted data without decrypting it.
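The selection procedure described in the abstract can be illustrated on plaintext data. The following is a minimal sketch only, not the authors' implementation: it performs no encryption, and the candidate models (a constant versus a straight line), the interleaved fold assignment, the helper names, and the scoring rule `bias**2 + variance` are all assumptions made for illustration. The abstract states only that bias and variance are extracted from the four fold errors.

```python
from statistics import mean, pvariance

def fit_constant(xs, ys):
    """Hypothetical baseline model: y = a (slope fixed at 0)."""
    return mean(ys), 0.0

def fit_line(xs, ys):
    """Ordinary least squares for simple linear regression y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def k_fold_errors(xs, ys, fit, k=4):
    """Return one held-out mean squared error per fold (k errors in total)."""
    errors = []
    for fold in range(k):
        # Assumption: folds are interleaved (every k-th sample is held out).
        test_idx = set(range(fold, len(xs), k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        a, b = fit(train_x, train_y)
        errors.append(mean((ys[i] - (a + b * xs[i])) ** 2 for i in test_idx))
    return errors

def select_model(xs, ys, candidates, k=4):
    """Pick the candidate with the smallest score over the k fold errors."""
    scores = {}
    for name, fit in candidates.items():
        errs = k_fold_errors(xs, ys, fit, k)
        bias = mean(errs)           # average fold error
        variance = pvariance(errs)  # spread of the fold errors
        # Assumption: this combination rule is illustrative, not from the paper.
        scores[name] = bias ** 2 + variance
    return min(scores, key=scores.get), scores

# Synthetic data standing in for a real dataset: y is roughly 2x + 1.
xs = [float(x) for x in range(12)]
ys = [2.0 * x + 1.0 + (0.1 if int(x) % 2 else -0.1) for x in xs]

errors_line = k_fold_errors(xs, ys, fit_line)
best, scores = select_model(xs, ys, {"constant": fit_constant, "linear": fit_line})
print(best, scores)
```

In the encrypted setting the same additions, multiplications, and comparisons would have to be evaluated homomorphically (the record lists TFHE as a keyword); how that is done is not specified in the abstract and is not shown here.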

Original language: English
Title of host publication: Information Security Applications - 20th International Conference, WISA 2019, Revised Selected Papers
Editors: Ilsun You
Publisher: Springer
Pages: 155-166
Number of pages: 12
ISBN (Print): 9783030393021
DOIs: https://doi.org/10.1007/978-3-030-39303-8_12
Publication status: Published - 2020 Jan 1
Event: 20th World Conference on Information Security Applications, WISA 2019 - Jeju Island, Korea, Republic of
Duration: 2019 Aug 21 to 2019 Aug 24

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11897 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th World Conference on Information Security Applications, WISA 2019
Country: Korea, Republic of
City: Jeju Island
Period: 19/8/21 to 19/8/24

Keywords

  • Fully Homomorphic Encryption
  • Model selection
  • TFHE

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Hong, M. Y., & Yoon, J. W. (2020). Model Selection for Data Analysis in Encrypted Domain: Application to Simple Linear Regression. In I. You (Ed.), Information Security Applications - 20th International Conference, WISA 2019, Revised Selected Papers (pp. 155-166). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11897 LNCS). Springer. https://doi.org/10.1007/978-3-030-39303-8_12