Securing a Local Training Dataset Size in Federated Learning

Young Ah Shin, Geontae Noh, Ik Rae Jeong, Ji Young Chun

Research output: Contribution to journal › Article › peer-review

Abstract

Federated learning (FL) is an emerging paradigm for training a global machine learning (ML) model on data that remain decentralized among clients and are never shared. Although FL is more secure than conventional centralized ML training, industries whose training data consist primarily of personal information, such as MRI images or Electronic Health Records (EHR), should be more cautious about privacy and security issues when using FL. For example, unbalanced dataset sizes may reveal meaningful information that can lead to security vulnerabilities even if the clients' training data are not exposed. In this paper, we present a Privacy-Preserving Federated Averaging (PP-FedAvg) protocol specialized for healthcare settings to limit the leakage of private user data in FL. In particular, we protect the dataset sizes as well as the aggregated local update parameters by having the clients compute securely among themselves using homomorphic encryption. This approach ensures that the server has access to neither the dataset sizes nor the local update parameters while it updates the global model. Our protocol protects the dataset sizes even when datasets are not uniformly distributed among clients and when some clients drop out in each iteration.
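To make the role of dataset sizes concrete: standard federated averaging weights each client's update by its local dataset size, so the aggregate only needs the sum of size-weighted updates and the sum of sizes. The sketch below is not the PP-FedAvg protocol from the paper. It is a toy illustration, under stated assumptions, of how an additively homomorphic scheme (here a textbook Paillier construction with deliberately tiny parameters) lets an aggregator combine those sums without seeing any individual size or update. The function names, the fixed-point `SCALE`, and the assumption that a separate key holder decrypts the aggregates are all hypothetical choices made for this example.

```python
import math
import random

# Toy Paillier parameters (assumption: two Mersenne primes, far too small for real use).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # modular inverse of LAM mod N (Python 3.8+)
SCALE = 10**6  # hypothetical fixed-point scale for real-valued updates


def encrypt(m):
    """Encrypt an integer m (possibly negative) with generator g = N + 1."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) % N2 * pow(r, N, N2) % N2


def decrypt(c):
    """Decrypt and map the result back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def add(c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return c1 * c2 % N2


def client_message(size, update):
    """A client encrypts its dataset size and its size-weighted update."""
    enc_size = encrypt(size)
    enc_weighted = [encrypt(round(size * w * SCALE)) for w in update]
    return enc_size, enc_weighted


def aggregate(messages):
    """The aggregator combines ciphertexts only; it never decrypts."""
    enc_total, enc_sums = messages[0]
    enc_sums = list(enc_sums)
    for enc_size, enc_weighted in messages[1:]:
        enc_total = add(enc_total, enc_size)
        enc_sums = [add(a, b) for a, b in zip(enc_sums, enc_weighted)]
    return enc_total, enc_sums


def finish(enc_total, enc_sums):
    """A key holder (assumed separate from the aggregator) decrypts the sums."""
    total = decrypt(enc_total)
    return total, [decrypt(c) / SCALE / total for c in enc_sums]


# Hypothetical unbalanced clients: sizes differ by more than an order of magnitude.
sizes = [120, 30, 850]
updates = [[0.5, -1.0], [0.1, 0.2], [-0.3, 0.4]]
messages = [client_message(n, w) for n, w in zip(sizes, updates)]
total, avg = finish(*aggregate(messages))
print(total, avg)
```

In this sketch the decrypting party still learns the total number of samples, but not how they are split among clients. Handling client dropout and deciding who holds the decryption key are exactly the parts a real protocol has to specify, and they are not modeled here.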

Original language: English
Pages (from-to): 104135-104143
Number of pages: 9
Journal: IEEE Access
Volume: 10
Publication status: Published - 2022

Keywords

  • Federated learning
  • homomorphic encryption
  • privacy-preserving
  • training dataset

ASJC Scopus subject areas

  • Computer Science(all)
  • Materials Science(all)
  • Engineering(all)
  • Electrical and Electronic Engineering
