What are the optimum quasi-identifiers to re-identify medical records?

Yong Ju Lee, Kyung Ho Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Medical records are increasingly shared online for medical research and to obtain expert opinions. Sharing them creates a problem: if someone identifies the subject of a record, by whatever method, the patient's privacy is invaded. Solving this problem requires carefully balancing data sharing against privacy, and de-identification techniques are used for that purpose. They have two limitations, however. First, de-identification may reduce data utility even as it lowers the risk of re-identification. Second, de-identified data can still be re-identified through inference from background knowledge. The objective of this paper is to analyze the probability of re-identification as a function of inferable quasi-identifiers. We analyzed inferable quasi-identifiers, that is, factors that can be inferred from background knowledge, and then estimated the probability of re-identification when those factors are exploited. As a result, we determined how the type and range of inferable quasi-identifiers affect re-identification. Through a comparative analysis of the probability of re-identification by type and range of inference, this paper supports decisions about which attributes to de-identify, and to what level, in order to protect patient privacy.
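The idea of comparing re-identification probability across the type and range of quasi-identifiers can be illustrated with a minimal sketch. It is not the authors' method: it assumes the common simplification that a record's risk is 1 divided by the number of records sharing its quasi-identifier values, and the records, attribute names, and the helpers `age_band` and `reid_risk` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records; none of these values come from the paper.
RECORDS = [
    {"age": 34, "sex": "F", "region": "Seoul"},
    {"age": 36, "sex": "F", "region": "Seoul"},
    {"age": 34, "sex": "M", "region": "Busan"},
    {"age": 52, "sex": "M", "region": "Seoul"},
    {"age": 58, "sex": "M", "region": "Seoul"},
    {"age": 34, "sex": "F", "region": "Seoul"},
]


def age_band(width):
    """Return a function that generalizes an age to the lower bound of its band."""
    return lambda age: age // width * width


def reid_risk(records, quasi_identifiers, generalize=None):
    """Estimate re-identification risk for a chosen set of quasi-identifiers.

    Assumption: a record's risk is 1 / (size of its equivalence class), where
    an equivalence class is the set of records with identical (optionally
    generalized) quasi-identifier values. Returns (average risk, maximum risk).
    """
    generalize = generalize or {}

    def key(record):
        # Apply the generalization function for each attribute, if one is given.
        return tuple(
            generalize.get(q, lambda value: value)(record[q])
            for q in quasi_identifiers
        )

    class_sizes = Counter(key(r) for r in records)
    per_record = [1 / class_sizes[key(r)] for r in records]
    return sum(per_record) / len(per_record), max(per_record)


if __name__ == "__main__":
    scenarios = {
        "sex only": (["sex"], None),
        "age + sex + region (exact age)": (["age", "sex", "region"], None),
        "age + sex + region (10-year bands)": (
            ["age", "sex", "region"],
            {"age": age_band(10)},
        ),
    }
    for label, (qis, gen) in scenarios.items():
        avg, worst = reid_risk(RECORDS, qis, gen)
        print(f"{label}: average={avg:.2f}, maximum={worst:.2f}")
```

In this toy data, adding attributes raises the average risk, while widening the age range lowers it again. That is the kind of type-versus-range comparison the abstract describes, although the paper's actual estimation procedure may differ.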

Original language: English
Title of host publication: IEEE 20th International Conference on Advanced Communication Technology
Subtitle of host publication: Opening New Era of Intelligent Things, ICACT 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1025-1033
Number of pages: 9
Volume: 2018-February
ISBN (Electronic): 9791188428007
DOIs: https://doi.org/10.23919/ICACT.2018.8323926
Publication status: Published - 2018 Mar 23
Event: 20th IEEE International Conference on Advanced Communication Technology, ICACT 2018 - Chuncheon, Korea, Republic of
Duration: 2018 Feb 11 to 2018 Feb 14

Other

Other: 20th IEEE International Conference on Advanced Communication Technology, ICACT 2018
Country: Korea, Republic of
City: Chuncheon
Period: 18/2/11 to 18/2/14

Keywords

  • De-identification
  • Medical records
  • Privacy
  • Re-identification

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Lee, Y. J., & Lee, K. H. (2018). What are the optimum quasi-identifiers to re-identify medical records? In IEEE 20th International Conference on Advanced Communication Technology: Opening New Era of Intelligent Things, ICACT 2018 (Vol. 2018-February, pp. 1025-1033). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.23919/ICACT.2018.8323926

@inproceedings{6af69ca69bc0436c8d55bb8b69114b40,
title = "What are the optimum quasi-identifiers to re-identify medical records?",
keywords = "De-identification, Medical records, Privacy, Re-identification",
author = "Lee, {Yong Ju} and Lee, {Kyung Ho}",
year = "2018",
month = "3",
day = "23",
doi = "10.23919/ICACT.2018.8323926",
language = "English",
volume = "2018-February",
pages = "1025--1033",
booktitle = "IEEE 20th International Conference on Advanced Communication Technology",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - What are the optimum quasi-identifiers to re-identify medical records?

AU - Lee, Yong Ju

AU - Lee, Kyung Ho

PY - 2018/3/23

Y1 - 2018/3/23

KW - De-identification

KW - Medical records

KW - Privacy

KW - Re-identification

UR - http://www.scopus.com/inward/record.url?scp=85046813819&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046813819&partnerID=8YFLogxK

U2 - 10.23919/ICACT.2018.8323926

DO - 10.23919/ICACT.2018.8323926

M3 - Conference contribution

AN - SCOPUS:85046813819

VL - 2018-February

SP - 1025

EP - 1033

BT - IEEE 20th International Conference on Advanced Communication Technology

PB - Institute of Electrical and Electronics Engineers Inc.

ER -