BCGAN-based over-sampling scheme for imbalanced data

Minjae Son, Seungwon Jung, Jihoon Moon, Eenjun Hwang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Classification is the process of identifying the class to which input data belong. One of the most popular approaches is to construct a classification model by training a machine learning algorithm on a given dataset. For good classification performance, the dataset should have a balanced distribution of data across classes. If the dataset is imbalanced, that is, one class (the minority class) has far fewer samples than the other (the majority class), the model has little chance to learn about the minority class, and training is biased toward the majority class. As a result, the model tends to assign any input to the majority class and does not handle minority-class data properly. To overcome this data imbalance problem, we propose a novel over-sampling scheme based on Borderline-Conditional Generative Adversarial Networks (BCGAN). Our BCGAN generates data for the minority class, particularly along the borderline between the majority class and the minority class. Through various experiments on real imbalanced datasets, we demonstrate the performance of our scheme.
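
The abstract does not say how the "borderline" region is determined, so the following is only an illustrative sketch under a stated assumption: it borrows the k-nearest-neighbour "danger" rule from Borderline-SMOTE (a minority sample is borderline when at least half, but not all, of its k nearest neighbours belong to the majority class). The function name `borderline_minority_mask`, the toy data, and the choice of `k` are hypothetical and are not taken from the paper. No GAN is trained here; the sketch only shows which minority samples a borderline-focused over-sampler might prioritise.

```python
import numpy as np

def borderline_minority_mask(X, y, minority_label, k=5):
    """Flag minority samples lying near the class boundary.

    Assumed rule (Borderline-SMOTE style, NOT confirmed as the paper's):
    a minority sample is borderline if k/2 <= m < k, where m is the
    number of majority samples among its k nearest neighbours.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    nn_idx = np.argsort(dist, axis=1)[:, :k]
    majority_count = (y[nn_idx] != minority_label).sum(axis=1)
    is_minority = y == minority_label
    in_danger = (majority_count >= k / 2) & (majority_count < k)
    return is_minority & in_danger

# Toy 1-D dataset: six majority samples (label 0), five minority (label 1).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0],   # majority
              [5.5], [6.0], [9.0], [9.5], [10.0]])        # minority
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

mask = borderline_minority_mask(X, y, minority_label=1, k=3)
print(X[mask].ravel().tolist())  # -> [5.5, 6.0]
```

In this toy example only the two minority samples adjacent to the majority cluster are flagged; the three isolated minority samples near 9 to 10 have no majority neighbours and are treated as safe.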

Original language: English
Title of host publication: Proceedings - 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020
Editors: Wookey Lee, Luonan Chen, Yang-Sae Moon, Julien Bourgeois, Mehdi Bennis, Yu-Feng Li, Young-Guk Ha, Hyuk-Yoon Kwon, Alfredo Cuzzocrea
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 155-160
Number of pages: 6
ISBN (Electronic): 9781728160344
Publication status: Published - 2020 Feb 1
Event: 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020 - Busan, Korea, Republic of
Duration: 2020 Feb 19 - 2020 Feb 22

Publication series

Name: Proceedings - 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020

Conference

Conference: 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020
Country: Korea, Republic of
City: Busan
Period: 20/2/19 - 20/2/22

Keywords

  • BCGAN
  • CGAN
  • Imbalanced data
  • Over-sampling

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems and Management
  • Control and Optimization
