Improving SSH detection model using IPA time and WGAN-GP

Junwon Lee, Heejo Lee

Research output: Contribution to journal › Article › peer-review

Abstract

In machine learning-based detection models, accuracy tends to improve with the quantity and quality of the training dataset. The performance of a machine learning-based SSH detection model is affected by the size of the training dataset and the ratio of target classes. However, in a real network environment it is difficult to collect a sufficiently large and diverse training dataset within a short period. Even when many training samples are collected, preparing the training dataset through data classification takes considerable time and effort. To overcome these limitations, we generate sophisticated samples using the WGAN-GP algorithm and present a method for selecting samples by comparing generator loss. The synthetic training dataset with generated samples improves the performance of the SSH detection model. Furthermore, we add new features that capture distinctions in inter-packet arrival time. By applying the synthetic dataset and the inter-packet arrival time features, the enhanced SSH detection model reduces false positives and achieves an F1-score of 0.999.
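
As a rough illustration of what inter-packet arrival time features can look like for session-based data, the sketch below summarises the gaps between consecutive packet timestamps in one session. The abstract does not say which statistics the authors use, so the function name `inter_arrival_features` and the chosen statistics (count, min, max, mean, standard deviation) are assumptions made for illustration only, not the paper's feature set.

```python
from statistics import mean, pstdev


def inter_arrival_features(timestamps):
    """Summarise gaps between consecutive packet timestamps (in seconds).

    Hypothetical helper: the statistics chosen here are an assumption,
    not the feature set described in the paper.
    """
    ordered = sorted(timestamps)
    # Gap between each packet and the one before it.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        # A session with fewer than two packets has no inter-arrival gaps.
        return {"count": 0, "min": 0.0, "max": 0.0, "mean": 0.0, "std": 0.0}
    return {
        "count": len(gaps),
        "min": min(gaps),
        "max": max(gaps),
        "mean": mean(gaps),
        "std": pstdev(gaps),  # population standard deviation of the gaps
    }


if __name__ == "__main__":
    # Three packets observed at 0 s, 1 s and 3 s within one session.
    print(inter_arrival_features([0.0, 1.0, 3.0]))
```

Values like these could be appended to each session's feature vector before training a classifier such as a random forest, which appears among the keywords of this record.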

Original language: English
Article number: 102672
Journal: Computers and Security
Volume: 116
Publication status: Published - May 2022

Keywords

  • GAN
  • Generator loss
  • Inter-packet arrival time
  • PCA
  • Random forest
  • Session-based data
  • SSH detection
  • WGAN-GP

ASJC Scopus subject areas

  • Computer Science (all)
  • Law
