Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets

Feng Zhao, Islem Rekik, Seong Whan Lee, Jing Liu, Junying Zhang, Dinggang Shen, Jose Garcia-Rodriguez

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

As a powerful nonlinear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in batch mode, which can cause problems when handling massive or online datasets. To overcome this drawback, we propose a two-phase incremental KPCA (TP-IKPCA) algorithm that incorporates data into KPCA incrementally. In the first phase, an incremental algorithm is developed to explicitly express the data in the kernel space. In the second phase, we extend an incremental principal component analysis (IPCA) algorithm to estimate the kernel principal components. Extensive experiments on both synthetic and real datasets show that the proposed TP-IKPCA produces principal components similar to those of conventional batch-based KPCA but is computationally faster than KPCA and several of its incremental variants. Therefore, our algorithm can be applied to massive or online datasets where the batch method cannot be used.
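
The two-phase structure described in the abstract can be illustrated with a rough sketch. This is not the authors' TP-IKPCA algorithm, whose details are not given here. As stated assumptions: phase 1 is stood in for by a landmark-based (Nyström-style) feature map built from a fixed set of points, and phase 2 by a plain running mean and scatter update followed by an eigendecomposition, rather than the extended IPCA of the paper. The names `rbf_kernel`, `LandmarkMap`, `RunningPCA`, and the parameters `gamma` and `eps` are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


class LandmarkMap:
    """Phase 1 stand-in (assumption): explicit kernel-space coordinates
    obtained from a fixed set of landmark points, Nystrom-style."""

    def __init__(self, landmarks, gamma=0.5, eps=1e-8):
        self.landmarks = landmarks
        self.gamma = gamma
        w, V = np.linalg.eigh(rbf_kernel(landmarks, landmarks, gamma))
        keep = w > eps  # drop numerically null directions
        self.proj = V[:, keep] / np.sqrt(w[keep])

    def transform(self, X):
        return rbf_kernel(X, self.landmarks, self.gamma) @ self.proj


class RunningPCA:
    """Phase 2 stand-in (assumption): PCA from a mean and scatter matrix
    that are updated one chunk at a time."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.scatter = np.zeros((dim, dim))

    def partial_fit(self, Z):
        m = Z.shape[0]
        batch_mean = Z.mean(0)
        Zc = Z - batch_mean
        delta = batch_mean - self.mean
        total = self.n + m
        # Merge the chunk's scatter with the running scatter.
        self.scatter += Zc.T @ Zc + np.outer(delta, delta) * self.n * m / total
        self.mean += delta * m / total
        self.n = total

    def components(self, k):
        w, V = np.linalg.eigh(self.scatter / self.n)
        order = np.argsort(w)[::-1][:k]
        return w[order], V[:, order]


# Usage: stream synthetic data in chunks of 50 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
fmap = LandmarkMap(X[:20])           # landmarks taken from the first samples
pca = RunningPCA(fmap.proj.shape[1])
for start in range(0, len(X), 50):
    pca.partial_fit(fmap.transform(X[start:start + 50]))
eigvals, eigvecs = pca.components(3)
```

In this sketch the full kernel matrix over all samples is never formed; only the small landmark kernel matrix and one chunk at a time are held in memory, which is the practical motivation the abstract gives for an incremental method.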

Original language: English
Article number: 5937274
Journal: Complexity
Volume: 2019
DOI: 10.1155/2019/5937274
Publication status: Published - 2019 Jan 1

Fingerprint

Principal component analysis
Learning systems

ASJC Scopus subject areas

  • General

Cite this

Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets. / Zhao, Feng; Rekik, Islem; Lee, Seong Whan; Liu, Jing; Zhang, Junying; Shen, Dinggang; Garcia-Rodriguez, Jose.

In: Complexity, Vol. 2019, 5937274, 01.01.2019.

Zhao, Feng ; Rekik, Islem ; Lee, Seong Whan ; Liu, Jing ; Zhang, Junying ; Shen, Dinggang ; Garcia-Rodriguez, Jose. / Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets. In: Complexity. 2019 ; Vol. 2019.
@article{513dad2f2b9044edacbbe72c3e29439f,
title = "Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets",
abstract = "As a powerful nonlinear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in a batch mode, leading to some potential problems when handling massive or online datasets. To overcome this drawback of KPCA, in this paper, we propose a two-phase incremental KPCA (TP-IKPCA) algorithm which can incorporate data into KPCA in an incremental fashion. In the first phase, an incremental algorithm is developed to explicitly express the data in the kernel space. In the second phase, we extend an incremental principal component analysis (IPCA) to estimate the kernel principal components. Extensive experimental results on both synthesized and real datasets showed that the proposed TP-IKPCA produces similar principal components as conventional batch-based KPCA but is computationally faster than KPCA and its several incremental variants. Therefore, our algorithm can be applied to massive or online datasets where the batch method is not available.",
author = "Feng Zhao and Islem Rekik and Lee, {Seong Whan} and Jing Liu and Junying Zhang and Dinggang Shen and Jose Garcia-Rodriguez",
year = "2019",
month = "1",
day = "1",
doi = "10.1155/2019/5937274",
language = "English",
volume = "2019",
journal = "Complexity",
issn = "1076-2787",
publisher = "John Wiley and Sons Inc.",

}

TY - JOUR

T1 - Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets

AU - Zhao, Feng

AU - Rekik, Islem

AU - Lee, Seong Whan

AU - Liu, Jing

AU - Zhang, Junying

AU - Shen, Dinggang

AU - Garcia-Rodriguez, Jose

PY - 2019/1/1

Y1 - 2019/1/1

AB - As a powerful nonlinear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in a batch mode, leading to some potential problems when handling massive or online datasets. To overcome this drawback of KPCA, in this paper, we propose a two-phase incremental KPCA (TP-IKPCA) algorithm which can incorporate data into KPCA in an incremental fashion. In the first phase, an incremental algorithm is developed to explicitly express the data in the kernel space. In the second phase, we extend an incremental principal component analysis (IPCA) to estimate the kernel principal components. Extensive experimental results on both synthesized and real datasets showed that the proposed TP-IKPCA produces similar principal components as conventional batch-based KPCA but is computationally faster than KPCA and its several incremental variants. Therefore, our algorithm can be applied to massive or online datasets where the batch method is not available.

UR - http://www.scopus.com/inward/record.url?scp=85062375049&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062375049&partnerID=8YFLogxK

U2 - 10.1155/2019/5937274

DO - 10.1155/2019/5937274

M3 - Article

VL - 2019

JO - Complexity

JF - Complexity

SN - 1076-2787

M1 - 5937274

ER -