Semi-Supervised Discriminative Classification Robust to Sample-Outliers and Feature-Noises

Ehsan Adeli, Kim Han Thung, Le An, Guorong Wu, Feng Shi, Tao Wang, Dinggang Shen

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Discriminative methods commonly produce models with relatively good generalization abilities. However, this advantage is challenged in real-world applications (e.g., medical image analysis problems), which often contain outlier data points (sample-outliers) and noise in the predictor values (feature-noises). Methods robust to both types of deviation are somewhat overlooked in the literature. We further argue that denoising can be more effective if we learn the model using all the available labeled and unlabeled samples, as the intrinsic geometry of the sample manifold can be better constructed using more data points. In this paper, we propose a semi-supervised robust discriminative classification method based on the least-squares formulation of linear discriminant analysis to detect sample-outliers and feature-noises simultaneously, using both labeled training and unlabeled testing data. We conduct several experiments on a synthetic dataset, several benchmark semi-supervised learning datasets, and two brain neurodegenerative disease diagnosis datasets (for Parkinson's and Alzheimer's diseases). For neurodegenerative disease diagnosis in particular, incorporating robust machine learning methods can be of great benefit, due to the noisy nature of neuroimaging data. Our results show that our method outperforms the baseline and several state-of-the-art methods in terms of both accuracy and the area under the ROC curve.
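The abstract builds on the least-squares formulation of linear discriminant analysis. As a rough illustration of that starting point only, the sketch below checks numerically that, for two classes, an ordinary least-squares fit to class-indicator targets gives a projection direction parallel to the classical Fisher direction. This is not the authors' method: it omits the semi-supervised, sample-outlier, and feature-noise components, and the function names, the ±1 target coding, and the synthetic Gaussian data are all assumptions made for this example.

```python
import numpy as np

def ls_lda_direction(X, y):
    """Least-squares regression of centered features onto centered class targets.

    Assumes binary labels in {0, 1}; the +1/-1 target coding is an
    illustrative choice, not taken from the paper.
    """
    Xc = X - X.mean(axis=0)
    t = np.where(y == 1, 1.0, -1.0)
    tc = t - t.mean()
    w, *_ = np.linalg.lstsq(Xc, tc, rcond=None)
    return w

def fisher_direction(X, y):
    """Classical two-class Fisher direction: Sw^{-1} (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    d0 = X0 - X0.mean(axis=0)
    d1 = X1 - X1.mean(axis=0)
    Sw = d0.T @ d0 + d1.T @ d1  # within-class scatter matrix
    return np.linalg.solve(Sw, X1.mean(axis=0) - X0.mean(axis=0))

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical synthetic data: two Gaussian classes in five dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 5)), rng.normal(1.0, 1.0, (40, 5))])
y = np.array([0] * 60 + [1] * 40)

w_ls = ls_lda_direction(X, y)
w_fd = fisher_direction(X, y)
print("cosine between directions:", cosine(w_ls, w_fd))
```

The two vectors differ only by a positive scale factor, so the printed cosine should equal 1 up to floating-point error. Casting LDA as a regression problem in this way is what makes it natural to attach extra penalty or error terms to the objective, which is presumably where a robust variant would depart from this plain version.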

Original language: English
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
DOIs: 10.1109/TPAMI.2018.2794470
Publication status: Accepted/In press - 2018 Jan 16

Fingerprint

Neurodegenerative diseases
Outlier
Noise
Neuroimaging
Supervised learning
Medical applications
Discriminant analysis
Image analysis
Learning systems
Brain
Neurodegenerative Diseases
Geometry
Medical Image Analysis
Testing
Semi-supervised Learning
Sample point
Receiver Operating Characteristic Curve
Robust Methods
Benchmarking
Denoising

Keywords

  • Alzheimer's disease
  • biomarker identification
  • Biomedical imaging
  • Data models
  • disease diagnosis
  • Diseases
  • feature selection
  • Linear discriminant analysis
  • Noise reduction
  • nuclear norm
  • Parkinson's disease
  • regularization
  • robust classification
  • Robustness
  • sample outlier detection
  • semi-supervised learning
  • Testing
  • Training

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computational Theory and Mathematics
  • Artificial Intelligence
  • Applied Mathematics

Cite this

Semi-Supervised Discriminative Classification Robust to Sample-Outliers and Feature-Noises. / Adeli, Ehsan; Thung, Kim Han; An, Le; Wu, Guorong; Shi, Feng; Wang, Tao; Shen, Dinggang.

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 16.01.2018.

@article{64c8aefe368c418c923ee6fc840c9f68,
title = "Semi-Supervised Discriminative Classification Robust to Sample-Outliers and Feature-Noises",
abstract = "Discriminative methods commonly produce models with relatively good generalization abilities. However, this advantage is challenged in real-world applications (e.g., medical image analysis problems), in which there often exist outlier data points (sample-outliers) and noises in the predictor values (feature-noises). Methods robust to both types of these deviations are somewhat overlooked in the literature. We further argue that denoising can be more effective, if we learn the model using all the available labeled and unlabeled samples, as the intrinsic geometry of the sample manifold can be better constructed using more data points. In this paper, we propose a semi-supervised robust discriminative classification method based on the least-squares formulation of linear discriminant analysis to detect sample-outliers and feature-noises simultaneously, using both labeled training and unlabeled testing data. We conduct several experiments on a synthetic, some benchmark semi-supervised learning, and two brain neurodegenerative disease diagnosis datasets (for Parkinson's and Alzheimer's diseases). Specifically for the application of neurodegenerative diseases diagnosis, incorporating robust machine learning methods can be of great benefit, due to the noisy nature of neuroimaging data. Our results show that our method outperforms the baseline and several state-of-the-art methods, in terms of both accuracy and the area under the ROC curve.",
keywords = "Alzheimer's disease, biomarker identification, Biomedical imaging, Data models, disease diagnosis, Diseases, feature selection, Linear discriminant analysis, Noise reduction, nuclear norm, Parkinson's disease, regularization, robust classification, Robustness, sample outlier detection, semi-supervised learning, Testing, Training",
author = "Ehsan Adeli and Thung, {Kim Han} and Le An and Guorong Wu and Feng Shi and Tao Wang and Dinggang Shen",
year = "2018",
month = "1",
day = "16",
doi = "10.1109/TPAMI.2018.2794470",
language = "English",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",

}

TY - JOUR

T1 - Semi-Supervised Discriminative Classification Robust to Sample-Outliers and Feature-Noises

AU - Adeli, Ehsan

AU - Thung, Kim Han

AU - An, Le

AU - Wu, Guorong

AU - Shi, Feng

AU - Wang, Tao

AU - Shen, Dinggang

PY - 2018/1/16

Y1 - 2018/1/16

AB - Discriminative methods commonly produce models with relatively good generalization abilities. However, this advantage is challenged in real-world applications (e.g., medical image analysis problems), in which there often exist outlier data points (sample-outliers) and noises in the predictor values (feature-noises). Methods robust to both types of these deviations are somewhat overlooked in the literature. We further argue that denoising can be more effective, if we learn the model using all the available labeled and unlabeled samples, as the intrinsic geometry of the sample manifold can be better constructed using more data points. In this paper, we propose a semi-supervised robust discriminative classification method based on the least-squares formulation of linear discriminant analysis to detect sample-outliers and feature-noises simultaneously, using both labeled training and unlabeled testing data. We conduct several experiments on a synthetic, some benchmark semi-supervised learning, and two brain neurodegenerative disease diagnosis datasets (for Parkinson's and Alzheimer's diseases). Specifically for the application of neurodegenerative diseases diagnosis, incorporating robust machine learning methods can be of great benefit, due to the noisy nature of neuroimaging data. Our results show that our method outperforms the baseline and several state-of-the-art methods, in terms of both accuracy and the area under the ROC curve.

KW - Alzheimer's disease

KW - biomarker identification

KW - Biomedical imaging

KW - Data models

KW - disease diagnosis

KW - Diseases

KW - feature selection

KW - Linear discriminant analysis

KW - Noise reduction

KW - nuclear norm

KW - Parkinson's disease

KW - regularization

KW - robust classification

KW - Robustness

KW - sample outlier detection

KW - semi-supervised learning

KW - Testing

KW - Training

UR - http://www.scopus.com/inward/record.url?scp=85041421672&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041421672&partnerID=8YFLogxK

U2 - 10.1109/TPAMI.2018.2794470

DO - 10.1109/TPAMI.2018.2794470

M3 - Article

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

ER -