Improving cancer classification accuracy using gene pairs

Pankaj Chopra, Jinseung Lee, Jaewoo Kang, Sunwon Lee

Research output: Contribution to journal › Article

38 Citations (Scopus)

Abstract

Recent studies suggest that the deregulation of pathways, rather than individual genes, may be critical in triggering carcinogenesis. The pathway deregulation is often caused by the simultaneous deregulation of more than one gene in the pathway. This suggests that robust gene pair combinations may exploit the underlying bio-molecular reactions that are relevant to the pathway deregulation and thus they could provide better biomarkers for cancer, as compared to individual genes. In order to validate this hypothesis, in this paper, we used gene pair combinations, called doublets, as input to the cancer classification algorithms, instead of the original expression values, and we showed that the classification accuracy was consistently improved across different datasets and classification algorithms. We validated the proposed approach using nine cancer datasets and five classification algorithms including Prediction Analysis for Microarrays (PAM), C4.5 Decision Trees (DT), Naive Bayesian (NB), Support Vector Machine (SVM), and k-Nearest Neighbor (k-NN).
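The abstract does not say how a doublet is computed from its two genes, so the following is only a minimal sketch under stated assumptions: a doublet is taken to be the sum or the difference of two genes' expression values, the data are a hand-made toy matrix rather than any of the nine datasets, and the classifier is a plain leave-one-out 1-nearest-neighbor. The names `make_doublets` and `loo_1nn_accuracy` are hypothetical, and the single doublet used at the end is hand-picked for illustration, not chosen by any selection procedure from the paper.

```python
from itertools import combinations

import numpy as np


def make_doublets(X, kind="diff"):
    """Build one feature per unordered gene pair (i, j) with i < j.

    X is an (n_samples, n_genes) expression matrix.
    ASSUMPTION: the pair is combined by difference or sum; the abstract
    does not define the combination actually used.
    """
    pairs = list(combinations(range(X.shape[1]), 2))
    i_idx = np.array([p[0] for p in pairs])
    j_idx = np.array([p[1] for p in pairs])
    if kind == "diff":
        D = X[:, i_idx] - X[:, j_idx]
    elif kind == "sum":
        D = X[:, i_idx] + X[:, j_idx]
    else:
        raise ValueError(f"unknown doublet kind: {kind!r}")
    return D, pairs


def loo_1nn_accuracy(F, y):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier (Euclidean)."""
    dists = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample may not be its own neighbor
    nearest = dists.argmin(axis=1)
    return float(np.mean(y[nearest] == y))


# Toy data (hypothetical): 4 samples x 3 genes. The class is encoded only in
# the relationship between gene 0 and gene 1, not in either gene alone.
X = np.array([
    [5.0, 4.0, 1.0],
    [1.0, 0.0, 2.0],
    [4.0, 5.0, 1.5],
    [0.0, 1.0, 2.5],
])
y = np.array([1, 1, 0, 0])

D, pairs = make_doublets(X, kind="diff")

print("gene pairs:", pairs)
print("accuracy on raw expression values:", loo_1nn_accuracy(X, y))
print("accuracy on the (gene 0, gene 1) doublet:", loo_1nn_accuracy(D[:, [0]], y))
```

The toy matrix is constructed so that the pairwise difference separates the classes while the raw values do not, so it only illustrates the idea and says nothing about the size of the improvement reported in the paper.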

Original language: English
Article number: e14305
Journal: PLoS One
Volume: 5
Issue number: 12
DOI: 10.1371/journal.pone.0014305
Publication status: Published - 2010 Dec 1

Fingerprint

Deregulation
Genes
Taxonomy
Neoplasms
Decision Trees
Microarray Analysis
Microarrays
Tumor Biomarkers
Carcinogenesis
Support Vector Machines
Biomarkers
Prediction
Datasets

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Improving cancer classification accuracy using gene pairs. / Chopra, Pankaj; Lee, Jinseung; Kang, Jaewoo; Lee, Sunwon.

In: PLoS One, Vol. 5, No. 12, e14305, 01.12.2010.

@article{42b0ffebba6c4de1aa942a7e0f0c149c,
title = "Improving cancer classification accuracy using gene pairs",
author = "Pankaj Chopra and Jinseung Lee and Jaewoo Kang and Sunwon Lee",
year = "2010",
month = "12",
day = "1",
doi = "10.1371/journal.pone.0014305",
language = "English",
volume = "5",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "12",
}

TY - JOUR

T1 - Improving cancer classification accuracy using gene pairs

AU - Chopra, Pankaj

AU - Lee, Jinseung

AU - Kang, Jaewoo

AU - Lee, Sunwon

PY - 2010/12/1

Y1 - 2010/12/1

UR - http://www.scopus.com/inward/record.url?scp=78650982063&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78650982063&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0014305

DO - 10.1371/journal.pone.0014305

M3 - Article

C2 - 21200431

AN - SCOPUS:78650982063

VL - 5

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 12

M1 - e14305

ER -