Employing gene set top scoring pairs to identify deregulated pathway-signatures in dilated cardiomyopathy from integrated microarray gene expression data

Aik-Choon Tan

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

It is well accepted that a set of genes must act in concert to drive various cellular processes. However, under different biological phenotypes, not all members of a gene set will participate in a given biological process. Hence, it is useful to construct a discriminative classifier by focusing on the core members (a subset) of a highly informative gene set. Such analyses can reveal which subsets of the same gene set correspond to different biological phenotypes. In this study, we propose the Gene Set Top Scoring Pairs (GSTSP) approach, which exploits the simple yet powerful relative expression reversal concept at the gene set level to achieve these goals. To illustrate the usefulness of GSTSP, we apply the method to five different human heart failure gene expression data sets. We take advantage of the direct data integration feature of the GSTSP approach to combine two data sets, identify a discriminative gene set from more than 190 predefined gene sets, and evaluate the predictive power of the GSTSP classifier derived from this informative gene set on three independent test sets (79.31% test accuracy). The discriminative gene pairs identified in this study may provide new biological understanding of the disturbed pathways involved in the development of heart failure. The GSTSP methodology is general-purpose and is applicable to a variety of phenotypic classification problems using gene expression data.
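The relative expression reversal idea mentioned in the abstract can be illustrated with a small sketch. It scores each gene pair within one gene set by how strongly the ordering of the two genes differs between two classes, using the classic top scoring pairs score |P(gene i < gene j | class 0) - P(gene i < gene j | class 1)|. This is only an illustration under stated assumptions: the function names (`tsp_scores`, `predict`), the data layout, and the toy values are hypothetical, and the abstract does not specify how GSTSP selects or combines pairs, so the code should not be read as the chapter's implementation.

```python
from itertools import combinations


def tsp_scores(samples, labels, gene_set):
    """Score every gene pair in gene_set by how often its ordering flips between classes.

    samples: list of dicts mapping gene name -> expression value (one dict per sample)
    labels:  list of 0/1 class labels, same length as samples
    (Hypothetical helper; the data layout is an assumption made for this sketch.)
    """
    scores = {}
    for gi, gj in combinations(sorted(gene_set), 2):
        freq = []
        for cls in (0, 1):
            members = [s for s, y in zip(samples, labels) if y == cls]
            # Fraction of samples in this class where gi is expressed below gj.
            freq.append(sum(s[gi] < s[gj] for s in members) / len(members))
        scores[(gi, gj)] = abs(freq[0] - freq[1])
    return scores


def predict(sample, pair, low_class):
    """Vote of a single pair: low_class if the first gene is below the second, else the other class."""
    gi, gj = pair
    return low_class if sample[gi] < sample[gj] else 1 - low_class


# Toy data, made up for illustration: A < B in class 0, reversed in class 1.
samples = [
    {"A": 1.0, "B": 2.0, "C": 5.0},
    {"A": 0.5, "B": 3.0, "C": 4.0},
    {"A": 4.0, "B": 1.0, "C": 6.0},
    {"A": 9.0, "B": 2.0, "C": 9.5},
]
labels = [0, 0, 1, 1]

scores = tsp_scores(samples, labels, {"A", "B", "C"})
best_pair = max(scores, key=scores.get)  # the pair whose ordering reverses most consistently
```

Because the score depends only on the ordering of two genes within each sample, it is unchanged by any per-sample monotone rescaling of the expression values. That property is one plausible reading of why samples from different data sets can be pooled directly, as the abstract describes.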

Original language: English
Title of host publication: Methods in Molecular Biology
Pages: 345-361
Number of pages: 17
Volume: 802
DOIs: 10.1007/978-1-61779-400-1_23
Publication status: Published - 2012 Jan 2
Externally published: Yes

Publication series

Name: Methods in Molecular Biology
Volume: 802
ISSN (Print): 1064-3745

Fingerprint

Dilated Cardiomyopathy
Gene Expression
Genes
Heart Failure
Phenotype
Biological Phenomena

Keywords

  • Gene expression
  • Gene set analysis
  • Microarray
  • Relative expression classifier
  • Top scoring pairs

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics

Cite this

Employing gene set top scoring pairs to identify deregulated pathway-signatures in dilated cardiomyopathy from integrated microarray gene expression data. / Tan, Aik-Choon.

Methods in Molecular Biology. Vol. 802 2012. p. 345-361 (Methods in Molecular Biology; Vol. 802).

@inbook{b6ec9b72bf8f459d9103a22dedaf576a,
title = "Employing gene set top scoring pairs to identify deregulated pathway-signatures in dilated cardiomyopathy from integrated microarray gene expression data",
keywords = "Gene expression, Gene set analysis, Microarray, Relative expression classifier, Top scoring pairs",
author = "Aik-Choon Tan",
year = "2012",
month = "1",
day = "2",
doi = "10.1007/978-1-61779-400-1_23",
language = "English",
isbn = "9781617793998",
volume = "802",
series = "Methods in Molecular Biology",
pages = "345--361",
booktitle = "Methods in Molecular Biology",

}
