Specificity rule discovery in HIV-1 protease cleavage site analysis

Hyeoncheol Kim, Yiying Zhang, Yong Seok Heo, Heung Bum Oh, Su Shing Chen

Research output: Contribution to journal (Article)

13 Citations (Scopus)

Abstract

Several machine learning algorithms have recently been applied to modeling the specificity of HIV-1 protease. The problem is challenging for three reasons: (1) datasets with high dimensionality and a small number of samples can mislead classification modeling and its interpretation; (2) symbolic interpretation is desirable because it provides insight into the specificity in the form of human-understandable rules, and thus helps in designing effective HIV inhibitors; (3) the interpretation should take into account the complexity of, or dependency between, positions in sequences. Therefore, it is necessary to investigate multivariate and feature-selective methods to model the specificity and to extract rules from the model. We have extensively tested various machine learning methods and found that the combination of neural networks and a decompositional approach can generate a set of effective rules. When validated against experimental results for the HIV-1 protease, these specificity rules outperform those generated by frequency-based, univariate, or black-box methods.
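To make points (1) and (3) of the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes that substrates are represented as fixed-length peptide windows (eight residues here, an assumption not stated in the abstract) and that each position is one-hot encoded over the 20 standard amino acids, which is one common way such data become high-dimensional. The function names, the window length, and the example rule are all hypothetical and chosen only for illustration; the rule is not taken from the paper.

```python
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PEPTIDE_LENGTH = 8  # assumed window length (hypothetical, not from the abstract)


def one_hot(peptide: str) -> List[int]:
    """Encode a peptide as a flat 0/1 vector: one block of 20 bits per position."""
    if len(peptide) != PEPTIDE_LENGTH:
        raise ValueError(f"expected {PEPTIDE_LENGTH} residues, got {len(peptide)}")
    vector = [0] * (PEPTIDE_LENGTH * len(AMINO_ACIDS))
    for position, residue in enumerate(peptide):
        # str.index raises ValueError for a non-standard residue letter
        vector[position * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)] = 1
    return vector


def matches_rule(peptide: str, rule: Dict[int, str]) -> bool:
    """Check a conjunctive rule: every listed position must hold an allowed residue.

    `rule` maps a 0-based position to a string of allowed residues, so a rule
    with two or more keys expresses a dependency between positions.
    """
    return all(peptide[position] in allowed for position, allowed in rule.items())


# Purely illustrative multivariate rule (NOT a rule reported in the paper):
# position 3 must be F, L, or Y AND position 5 must be E or Q.
HYPOTHETICAL_RULE = {3: "FLY", 5: "EQ"}

if __name__ == "__main__":
    encoded = one_hot("AAAFAEAA")
    print(len(encoded), sum(encoded))
    print(matches_rule("AAAFAEAA", HYPOTHETICAL_RULE))
    print(matches_rule("AAAFAAAA", HYPOTHETICAL_RULE))
```

Under these assumptions each eight-residue peptide becomes a 160-dimensional binary vector, which shows why a small sample set is easily outnumbered by features. A rule that constrains two positions at once is the kind of multivariate statement that a univariate, per-position frequency analysis cannot express.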

Original language: English
Pages (from-to): 71-78
Number of pages: 8
Journal: Computational Biology and Chemistry
Volume: 32
Issue number: 1
DOI: 10.1016/j.compbiolchem.2007.09.006
Publication status: Published - 2008 Feb 1

Fingerprint

protease
human immunodeficiency virus
Specificity
Learning systems
cleavage
machine learning
Learning algorithms
Neural networks
Black Box
Modeling
HIV
inhibitors
Univariate
Dimensionality
boxes

Keywords

  • HIV-1 cleavage site prediction rule discovery

ASJC Scopus subject areas

  • Biochemistry
  • Structural Biology
  • Analytical Chemistry
  • Physical and Theoretical Chemistry

Cite this

Specificity rule discovery in HIV-1 protease cleavage site analysis. / Kim, Hyeoncheol; Zhang, Yiying; Heo, Yong Seok; Oh, Heung Bum; Chen, Su Shing.

In: Computational Biology and Chemistry, Vol. 32, No. 1, 01.02.2008, p. 71-78.

@article{464d99f22bc3401f85505e67cdeaa899,
title = "Specificity rule discovery in HIV-1 protease cleavage site analysis",
keywords = "HIV-1 cleavage site prediction rule discovery",
author = "Kim, Hyeoncheol and Zhang, Yiying and Heo, {Yong Seok} and Oh, {Heung Bum} and Chen, {Su Shing}",
year = "2008",
month = "2",
day = "1",
doi = "10.1016/j.compbiolchem.2007.09.006",
language = "English",
volume = "32",
pages = "71--78",
journal = "Computational Biology and Chemistry",
issn = "1476-9271",
publisher = "Elsevier Limited",
number = "1",

}

TY - JOUR
T1 - Specificity rule discovery in HIV-1 protease cleavage site analysis
AU - Kim, Hyeoncheol
AU - Zhang, Yiying
AU - Heo, Yong Seok
AU - Oh, Heung Bum
AU - Chen, Su Shing
PY - 2008/2/1
Y1 - 2008/2/1
KW - HIV-1 cleavage site prediction rule discovery
UR - http://www.scopus.com/inward/record.url?scp=37549048173&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=37549048173&partnerID=8YFLogxK
U2 - 10.1016/j.compbiolchem.2007.09.006
DO - 10.1016/j.compbiolchem.2007.09.006
M3 - Article
C2 - 18006382
AN - SCOPUS:37549048173
VL - 32
SP - 71
EP - 78
JO - Computational Biology and Chemistry
JF - Computational Biology and Chemistry
SN - 1476-9271
IS - 1
ER -