An efficient and effective ensemble of support vector machines for anti-diabetic drug failure prediction

Seokho Kang, Pilsung Kang, Taehoon Ko, Sungzoon Cho, Su Jin Rhee, Kyung Sang Yu

Research output: Contribution to journal (Article)

25 Citations (Scopus)

Abstract

The treatment of patients with type 2 diabetes is mostly based on drug therapies, aiming at managing glucose levels appropriately. As the number of patients with type 2 diabetes continually increases worldwide, predicting drug treatment failure becomes an important issue. Support vector machine (SVM) can be a good method for the anti-diabetic drug failure prediction problem; however, it is difficult to train SVM on large-scale medical datasets directly because of its high training time complexity O(N^3). To address the limitation, we propose an efficient and effective ensemble of SVMs, called E3-SVM. The proposed method excludes superfluous data points when constructing an SVM ensemble, thereby yielding a better classification performance. The proposed method consists of two phases. The first phase is to select the data points that are likely to be the support vectors by applying data selection methods. The second phase is to construct an SVM ensemble using the selected data points. We demonstrated the efficiency and effectiveness of the proposed method using the real-world dataset of the anti-diabetic drug failure prediction problem for type 2 diabetes. Experimental results show that the proposed method requires less training time to achieve comparable success, compared to the conventional SVM ensembles. Moreover, the proposed method obtains more reliable prediction results for each independent run of constructing an ensemble. In conclusion, firstly, the proposed method provides an efficient and effective way to use SVM for large-scale datasets. Secondly, we confirmed the suitability of SVM for the anti-diabetic drug failure prediction problem with an accuracy of about 80%.
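The two phases described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' E3-SVM implementation: the abstract does not name the data selection method, the kernel, or the ensemble scheme, so everything here is an assumption made for illustration. Phase 1 uses a simple neighbour-based heuristic (keep a point if any of its k nearest neighbours has the other label), the ensemble members are linear SVMs trained with a Pegasos-style stochastic subgradient method on bootstrap samples of the kept points, and the data are synthetic two-dimensional Gaussians. All function names and parameters are hypothetical.

```python
import random

def select_boundary_points(X, y, k=5):
    """Phase 1 (assumed heuristic, not from the paper): keep the indices of
    points whose k nearest neighbours include a point of the other class."""
    selected = []
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X) if j != i
        )
        if any(y[j] != y[i] for _, j in dists[:k]):
            selected.append(i)
    return selected

def train_linear_svm(X, y, lam=0.01, epochs=30, rng=random):
    """Pegasos-style stochastic subgradient descent on the hinge loss.
    A constant 1.0 is appended to each input so the last weight is a bias."""
    Xb = [list(x) + [1.0] for x in X]
    w = [0.0] * len(Xb[0])
    order = list(range(len(Xb)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, Xb[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrink
            if margin < 1.0:  # point violates the margin: hinge subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xb[i])]
    return w

def predict_one(w, x):
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1

def train_ensemble(X, y, n_members=5, k=5, seed=0):
    """Phase 2: train each member on a bootstrap sample of the selected points."""
    rng = random.Random(seed)
    # Fall back to all points if the heuristic selects nothing.
    selected = select_boundary_points(X, y, k) or list(range(len(X)))
    members = []
    for _ in range(n_members):
        sample = [rng.choice(selected) for _ in selected]
        members.append(train_linear_svm([X[i] for i in sample],
                                        [y[i] for i in sample], rng=rng))
    return members, selected

def predict(members, x):
    """Majority vote over the ensemble (use an odd number of members)."""
    votes = sum(predict_one(w, x) for w in members)
    return 1 if votes >= 0 else -1

# Synthetic stand-in data: label +1 could mean "drug failure", -1 "success".
data_rng = random.Random(42)
X, y = [], []
for label in (-1, 1):
    for _ in range(100):
        X.append((data_rng.gauss(label, 1.0), data_rng.gauss(label, 1.0)))
        y.append(label)

members, selected = train_ensemble(X, y)
accuracy = sum(predict(members, x) == t for x, t in zip(X, y)) / len(X)
print(len(selected), "of", len(X), "points kept; training accuracy", accuracy)
```

The point of the sketch is the cost argument in the abstract: each ensemble member sees only the selected subset, so with a training cost that grows roughly as O(N^3), shrinking the training set before building the ensemble reduces the work per member considerably.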

Original language: English
Pages (from-to): 4265-4273
Number of pages: 9
Journal: Expert Systems with Applications
Volume: 42
Issue number: 9
DOI: 10.1016/j.eswa.2015.01.042
Publication status: Published - 2015 Jun 1

Keywords

  • Data selection
  • Drug failure prediction
  • Ensemble
  • Support vector machines
  • Type 2 diabetes

ASJC Scopus subject areas

  • Engineering (all)
  • Computer Science Applications
  • Artificial Intelligence

Cite this

An efficient and effective ensemble of support vector machines for anti-diabetic drug failure prediction. / Kang, Seokho; Kang, Pilsung; Ko, Taehoon; Cho, Sungzoon; Rhee, Su Jin; Yu, Kyung Sang.

In: Expert Systems with Applications, Vol. 42, No. 9, 01.06.2015, p. 4265-4273.

@article{cb43f19010584e42837b5fa796a63d1c,
title = "An efficient and effective ensemble of support vector machines for anti-diabetic drug failure prediction",
keywords = "Data selection, Drug failure prediction, Ensemble, Support vector machines, Type 2 diabetes",
author = "Seokho Kang and Pilsung Kang and Taehoon Ko and Sungzoon Cho and Rhee, {Su Jin} and Yu, {Kyung Sang}",
year = "2015",
month = "6",
day = "1",
doi = "10.1016/j.eswa.2015.01.042",
language = "English",
volume = "42",
pages = "4265--4273",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Limited",
number = "9",

}

TY - JOUR
T1 - An efficient and effective ensemble of support vector machines for anti-diabetic drug failure prediction
AU - Kang, Seokho
AU - Kang, Pilsung
AU - Ko, Taehoon
AU - Cho, Sungzoon
AU - Rhee, Su Jin
AU - Yu, Kyung Sang
PY - 2015/6/1
Y1 - 2015/6/1
KW - Data selection
KW - Drug failure prediction
KW - Ensemble
KW - Support vector machines
KW - Type 2 diabetes
UR - http://www.scopus.com/inward/record.url?scp=84923233733&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84923233733&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2015.01.042
DO - 10.1016/j.eswa.2015.01.042
M3 - Article
AN - SCOPUS:84923233733
VL - 42
SP - 4265
EP - 4273
JO - Expert Systems with Applications
JF - Expert Systems with Applications
SN - 0957-4174
IS - 9
ER -