High precision rule based ppi extraction and per-pair basis performance evaluation

Junkyu Lee, Seongsoon Kim, Sunwon Lee, Kyubum Lee, Jaewoo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Virtually all current protein-protein interaction (PPI) extraction studies focus on improving F-score, aiming to balance precision and recall. However, in many realistic scenarios involving large corpora, one can benefit more from an extremely high-precision PPI extraction tool than from a high-recall counterpart. We also argue that the current per-instance basis performance evaluation method should be revisited. To address these problems, we introduce a new rule-based PPI extraction method equipped with a set of ultra-high-precision extraction rules. We also propose a new per-pair basis performance metric, which is more pragmatic. The proposed PPI extraction method achieves 95-96% per-pair and 94-97% per-instance precision on the AIMed benchmark corpus.
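The difference between the two evaluation bases can be illustrated with a minimal sketch. The abstract does not define the per-pair metric, so the code below assumes that per-instance precision scores every (sentence, protein pair) mention separately, while per-pair precision collapses mentions into unique unordered protein pairs before scoring. The function names, the tuple layout, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical layout: (sentence_id, protein_a, protein_b)
Mention = Tuple[str, str, str]


def _instances(mentions: Iterable[Mention]) -> Set[Tuple[str, FrozenSet[str]]]:
    """Normalise mentions so that (A, B) and (B, A) in one sentence are equal."""
    return {(sid, frozenset((a, b))) for sid, a, b in mentions}


def _pairs(mentions: Iterable[Mention]) -> Set[FrozenSet[str]]:
    """Collapse mentions into unique unordered protein pairs (assumed per-pair basis)."""
    return {frozenset((a, b)) for _, a, b in mentions}


def _precision(predicted: set, gold: set) -> float:
    """Fraction of predicted items that also appear in the gold set."""
    return len(predicted & gold) / len(predicted) if predicted else 0.0


def per_instance_precision(predicted: Iterable[Mention], gold: Iterable[Mention]) -> float:
    return _precision(_instances(predicted), _instances(gold))


def per_pair_precision(predicted: Iterable[Mention], gold: Iterable[Mention]) -> float:
    return _precision(_pairs(predicted), _pairs(gold))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    gold = [("s1", "A", "B"), ("s2", "A", "B"), ("s3", "C", "D")]
    predicted = [("s1", "A", "B"), ("s4", "B", "A"), ("s5", "E", "F")]
    print(per_instance_precision(predicted, gold))  # 1 of 3 mentions is correct
    print(per_pair_precision(predicted, gold))      # 1 of 2 unique pairs is correct
```

Under this assumed reading, a pair extracted from the wrong sentence counts as an error per instance but not per pair, which is one way the two figures reported above can diverge.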

Original language: English
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Pages: 69-76
Number of pages: 8
DOIs: https://doi.org/10.1145/2390068.2390082
Publication status: Published - 2012 Dec 10
Event: 6th ACM International Workshop on Data and Text Mining in Biomedical Informatics, DTMBIO 2012, in Conjunction with the 21st ACM International Conference on Information and Knowledge Management, CIKM 2012 - Maui, HI, United States
Duration: 2012 Oct 29 → 2012 Oct 29

Fingerprint

Performance evaluation
Rule-based
Benchmark
Evaluation method
Scenarios
Performance metrics

Keywords

  • Biomedical Text Mining
  • Entity Relation Extraction
  • Interaction Extraction
  • PPI
  • Text Mining

ASJC Scopus subject areas

  • Business, Management and Accounting (all)
  • Decision Sciences (all)

Cite this

Lee, J., Kim, S., Lee, S., Lee, K., & Kang, J. (2012). High precision rule based ppi extraction and per-pair basis performance evaluation. In International Conference on Information and Knowledge Management, Proceedings (pp. 69-76) https://doi.org/10.1145/2390068.2390082

@inproceedings{23c6199f6e854ed2a7a53428e78d57a9,
title = "High precision rule based ppi extraction and per-pair basis performance evaluation",
abstract = "Virtually all current protein-protein interaction (PPI) extraction studies focus on improving F-score, aiming to balance precision and recall. However, in many realistic scenarios involving large corpora, one can benefit more from an extremely high-precision PPI extraction tool than from a high-recall counterpart. We also argue that the current per-instance basis performance evaluation method should be revisited. To address these problems, we introduce a new rule-based PPI extraction method equipped with a set of ultra-high-precision extraction rules. We also propose a new per-pair basis performance metric, which is more pragmatic. The proposed PPI extraction method achieves 95-96{\%} per-pair and 94-97{\%} per-instance precision on the AIMed benchmark corpus.",
keywords = "Biomedical Text Mining, Entity Relation Extraction, Interaction Extraction, PPI, Text Mining",
author = "Junkyu Lee and Seongsoon Kim and Sunwon Lee and Kyubum Lee and Jaewoo Kang",
year = "2012",
month = "12",
day = "10",
doi = "10.1145/2390068.2390082",
language = "English",
isbn = "9781450317160",
pages = "69--76",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",

}

TY - GEN
T1 - High precision rule based ppi extraction and per-pair basis performance evaluation
AU - Lee, Junkyu
AU - Kim, Seongsoon
AU - Lee, Sunwon
AU - Lee, Kyubum
AU - Kang, Jaewoo
PY - 2012/12/10
Y1 - 2012/12/10
AB - Virtually all current protein-protein interaction (PPI) extraction studies focus on improving F-score, aiming to balance precision and recall. However, in many realistic scenarios involving large corpora, one can benefit more from an extremely high-precision PPI extraction tool than from a high-recall counterpart. We also argue that the current per-instance basis performance evaluation method should be revisited. To address these problems, we introduce a new rule-based PPI extraction method equipped with a set of ultra-high-precision extraction rules. We also propose a new per-pair basis performance metric, which is more pragmatic. The proposed PPI extraction method achieves 95-96% per-pair and 94-97% per-instance precision on the AIMed benchmark corpus.
KW - Biomedical Text Mining
KW - Entity Relation Extraction
KW - Interaction Extraction
KW - PPI
KW - Text Mining
UR - http://www.scopus.com/inward/record.url?scp=84870552663&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870552663&partnerID=8YFLogxK
U2 - 10.1145/2390068.2390082
DO - 10.1145/2390068.2390082
M3 - Conference contribution
AN - SCOPUS:84870552663
SN - 9781450317160
SP - 69
EP - 76
BT - International Conference on Information and Knowledge Management, Proceedings
ER -