Automatic extraction of HLA-disease interaction information from biomedical literature

JeongMin Chae, JiEun Chae, Taemin Lee, Younghee Jung, Heungbum Oh, Soon Young Jung

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Human leukocyte antigens (HLA) control a variety of functions involved in the immune response and influence susceptibility to more than 40 diseases. It is important to determine how HLA cause a disease or modify susceptibility to it or its course. In this paper, we develop an automatic procedure for extracting HLA-disease information from biomedical publications. First, HLA and diseases are recognized in the literature using built-in regular languages and the disease categories of MeSH. Second, a parse tree is generated for each sentence in PubMed using the Collins parser. Third, our own information extraction algorithm searches the parse trees and extracts relation information from the sentences. Using HaDextract, we automatically collected 10,184 sentences from 66,785 PubMed abstracts. The precision of the extracted relations was 89.6% on 144 randomly selected sentences.
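
The first step of the procedure (recognizing HLA and disease mentions) can be illustrated with a minimal sketch. The abstract does not give the actual regular languages or the MeSH disease categories used, so the pattern `HLA_PATTERN`, the term list `DISEASE_TERMS`, and all function names below are hypothetical stand-ins, not the authors' implementation. The parse-tree search of the later steps is not reproduced here; the sketch only pairs every HLA mention with every disease mention found in the same sentence.

```python
import re

# Hypothetical pattern for HLA names such as "HLA-B27" or "HLA-DRB1*04:01".
# The regular languages actually used in the paper are not given in the abstract.
HLA_PATTERN = re.compile(r"\bHLA-[A-Z]{1,3}\d*(?:\*\d{2,4}(?::\d{2,3})*)?")

# Hypothetical stand-in for a disease vocabulary derived from MeSH categories.
DISEASE_TERMS = ("ankylosing spondylitis", "rheumatoid arthritis", "celiac disease")


def find_hla(sentence):
    """Return all HLA mentions matched by the illustrative pattern."""
    return [match.group(0) for match in HLA_PATTERN.finditer(sentence)]


def find_diseases(sentence):
    """Return the vocabulary terms that occur in the sentence (case-insensitive)."""
    lowered = sentence.lower()
    return [term for term in DISEASE_TERMS if term in lowered]


def candidate_pairs(sentence):
    """Pair every HLA mention with every disease mention in the same sentence."""
    diseases = find_diseases(sentence)
    return [(hla, disease) for hla in find_hla(sentence) for disease in diseases]


if __name__ == "__main__":
    example = "HLA-B27 is strongly associated with ankylosing spondylitis."
    print(candidate_pairs(example))
```

Sentence-level co-occurrence of this kind only yields candidates; deciding whether a sentence actually asserts a relation between the two mentions is the job of the parse-tree search described in the abstract.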

Original language: English
Title of host publication: Communications in Computer and Information Science
Pages: 219-230
Number of pages: 12
Volume: 28
DOIs: https://doi.org/10.1007/978-3-642-10238-7_18
Publication status: Published - 2009 Dec 1

Publication series

Name: Communications in Computer and Information Science
Volume: 28
ISSN (Print): 1865-0929

Fingerprint

Formal languages
Trees (mathematics)

Keywords

  • Disease
  • HLA
  • Interaction information
  • Textmining

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Chae, J., Chae, J., Lee, T., Jung, Y., Oh, H., & Jung, S. Y. (2009). Automatic extraction of HLA-disease interaction information from biomedical literature. In Communications in Computer and Information Science (Vol. 28, pp. 219-230). (Communications in Computer and Information Science; Vol. 28). https://doi.org/10.1007/978-3-642-10238-7_18

@inproceedings{44542134edf94cf8a12ed2c3f3f9fc32,
title = "Automatic extraction of HLA-disease interaction information from biomedical literature",
abstract = "Human leukocyte antigens (HLA) control a variety of functions involved in the immune response and influence susceptibility to more than 40 diseases. It is important to determine how HLA cause a disease or modify susceptibility to it or its course. In this paper, we develop an automatic procedure for extracting HLA-disease information from biomedical publications. First, HLA and diseases are recognized in the literature using built-in regular languages and the disease categories of MeSH. Second, a parse tree is generated for each sentence in PubMed using the Collins parser. Third, our own information extraction algorithm searches the parse trees and extracts relation information from the sentences. Using HaDextract, we automatically collected 10,184 sentences from 66,785 PubMed abstracts. The precision of the extracted relations was 89.6{\%} on 144 randomly selected sentences.",
keywords = "Disease, HLA, Interaction information, Textmining",
author = "JeongMin Chae and JiEun Chae and Taemin Lee and Younghee Jung and Heungbum Oh and Jung, {Soon Young}",
year = "2009",
month = "12",
day = "1",
doi = "10.1007/978-3-642-10238-7_18",
language = "English",
isbn = "9783642102370",
volume = "28",
series = "Communications in Computer and Information Science",
pages = "219--230",
booktitle = "Communications in Computer and Information Science",

}
