Automatic extraction of HLA-disease interaction information from biomedical literature

Jeong Min Chae, Ji Eun Chae, Taemin Lee, Younghee Jung, Heungbum Oh, Soonyoung Jung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


The human leukocyte antigens (HLA) control a variety of functions involved in the immune response and influence susceptibility to over 40 diseases. It is important to determine how HLA cause disease or modify its susceptibility or course. In this paper, we developed an automatic HLA-disease information extraction procedure that uses biomedical publications. First, HLA and disease mentions were recognized in the literature using built-in regular languages and the disease categories of MeSH. Second, we generated parse trees for each sentence in the PubMed abstracts using the Collins parser. Third, we built our own information extraction algorithm, which searched the parse trees and extracted relation information from the sentences. Using HaDextract, we automatically collected 10,184 sentences from 66,785 PubMed abstracts. The precision of the extracted relations was 89.6% on 144 randomly selected sentences.
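The recognition step and the reported precision can be illustrated with a minimal Python sketch. It is not the authors' HaDextract implementation: the regular expression, the three-term disease lexicon (a stand-in for the MeSH disease categories), and all function names are assumptions made for illustration, and the parse-tree search over Collins parser output is not reproduced. Candidate pairs here are plain sentence-level co-occurrences.

```python
import re

# Hypothetical pattern for HLA mentions such as "HLA-B27", "HLA-DR4",
# or "HLA-DRB1*04:01". The abstract does not give the actual patterns.
HLA_PATTERN = re.compile(r"\bHLA-[A-Z]{1,3}\d*(?:\*\d{2,4}(?::\d{2,3})*)?\b")

# Tiny illustrative lexicon standing in for the MeSH disease categories.
DISEASE_TERMS = (
    "ankylosing spondylitis",
    "rheumatoid arthritis",
    "celiac disease",
)


def find_hla(sentence):
    """Return every HLA mention matched by the assumed pattern."""
    return HLA_PATTERN.findall(sentence)


def find_diseases(sentence):
    """Return lexicon terms that occur in the sentence (case-insensitive)."""
    lowered = sentence.lower()
    return [term for term in DISEASE_TERMS if term in lowered]


def candidate_pairs(sentence):
    """Pair each HLA mention with each disease mention in the same sentence.

    This is plain co-occurrence; the paper refines candidates by searching
    the sentence's parse tree, which this sketch does not attempt.
    """
    diseases = find_diseases(sentence)
    return [(hla, disease) for hla in find_hla(sentence) for disease in diseases]


def precision(correct, extracted):
    """Fraction of extracted relations judged correct."""
    return correct / extracted


if __name__ == "__main__":
    text = "HLA-B27 is strongly associated with ankylosing spondylitis."
    print(candidate_pairs(text))
    # 129 is the only whole number of correct relations out of 144 that
    # rounds to the reported 89.6%; the abstract does not state this count.
    print(round(100 * precision(129, 144), 1))
```

Sentence-level co-occurrence over-generates pairs when a sentence mentions several alleles and diseases, which is presumably why the described procedure adds a parse-tree search before a relation is accepted.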

Original language: English
Title of host publication: Advances in Computational Science and Engineering
Subtitle of host publication: Second International Conference, FGCN 2008, Workshops and Symposia, Sanya, Hainan Island, China, December 13-15, 2008. Revised Selected Papers
Editors: Tai-hoon Kim, Laurence T. Yang, Jong Hyuk Park, Alan Chin-Chen Chang, Thanos Vasilakos, Yan Zhang, Damien Sauveron, Xingang Wang, Young-Sik Jeong
Number of pages: 12
Publication status: Published - 2009

Publication series

Name: Communications in Computer and Information Science
ISSN (Print): 1865-0929

Keywords

  • Disease
  • HLA
  • Interaction information
  • Textmining

ASJC Scopus subject areas

  • Computer Science(all)
  • Mathematics(all)

