A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining

Donghyeon Kim, Jinhyuk Lee, Chan Ho So, Hwisang Jeon, Minbyul Jeong, Yonghwa Choi, Wonjin Yoon, Mujeen Sung, Jaewoo Kang

Research output: Contribution to journal › Article

Abstract

The biomedical literature is vast and growing quickly, and accurate text mining techniques could help researchers efficiently extract useful information from it. However, the named entity recognition models used by existing text mining tools such as tmTool and ezTag are not effective enough and cannot accurately discover new entities. Traditional text mining tools also do not consider overlapping entities, which are frequently observed in multi-type named entity recognition results. We propose BERN, a neural biomedical named entity recognition and multi-type normalization tool. BERN uses high-performance BioBERT named entity recognition models that recognize known entities and discover new ones. Probability-based decision rules are developed to identify the types of overlapping entities. Furthermore, various named entity normalization models are integrated into BERN to assign a distinct identifier to each recognized entity. BERN provides a Web service for tagging entities in PubMed articles or raw text. Researchers can use the BERN Web service for text mining tasks such as new named entity discovery, information retrieval, question answering, and relation extraction. The application programming interfaces and demonstrations of BERN are publicly available at https://bern.korea.ac.kr.
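
The abstract mentions probability-based decision rules for overlapping entities but does not spell them out. The sketch below shows one plausible rule of that kind, not the paper's actual method: when several type-specific models tag overlapping character spans, keep the mention with the highest model probability. The `Mention` class, its field names, and `resolve_overlaps` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mention:
    """A tagged span (hypothetical structure, not BERN's output format)."""
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    etype: str    # entity type, e.g. "gene", "drug", "disease"
    prob: float   # probability assigned by the type-specific model


def overlaps(a: Mention, b: Mention) -> bool:
    """Two half-open spans overlap when each starts before the other ends."""
    return a.start < b.end and b.start < a.end


def resolve_overlaps(mentions: List[Mention]) -> List[Mention]:
    """Greedy rule (an assumption): visit mentions from most to least
    probable and keep one only if it overlaps nothing already kept."""
    kept: List[Mention] = []
    for m in sorted(mentions, key=lambda m: m.prob, reverse=True):
        if not any(overlaps(m, k) for k in kept):
            kept.append(m)
    # Return the survivors in reading order.
    return sorted(kept, key=lambda m: m.start)


if __name__ == "__main__":
    candidates = [
        Mention(0, 5, "gene", 0.62),      # same span tagged by two models
        Mention(0, 5, "drug", 0.91),
        Mention(10, 18, "disease", 0.80),
    ]
    for m in resolve_overlaps(candidates):
        print(m.start, m.end, m.etype, m.prob)
```

In this toy input the span at offsets 0 to 5 is claimed by both a gene model and a drug model; the greedy rule keeps the drug reading because its probability is higher, and the non-overlapping disease mention is untouched. A real system could use thresholds or type priorities instead.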

Original language: English
Article number: 8730332
Pages (from-to): 73729-73740
Number of pages: 12
Journal: IEEE Access
Volume: 7
DOIs: https://doi.org/10.1109/ACCESS.2019.2920708
Publication status: Published - 2019 Jan 1

Fingerprint

Web services
Information retrieval
Application programming interfaces (API)
Demonstrations

Keywords

  • Biomedical text mining
  • decision rules
  • multi-type
  • named entity recognition
  • neural networks
  • normalization
  • Web service

ASJC Scopus subject areas

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)

Cite this

Kim, D, Lee, J, So, CH, Jeon, H, Jeong, M, Choi, Y, Yoon, W, Sung, M & Kang, J 2019, 'A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining', IEEE Access, vol. 7, 8730332, pp. 73729-73740. https://doi.org/10.1109/ACCESS.2019.2920708
@article{018d7f4abb08461fa0d9244c60bbb4dd,
title = "A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining",
keywords = "Biomedical text mining, decision rules, multi-type, named entity recognition, neural networks, normalization, Web service",
author = "Donghyeon Kim and Jinhyuk Lee and So, {Chan Ho} and Hwisang Jeon and Minbyul Jeong and Yonghwa Choi and Wonjin Yoon and Mujeen Sung and Jaewoo Kang",
year = "2019",
month = "1",
day = "1",
doi = "10.1109/ACCESS.2019.2920708",
language = "English",
volume = "7",
pages = "73729--73740",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining

AU - Kim, Donghyeon

AU - Lee, Jinhyuk

AU - So, Chan Ho

AU - Jeon, Hwisang

AU - Jeong, Minbyul

AU - Choi, Yonghwa

AU - Yoon, Wonjin

AU - Sung, Mujeen

AU - Kang, Jaewoo

PY - 2019/1/1

Y1 - 2019/1/1

KW - Biomedical text mining

KW - decision rules

KW - multi-type

KW - named entity recognition

KW - neural networks

KW - normalization

KW - Web service

UR - http://www.scopus.com/inward/record.url?scp=85068313056&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068313056&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2019.2920708

DO - 10.1109/ACCESS.2019.2920708

M3 - Article

AN - SCOPUS:85068313056

VL - 7

SP - 73729

EP - 73740

JO - IEEE Access

JF - IEEE Access

SN - 2169-3536

M1 - 8730332

ER -