TY - JOUR
T1 - KU-DMIS at BioASQ 9
T2 - 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021
AU - Yoon, Wonjin
AU - Yoo, Jaehyo
AU - Seo, Sumin
AU - Sung, Mujeen
AU - Jeong, Minbyul
AU - Kim, Gangwoo
AU - Kang, Jaewoo
N1 - Funding Information:
We express gratitude towards Dr. Jihye Kim and Dr. Sungjoon Park from Korea University for their invaluable insight into our systems’ output. This research is supported by the National Research Foundation of Korea (NRF-2020R1A2C3010638) and a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HR20C0021).
Publisher Copyright:
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2021
Y1 - 2021
AB - In this paper, we present our approaches for the 9th BioASQ challenge (Task b - Phase B). Our systems are based on transformer models and combine model-centric and data-centric approaches. For factoid-type questions, we modify the dataset to increase label consistency, and for list-type questions, we apply a sequence tagging model, which is a more natural model design for the multi-label task. Our experimental results suggest two main points: better model design can be achieved by reflecting data characteristics such as the number of labels for a data point; and scarce resources such as BioQA datasets can greatly benefit from a data-centric approach with relatively little effort. Our submissions achieve competitive results, with top or near-top performance in the challenge.
KW - BioASQ
KW - Biomedical natural language processing
KW - Biomedical question answering
KW - BioNLP
UR - http://www.scopus.com/inward/record.url?scp=85113453370&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85113453370
VL - 2936
SP - 351
EP - 359
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
SN - 1613-0073
Y2 - 21 September 2021 through 24 September 2021
ER -