Sequence tagging for biomedical extractive question answering

Wonjin Yoon, Richard Jackson, Aron Lagerberg, Jaewoo Kang

Research output: Contribution to journal > Article > peer-review


Abstract

Motivation: Current studies in extractive question answering (EQA) model the single-span extraction setting, in which a single answer span is the label to predict for a given question-passage pair. This setting is natural for general-domain EQA, as the majority of questions in the general domain can be answered with a single span. Following general-domain EQA models, current biomedical EQA (BioEQA) models use the single-span extraction setting with post-processing steps.

Results: In this article, we investigate the question distribution across the general and biomedical domains and find that biomedical questions are more likely to require list-type answers (multiple answers) than factoid-type answers (single answer). This calls for models capable of producing multiple answers for a question. Based on this preliminary study, we propose a sequence tagging approach for BioEQA, which is a multi-span extraction setting. Our approach directly handles questions with a variable number of phrases as their answer and can learn from training data to decide the number of answers for a question. In our experiments on the BioASQ 7b and 8b list-type questions, our approach outperformed the best-performing existing models without requiring post-processing steps.
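To make the multi-span setting concrete, the following is a minimal sketch of how per-token tags can be decoded into a variable number of answer phrases. It assumes a plain B/I/O tagging scheme and whitespace-joined tokens; the abstract does not state the exact scheme used in the article, and the function name `decode_bio_spans` and the toy passage are hypothetical, chosen only for illustration.

```python
from typing import List


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect every answer span marked by B/I tags (assumed scheme).

    "B" opens a new span, "I" extends the open span, and "O" (or a stray
    "I" with no open span) closes whatever span is currently open.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; flush the previous one if it is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Toy passage (illustrative only): three tagged phrases yield three answers.
tokens = ["The", "panel", "covers", "gene", "A", ",",
          "gene", "B", "and", "gene", "C", "."]
tags = ["O", "O", "O", "B", "I", "O", "B", "I", "O", "B", "I", "O"]
answers = decode_bio_spans(tokens, tags)
print(answers)
```

Because the number of returned spans follows directly from the predicted tags, no fixed answer count or thresholding step is needed at decoding time in this sketch, which is the property the abstract attributes to the sequence tagging formulation.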

Original language: English
Pages (from-to): 3794-3801
Number of pages: 8
Journal: Bioinformatics
Volume: 38
Issue number: 15
Publication status: Published - 2022 Aug 1

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
