Can Machines Learn to Comprehend Scientific Literature?

Donghyeon Park, Yonghwa Choi, Daehan Kim, Minhwan Yu, Seongsoon Kim, Jaewoo Kang

Research output: Contribution to journal › Article

Abstract

To measure the ability of a machine to understand professional-level scientific articles, we construct a scientific question answering task called PaperQA. The PaperQA task is based on more than 80 000 'fill-in-the-blank' type questions on articles from reputable scientific journals such as Nature and Science. We perform fine-grained linguistic analysis and evaluation to compare PaperQA and other conventional question answering (QA) tasks on general literature (e.g., books, news articles, and Wikipedia texts). The results indicate that the PaperQA task is the most difficult QA task for both humans (lay people) and machines (deep-learning models). Moreover, humans generally outperform machines in conventional QA tasks, but we found that advanced deep-learning models outperform humans by 3%-13% on average in the PaperQA task. The PaperQA dataset used in this paper is publicly available at http://dmis.korea.ac.kr/downloads?id=PaperQA.
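As a rough illustration of the 'fill-in-the-blank' format described above, the Python sketch below shows one way a cloze-style question might be constructed by masking a span in a source sentence. This is a minimal sketch only: the schema, the masking rule, the helper names (ClozeQuestion, make_cloze), and the example sentence are assumptions for illustration, not details taken from the paper or the released dataset.

    # Hypothetical sketch of cloze-question construction; the actual
    # PaperQA pipeline and data format are not specified here.
    from dataclasses import dataclass

    @dataclass
    class ClozeQuestion:
        context: str   # passage the human or model reads for evidence
        question: str  # sentence with one span replaced by a blank
        answer: str    # the span that was masked out

    def make_cloze(sentence: str, answer_span: str,
                   blank: str = "_____") -> ClozeQuestion:
        """Mask the first occurrence of answer_span in sentence."""
        if answer_span not in sentence:
            raise ValueError("answer span must appear in the sentence")
        return ClozeQuestion(
            context=sentence,
            question=sentence.replace(answer_span, blank, 1),
            answer=answer_span,
        )

    # Example with an invented sentence: the task is to fill the blank.
    q = make_cloze("CRISPR-Cas9 enables precise editing of genomic DNA.",
                   "CRISPR-Cas9")
    print(q.question)  # _____ enables precise editing of genomic DNA.
    print(q.answer)    # CRISPR-Cas9

Accuracy on such items can then be measured identically for models and for lay readers, which is the human-versus-machine comparison the abstract reports.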

Original language: English
Article number: 8606080
Pages (from-to): 16246-16256
Number of pages: 11
Journal: IEEE Access
Volume: 7
DOIs: https://doi.org/10.1109/ACCESS.2019.2891666
Publication status: Published - 2019 Jan 1

Fingerprint

  • Linguistics
  • Deep learning

Keywords

  • Artificial intelligence
  • crowdsourcing
  • data acquisition
  • data analysis
  • data collection
  • data mining
  • data preprocessing
  • knowledge discovery
  • machine intelligence
  • natural language processing
  • social computing
  • text analysis
  • text mining

ASJC Scopus subject areas

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)

Cite this

Park, D., Choi, Y., Kim, D., Yu, M., Kim, S., & Kang, J. (2019). Can Machines Learn to Comprehend Scientific Literature? IEEE Access, 7, 16246-16256. [8606080]. https://doi.org/10.1109/ACCESS.2019.2891666

Can Machines Learn to Comprehend Scientific Literature? / Park, Donghyeon; Choi, Yonghwa; Kim, Daehan; Yu, Minhwan; Kim, Seongsoon; Kang, Jaewoo.

In: IEEE Access, Vol. 7, 8606080, 01.01.2019, pp. 16246-16256.

Research output: Contribution to journal › Article

Park, D, Choi, Y, Kim, D, Yu, M, Kim, S & Kang, J 2019, 'Can Machines Learn to Comprehend Scientific Literature?', IEEE Access, vol. 7, 8606080, pp. 16246-16256. https://doi.org/10.1109/ACCESS.2019.2891666
Park, Donghyeon; Choi, Yonghwa; Kim, Daehan; Yu, Minhwan; Kim, Seongsoon; Kang, Jaewoo. / Can Machines Learn to Comprehend Scientific Literature? In: IEEE Access. 2019; Vol. 7, pp. 16246-16256.
@article{781d58e4b35d41048f4278945f3a7795,
  title = "Can Machines Learn to Comprehend Scientific Literature?",
  abstract = "To measure the ability of a machine to understand professional-level scientific articles, we construct a scientific question answering task called PaperQA. The PaperQA task is based on more than 80 000 'fill-in-the-blank' type questions on articles from reputable scientific journals such as Nature and Science. We perform fine-grained linguistic analysis and evaluation to compare PaperQA and other conventional question answering (QA) tasks on general literature (e.g., books, news articles, and Wikipedia texts). The results indicate that the PaperQA task is the most difficult QA task for both humans (lay people) and machines (deep-learning models). Moreover, humans generally outperform machines in conventional QA tasks, but we found that advanced deep-learning models outperform humans by 3{\%}-13{\%} on average in the PaperQA task. The PaperQA dataset used in this paper is publicly available at http://dmis.korea.ac.kr/downloads?id=PaperQA.",
  keywords = "Artificial intelligence, crowdsourcing, data acquisition, data analysis, data collection, data mining, data preprocessing, knowledge discovery, machine intelligence, natural language processing, social computing, text analysis, text mining",
  author = "Donghyeon Park and Yonghwa Choi and Daehan Kim and Minhwan Yu and Seongsoon Kim and Jaewoo Kang",
  year = "2019",
  month = "1",
  day = "1",
  doi = "10.1109/ACCESS.2019.2891666",
  language = "English",
  volume = "7",
  pages = "16246--16256",
  journal = "IEEE Access",
  issn = "2169-3536",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY  - JOUR
T1  - Can Machines Learn to Comprehend Scientific Literature?
AU  - Park, Donghyeon
AU  - Choi, Yonghwa
AU  - Kim, Daehan
AU  - Yu, Minhwan
AU  - Kim, Seongsoon
AU  - Kang, Jaewoo
PY  - 2019/1/1
Y1  - 2019/1/1
N2  - To measure the ability of a machine to understand professional-level scientific articles, we construct a scientific question answering task called PaperQA. The PaperQA task is based on more than 80 000 'fill-in-the-blank' type questions on articles from reputable scientific journals such as Nature and Science. We perform fine-grained linguistic analysis and evaluation to compare PaperQA and other conventional question answering (QA) tasks on general literature (e.g., books, news articles, and Wikipedia texts). The results indicate that the PaperQA task is the most difficult QA task for both humans (lay people) and machines (deep-learning models). Moreover, humans generally outperform machines in conventional QA tasks, but we found that advanced deep-learning models outperform humans by 3%-13% on average in the PaperQA task. The PaperQA dataset used in this paper is publicly available at http://dmis.korea.ac.kr/downloads?id=PaperQA.
AB  - To measure the ability of a machine to understand professional-level scientific articles, we construct a scientific question answering task called PaperQA. The PaperQA task is based on more than 80 000 'fill-in-the-blank' type questions on articles from reputable scientific journals such as Nature and Science. We perform fine-grained linguistic analysis and evaluation to compare PaperQA and other conventional question answering (QA) tasks on general literature (e.g., books, news articles, and Wikipedia texts). The results indicate that the PaperQA task is the most difficult QA task for both humans (lay people) and machines (deep-learning models). Moreover, humans generally outperform machines in conventional QA tasks, but we found that advanced deep-learning models outperform humans by 3%-13% on average in the PaperQA task. The PaperQA dataset used in this paper is publicly available at http://dmis.korea.ac.kr/downloads?id=PaperQA.
KW  - Artificial intelligence
KW  - crowdsourcing
KW  - data acquisition
KW  - data analysis
KW  - data collection
KW  - data mining
KW  - data preprocessing
KW  - knowledge discovery
KW  - machine intelligence
KW  - natural language processing
KW  - social computing
KW  - text analysis
KW  - text mining
UR  - http://www.scopus.com/inward/record.url?scp=85061800643&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85061800643&partnerID=8YFLogxK
U2  - 10.1109/ACCESS.2019.2891666
DO  - 10.1109/ACCESS.2019.2891666
M3  - Article
AN  - SCOPUS:85061800643
VL  - 7
SP  - 16246
EP  - 16256
JO  - IEEE Access
JF  - IEEE Access
SN  - 2169-3536
M1  - 8606080
ER  -