K-NCT: Korean Neural Grammatical Error Correction Gold-Standard Test Set Using Novel Error Type Classification Criteria

Seonmin Koo, Chanjun Park, Jaehyung Seo, Seungjun Lee, Hyeonseok Moon, Jungseob Lee, Heuiseok Lim

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, active research has been conducted on Korean grammatical error correction based on machine translation (MT) and automatic noise generation. However, there is no gold-standard test set for objective and official comparative analysis. A significant limitation is that measured performance is ill-defined, because the error types used in the training set are also included in the test set. Additionally, the error types used for qualitative analysis are defined inconsistently, with no explicit guidelines. This study proposes a gold-standard test set for Korean grammatical error correction, called the Korean Neural Grammatical Correction Test set (K-NCT), built on a new error type classification guideline. To ensure the factuality and reliability of the proposal, we conduct a quantitative analysis using a commercialized system and a human evaluation. Experimental results demonstrate that the proposed test set is well balanced and diverse, and that its guideline is precise. Our dataset is available at https://github.com/seonminkoo/K-NCT

Original language: English
Pages (from-to): 118167-118175
Number of pages: 9
Journal: IEEE Access
Volume: 10
Publication status: Published - 2022

Keywords

  • error standard
  • gold test set
  • human evaluation
  • Korean grammar correction

ASJC Scopus subject areas

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)
  • Electrical and Electronic Engineering
