Neural spelling correction: translating incorrect sentences to correct sentences for multimedia

Chanjun Park, Kuekyeng Kim, Yeong Wook Yang, Minho Kang, Heuiseok Lim

Research output: Contribution to journal, Article, peer-review

Abstract

The aim of a spelling correction task is to detect spelling errors and correct them automatically. In this paper we approach Korean spelling correction from a machine translation perspective, which allows it to overcome limitations of cost, time and data. Using a sequence-to-sequence model, we treat an ‘error-filled sentence’ as the source and its correct counterpart as the target, thus ‘translating’ the erroneous sentence into a correct one. We also propose three new data generation methods that allow multiple spelling correction parallel corpora to be created from a single monolingual corpus. Additionally, we find that applying the copy mechanism not only resolves the problem of overcorrection but also improves performance. We evaluate our model on two aspects: performance compared with other models, and overcorrection. The results show that the proposed model outperforms even systems currently in commercial use.
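The idea of deriving a parallel corpus from a single monolingual corpus can be illustrated with a minimal sketch. The abstract does not describe the three proposed generation methods, so the noise model below (random character deletion, duplication and swapping) is purely an assumption chosen for illustration, and the names `add_noise`, `build_parallel_corpus` and `noise_rate` are hypothetical. It also works on plain characters, whereas a Korean-specific generator would probably need to handle Hangul syllable composition.

```python
import random


def add_noise(sentence: str, rng: random.Random, noise_rate: float = 0.1) -> str:
    """Corrupt a sentence with random character-level edits.

    Hypothetical noise model, not the method from the paper.
    """
    chars = list(sentence)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        # Leave spaces alone; corrupt other characters with probability noise_rate.
        if c != " " and rng.random() < noise_rate:
            op = rng.choice(["delete", "duplicate", "swap"])
            if op == "delete":
                i += 1
                continue
            if op == "duplicate":
                out.extend([c, c])
                i += 1
                continue
            if op == "swap" and i + 1 < len(chars):
                # Emit the next character first, then the current one.
                out.extend([chars[i + 1], c])
                i += 2
                continue
        out.append(c)
        i += 1
    return "".join(out)


def build_parallel_corpus(sentences, seed=0, noise_rate=0.1):
    """Return (noisy_source, clean_target) pairs from clean sentences."""
    rng = random.Random(seed)
    return [(add_noise(s, rng, noise_rate), s) for s in sentences]


if __name__ == "__main__":
    corpus = ["spelling correction as translation", "a single monolingual corpus"]
    for source, target in build_parallel_corpus(corpus, seed=42, noise_rate=0.2):
        print(repr(source), "->", repr(target))
```

Each pair uses the corrupted sentence as the source and the untouched sentence as the target, which is the arrangement a sequence-to-sequence model would be trained on. Running the generator with different seeds or noise settings yields several distinct parallel corpora from the same clean text.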

Original language: English
Journal: Multimedia Tools and Applications
Publication status: Accepted/In press - 2020

Keywords

  • Automatic noise generation
  • Copy mechanism
  • Korean spelling correction
  • Neural machine translation
  • Overcorrection
  • Transformer

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
