BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text

Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Hyeonseok Moon, Sugyeong Eo, Heuiseok Lim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

With the growing popularity of smart speakers, such as Amazon Alexa, speech is becoming one of the most important modes of human-computer interaction. Automatic speech recognition (ASR) is arguably the most critical component of such systems, as errors in speech recognition propagate to the downstream components and drastically degrade the user experience. A simple and effective way to improve speech recognition accuracy is to apply an automatic post-processor to the recognition result. However, training a post-processor requires parallel corpora created by human annotators, which are expensive and do not scale. To alleviate this problem, we propose Back TranScription (BTS), a denoising-based method that can create such corpora without human labor. Starting from a raw text corpus, BTS corrupts the text by passing it through Text-to-Speech (TTS) and then Speech-to-Text (STT) systems. A post-processing model can then be trained to reconstruct the original text given the corrupted input. Quantitative and qualitative evaluations show that a post-processor trained using our approach is highly effective in fixing non-trivial speech recognition errors such as mishandling foreign words. We present the generated parallel corpus and post-processing platform to make our results publicly available.
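The corpus-construction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `build_bts_corpus`, `toy_tts`, and `toy_stt` are hypothetical names, and the toy functions only imitate recognition noise (lost casing and punctuation) so that the example runs without any real TTS or STT service.

```python
from typing import Callable, Iterable, List, Tuple


def build_bts_corpus(
    sentences: Iterable[str],
    tts: Callable[[str], bytes],
    stt: Callable[[bytes], str],
) -> List[Tuple[str, str]]:
    """Build (corrupted, original) pairs by round-tripping text through TTS and STT.

    `tts` and `stt` are caller-supplied callables (an assumption of this sketch);
    in practice they would wrap real speech synthesis and recognition systems.
    """
    pairs = []
    for original in sentences:
        audio = tts(original)        # text -> synthetic speech
        corrupted = stt(audio)       # speech -> possibly noisy transcript
        pairs.append((corrupted, original))  # source = noisy, target = clean
    return pairs


def toy_tts(text: str) -> bytes:
    # Hypothetical stand-in: "audio" is just the UTF-8 bytes of the text.
    return text.encode("utf-8")


def toy_stt(audio: bytes) -> str:
    # Hypothetical stand-in: imitates typical ASR output by lowercasing
    # the text and dropping punctuation.
    text = audio.decode("utf-8").lower()
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())


corpus = build_bts_corpus(["Hello, Alexa!"], toy_tts, toy_stt)
print(corpus)
```

Each resulting pair places the corrupted transcript on the source side and the original sentence on the target side, which is the shape a sequence-to-sequence post-processor would need for training.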

Original language: English
Title of host publication: WAT 2021 - 8th Workshop on Asian Translation, Proceedings of the Workshop
Editors: Toshiaki Nakazawa, Hideki Nakayama, Isao Goto, Hideya Mino, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Shohei Higashiyama, Hiroshi Manabe, Win Pa Pa, Shantipriya Parida, Ondrej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
Publisher: Association for Computational Linguistics (ACL)
Pages: 106-116
Number of pages: 11
ISBN (Electronic): 9781954085633
Publication status: Published - 2021
Event: 8th Workshop on Asian Translation, WAT 2021 - Virtual, Bangkok, Thailand
Duration: 2021 Aug 5 to 2021 Aug 6

Publication series

Name: WAT 2021 - 8th Workshop on Asian Translation, Proceedings of the Workshop

Conference

Conference: 8th Workshop on Asian Translation, WAT 2021
Country/Territory: Thailand
City: Virtual, Bangkok
Period: 21/8/5 to 21/8/6

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software
