TY - GEN
T1 - BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text
T2 - 8th Workshop on Asian Translation, WAT 2021
AU - Park, Chanjun
AU - Seo, Jaehyung
AU - Lee, Seolhwa
AU - Lee, Chanhee
AU - Moon, Hyeonseok
AU - Eo, Sugyeong
AU - Lim, Heuiseok
N1 - Funding Information:
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation); by an IITP grant funded by the Korean government (MSIT) (No. 2020-0-00368, A Neural-Symbolic Model for Knowledge Acquisition and Inference Techniques); and by the MSIT, Korea, under the ICT Creative Consilience program (IITP-2021-2020-0-01819) supervised by the IITP.
Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
N2 - With the growing popularity of smart speakers, such as Amazon Alexa, speech is becoming one of the most important modes of human-computer interaction. Automatic speech recognition (ASR) is arguably the most critical component of such systems, as errors in speech recognition propagate to the downstream components and drastically degrade the user experience. A simple and effective way to improve speech recognition accuracy is to apply an automatic post-processor to the recognition result. However, training a post-processor requires parallel corpora created by human annotators, which are expensive and not scalable. To alleviate this problem, we propose Back TranScription (BTS), a denoising-based method that can create such corpora without human labor. Using a raw corpus, BTS corrupts the text using Text-to-Speech (TTS) and Speech-to-Text (STT) systems. Then, a post-processing model can be trained to reconstruct the original text given the corrupted input. Quantitative and qualitative evaluations show that a post-processor trained using our approach is highly effective in fixing non-trivial speech recognition errors such as mishandling foreign words. We present the generated parallel corpus and post-processing platform to make our results publicly available.
AB - With the growing popularity of smart speakers, such as Amazon Alexa, speech is becoming one of the most important modes of human-computer interaction. Automatic speech recognition (ASR) is arguably the most critical component of such systems, as errors in speech recognition propagate to the downstream components and drastically degrade the user experience. A simple and effective way to improve speech recognition accuracy is to apply an automatic post-processor to the recognition result. However, training a post-processor requires parallel corpora created by human annotators, which are expensive and not scalable. To alleviate this problem, we propose Back TranScription (BTS), a denoising-based method that can create such corpora without human labor. Using a raw corpus, BTS corrupts the text using Text-to-Speech (TTS) and Speech-to-Text (STT) systems. Then, a post-processing model can be trained to reconstruct the original text given the corrupted input. Quantitative and qualitative evaluations show that a post-processor trained using our approach is highly effective in fixing non-trivial speech recognition errors such as mishandling foreign words. We present the generated parallel corpus and post-processing platform to make our results publicly available.
UR - http://www.scopus.com/inward/record.url?scp=85115210038&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85115210038
T3 - WAT 2021 - 8th Workshop on Asian Translation, Proceedings of the Workshop
SP - 106
EP - 116
BT - WAT 2021 - 8th Workshop on Asian Translation, Proceedings of the Workshop
A2 - Nakazawa, Toshiaki
A2 - Nakayama, Hideki
A2 - Goto, Isao
A2 - Mino, Hideya
A2 - Ding, Chenchen
A2 - Dabre, Raj
A2 - Kunchukuttan, Anoop
A2 - Higashiyama, Shohei
A2 - Manabe, Hiroshi
A2 - Pa, Win Pa
A2 - Parida, Shantipriya
A2 - Bojar, Ondřej
A2 - Chu, Chenhui
A2 - Eriguchi, Akiko
A2 - Abe, Kaori
A2 - Oda, Yusuke
A2 - Sudoh, Katsuhito
A2 - Kurohashi, Sadao
A2 - Bhattacharyya, Pushpak
PB - Association for Computational Linguistics (ACL)
Y2 - 5 August 2021 through 6 August 2021
ER -