Pronunciation similarity estimation for spoken language learning

Donghyun Kim, Dongsuk Yook

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper presents an approach for estimating the pronunciation similarity between two speakers using the cepstral distance. General speech recognition systems find the words that best match a speaker's utterance, using the acoustic score of the speech signal and the grammatical score of the word sequence. In language learning for a speaker with impaired hearing, it is not easy to estimate pronunciation similarity with automatic speech recognition systems, because the task requires more information about pronunciation characteristics than about word matching. This is a new challenge for computer-aided pronunciation learning. The dynamic time warping algorithm is used to compute the cepstral distance between two speech samples, with a codebook distance subtracted to account for the characteristics of each speaker. Experiments on the Korean fundamental vowel set show that the similarity of two speakers' pronunciation can be computed efficiently by computer.
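
The alignment step described in the abstract can be illustrated with a minimal Python sketch of dynamic time warping over sequences of cepstral vectors. It is not the authors' implementation: the function names `dtw_distance` and `pronunciation_distance`, the Euclidean local distance, the normalization by total frame count, and the treatment of the codebook distance as a single precomputed scalar offset are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature vectors.

    a has shape (n, d) and b has shape (m, d), e.g. one cepstral vector per frame.
    The Euclidean local distance and the (n + m) normalization are assumptions.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Pairwise Euclidean distance between every frame of a and every frame of b.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # acc[i, j] is the cheapest cumulative cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j],      # a advances, b frame repeated
                acc[i, j - 1],      # b advances, a frame repeated
                acc[i - 1, j - 1],  # both advance
            )
    return acc[n, m] / (n + m)


def pronunciation_distance(a, b, codebook_distance=0.0):
    """Hypothetical speaker compensation: subtract a precomputed scalar offset.

    How the codebook distance is actually obtained and applied is not stated
    in the abstract; a single scalar is assumed here.
    """
    return dtw_distance(a, b) - codebook_distance
```

For example, `dtw_distance([[0.0], [1.0], [2.0]], [[0.0], [0.0], [1.0], [2.0]])` is zero, because the warping path absorbs the repeated first frame.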

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 442-449
Number of pages: 8
Volume: 4285 LNAI
DOIs: https://doi.org/10.1007/11940098_46
Publication status: Published - 2006 Dec 1
Event: 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead, ICCPOL 2006 - Singapore, Singapore
Duration: 2006 Dec 17 to 2006 Dec 19

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4285 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead, ICCPOL 2006
Country: Singapore
City: Singapore
Period: 06/12/17 to 06/12/19

Fingerprint

Speech recognition
Audition
Dynamic Time Warping
Automatic Speech Recognition
Codebook
Speech Signal
Speech Recognition
Experiments
Estimate
Experiment
Language Acquisition
Similarity
Learning

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Kim, D., & Yook, D. (2006). Pronunciation similarity estimation for spoken language learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4285 LNAI, pp. 442-449). https://doi.org/10.1007/11940098_46

@inproceedings{1724f734221b4f0ea0644e7d15317b27,
title = "Pronunciation similarity estimation for spoken language learning",
author = "Donghyun Kim and Dongsuk Yook",
year = "2006",
month = "12",
day = "1",
doi = "10.1007/11940098_46",
language = "English",
isbn = "354049667X",
volume = "4285 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "442--449",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Pronunciation similarity estimation for spoken language learning

AU - Kim, Donghyun

AU - Yook, Dongsuk

PY - 2006/12/1

Y1 - 2006/12/1

UR - http://www.scopus.com/inward/record.url?scp=77049118749&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77049118749&partnerID=8YFLogxK

U2 - 10.1007/11940098_46

DO - 10.1007/11940098_46

M3 - Conference contribution

AN - SCOPUS:77049118749

SN - 354049667X

SN - 9783540496670

VL - 4285 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 442

EP - 449

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -