Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment

Christian Dicanio, Hosung Nam, Douglas H. Whalen, H. Timothy Bunnell, Jonathan D. Amith, Rey Castillo García

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (hmalign and p2fa) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement performance was relatively good, with accuracy at 70.9% within 30 ms for hmalign and 65.7% within 30 ms for p2fa. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in hmalign's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones.
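
The "within 30 ms" figures above are boundary-agreement rates. As a minimal sketch of how such a rate could be computed, assuming each manually placed boundary is paired one-to-one with an aligner boundary and both are given in milliseconds (the function name `agreement_within` and the sample values are hypothetical, not taken from the paper):

```python
def agreement_within(manual_ms, automatic_ms, threshold_ms=30.0):
    """Return the percentage of paired boundaries that differ by at most threshold_ms.

    Assumes manual_ms[i] and automatic_ms[i] mark the same segment boundary.
    """
    if len(manual_ms) != len(automatic_ms):
        raise ValueError("boundary lists must have the same length")
    if not manual_ms:
        raise ValueError("no boundaries to compare")
    hits = sum(
        1 for m, a in zip(manual_ms, automatic_ms) if abs(m - a) <= threshold_ms
    )
    return 100.0 * hits / len(manual_ms)


# Invented boundary times (ms), for illustration only.
manual = [120.0, 310.0, 455.0, 690.0]
aligner = [128.0, 350.0, 470.0, 700.0]
print(agreement_within(manual, aligner))  # 75.0 (3 of 4 boundaries within 30 ms)
```

Tightening `threshold_ms` shows how sensitive the agreement rate is to the tolerance chosen; the paper itself may pair boundaries or report agreement differently.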

Original language: English
Pages (from-to): 2235-2246
Number of pages: 12
Journal: Journal of the Acoustical Society of America
Volume: 134
Issue number: 3
DOI: 10.1121/1.4816491
Publication status: Published - 2013 Sep 1
Externally published: Yes

Fingerprint

viability
alignment
phonetics
documentation
education
Endangered Languages
Alignment
Testing
Phone
Language
Segmentation

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Cite this

Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment. / Dicanio, Christian; Nam, Hosung; Whalen, Douglas H.; Bunnell, H. Timothy; Amith, Jonathan D.; García, Rey Castillo.

In: Journal of the Acoustical Society of America, Vol. 134, No. 3, 01.09.2013, p. 2235-2246.

Dicanio, Christian; Nam, Hosung; Whalen, Douglas H.; Bunnell, H. Timothy; Amith, Jonathan D.; García, Rey Castillo. / Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment. In: Journal of the Acoustical Society of America. 2013; Vol. 134, No. 3. pp. 2235-2246.
@article{052ae878c8194d63a03505d67d1ab899,
title = "Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment",
abstract = "While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (hmalign and p2fa) was assessed using corpus data from Yolox{\'o}chitl Mixtec. Overall, agreement performance was relatively good, with accuracy at 70.9{\%} within 30 ms for hmalign and 65.7{\%} within 30 ms for p2fa. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in hmalign's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones.",
author = "Christian Dicanio and Hosung Nam and Whalen, {Douglas H.} and Bunnell, {H. Timothy} and Amith, {Jonathan D.} and Garc{\'i}a, {Rey Castillo}",
year = "2013",
month = "9",
day = "1",
doi = "10.1121/1.4816491",
language = "English",
volume = "134",
pages = "2235--2246",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "3",
}

TY - JOUR

T1 - Using automatic alignment to analyze endangered language data

T2 - Testing the viability of untrained alignment

AU - Dicanio, Christian

AU - Nam, Hosung

AU - Whalen, Douglas H.

AU - Bunnell, H. Timothy

AU - Amith, Jonathan D.

AU - García, Rey Castillo

PY - 2013/9/1

Y1 - 2013/9/1

AB - While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (hmalign and p2fa) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement performance was relatively good, with accuracy at 70.9% within 30 ms for hmalign and 65.7% within 30 ms for p2fa. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in hmalign's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones.

UR - http://www.scopus.com/inward/record.url?scp=84883392022&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883392022&partnerID=8YFLogxK

U2 - 10.1121/1.4816491

DO - 10.1121/1.4816491

M3 - Article

C2 - 23967953

AN - SCOPUS:84883392022

VL - 134

SP - 2235

EP - 2246

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 3

ER -