Comparing measurement errors for formants in synthetic and natural vowels

Christine H. Shadle, Hosung Nam, D. H. Whalen

Research output: Contribution to journalArticle

8 Citations (Scopus)

Abstract

The measurement of formant frequencies of vowels is among the most common measurements in speech studies, but measurements are known to be biased by the particular fundamental frequency (F0) exciting the formants. Approaches to reducing the errors were assessed in two experiments. In the first, synthetic vowels were constructed with five different first formant (F1) values and nine different F0 values; formant bandwidths, and higher formant frequencies, were constant. Input formant values were compared to manual measurements and automatic measures using the linear prediction coding-Burg algorithm, linear prediction closed-phase covariance, the weighted linear prediction-attenuated main excitation (WLP-AME) algorithm [Alku, Pohjalainen, Vainio, Laukkanen, and Story (2013). J. Acoust. Soc. Am. 134(2), 1295-1313], spectra smoothed cepstrally and by averaging repeated discrete Fourier transforms. Formants were also measured manually from pruned reassigned spectrograms (RSs) [Fulop (2011). Speech Spectrum Analysis (Springer, Berlin)]. All but WLP-AME and RS had large errors in the direction of the strongest harmonic; the smallest errors occur with WLP-AME and RS. In the second experiment, these methods were used on vowels in isolated words spoken by four speakers. Results for the natural speech show that F0 bias affects all automatic methods, including WLP-AME; only the formants measured manually from RS appeared to be accurate. In addition, RS coped better with weaker formants and glottal fry.

Original languageEnglish
Pages (from-to)713-727
Number of pages15
JournalJournal of the Acoustical Society of America
Volume139
Issue number2
DOIs
Publication statusPublished - 2016 Feb 1

Fingerprint

linear prediction
vowels
spectrograms
excitation
spectrum analysis
Measurement Error
Formants
coding
Prediction
bandwidth
harmonics

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Cite this

Comparing measurement errors for formants in synthetic and natural vowels. / Shadle, Christine H.; Nam, Hosung; Whalen, D. H.

In: Journal of the Acoustical Society of America, Vol. 139, No. 2, 01.02.2016, p. 713-727.

Research output: Contribution to journalArticle

@article{6a8eb68519074fe887c51275184e0ba5,
title = "Comparing measurement errors for formants in synthetic and natural vowels",
abstract = "The measurement of formant frequencies of vowels is among the most common measurements in speech studies, but measurements are known to be biased by the particular fundamental frequency (F0) exciting the formants. Approaches to reducing the errors were assessed in two experiments. In the first, synthetic vowels were constructed with five different first formant (F1) values and nine different F0 values; formant bandwidths, and higher formant frequencies, were constant. Input formant values were compared to manual measurements and automatic measures using the linear prediction coding-Burg algorithm, linear prediction closed-phase covariance, the weighted linear prediction-attenuated main excitation (WLP-AME) algorithm [Alku, Pohjalainen, Vainio, Laukkanen, and Story (2013). J. Acoust. Soc. Am. 134(2), 1295-1313], spectra smoothed cepstrally and by averaging repeated discrete Fourier transforms. Formants were also measured manually from pruned reassigned spectrograms (RSs) [Fulop (2011). Speech Spectrum Analysis (Springer, Berlin)]. All but WLP-AME and RS had large errors in the direction of the strongest harmonic; the smallest errors occur with WLP-AME and RS. In the second experiment, these methods were used on vowels in isolated words spoken by four speakers. Results for the natural speech show that F0 bias affects all automatic methods, including WLP-AME; only the formants measured manually from RS appeared to be accurate. In addition, RS coped better with weaker formants and glottal fry.",
author = "Shadle, {Christine H.} and Hosung Nam and Whalen, {D. H.}",
year = "2016",
month = "2",
day = "1",
doi = "10.1121/1.4940665",
language = "English",
volume = "139",
pages = "713--727",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "2",

}

TY - JOUR

T1 - Comparing measurement errors for formants in synthetic and natural vowels

AU - Shadle, Christine H.

AU - Nam, Hosung

AU - Whalen, D. H.

PY - 2016/2/1

Y1 - 2016/2/1

N2 - The measurement of formant frequencies of vowels is among the most common measurements in speech studies, but measurements are known to be biased by the particular fundamental frequency (F0) exciting the formants. Approaches to reducing the errors were assessed in two experiments. In the first, synthetic vowels were constructed with five different first formant (F1) values and nine different F0 values; formant bandwidths, and higher formant frequencies, were constant. Input formant values were compared to manual measurements and automatic measures using the linear prediction coding-Burg algorithm, linear prediction closed-phase covariance, the weighted linear prediction-attenuated main excitation (WLP-AME) algorithm [Alku, Pohjalainen, Vainio, Laukkanen, and Story (2013). J. Acoust. Soc. Am. 134(2), 1295-1313], spectra smoothed cepstrally and by averaging repeated discrete Fourier transforms. Formants were also measured manually from pruned reassigned spectrograms (RSs) [Fulop (2011). Speech Spectrum Analysis (Springer, Berlin)]. All but WLP-AME and RS had large errors in the direction of the strongest harmonic; the smallest errors occur with WLP-AME and RS. In the second experiment, these methods were used on vowels in isolated words spoken by four speakers. Results for the natural speech show that F0 bias affects all automatic methods, including WLP-AME; only the formants measured manually from RS appeared to be accurate. In addition, RS coped better with weaker formants and glottal fry.

AB - The measurement of formant frequencies of vowels is among the most common measurements in speech studies, but measurements are known to be biased by the particular fundamental frequency (F0) exciting the formants. Approaches to reducing the errors were assessed in two experiments. In the first, synthetic vowels were constructed with five different first formant (F1) values and nine different F0 values; formant bandwidths, and higher formant frequencies, were constant. Input formant values were compared to manual measurements and automatic measures using the linear prediction coding-Burg algorithm, linear prediction closed-phase covariance, the weighted linear prediction-attenuated main excitation (WLP-AME) algorithm [Alku, Pohjalainen, Vainio, Laukkanen, and Story (2013). J. Acoust. Soc. Am. 134(2), 1295-1313], spectra smoothed cepstrally and by averaging repeated discrete Fourier transforms. Formants were also measured manually from pruned reassigned spectrograms (RSs) [Fulop (2011). Speech Spectrum Analysis (Springer, Berlin)]. All but WLP-AME and RS had large errors in the direction of the strongest harmonic; the smallest errors occur with WLP-AME and RS. In the second experiment, these methods were used on vowels in isolated words spoken by four speakers. Results for the natural speech show that F0 bias affects all automatic methods, including WLP-AME; only the formants measured manually from RS appeared to be accurate. In addition, RS coped better with weaker formants and glottal fry.

UR - http://www.scopus.com/inward/record.url?scp=84958787236&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84958787236&partnerID=8YFLogxK

U2 - 10.1121/1.4940665

DO - 10.1121/1.4940665

M3 - Article

C2 - 26936555

AN - SCOPUS:84958787236

VL - 139

SP - 713

EP - 727

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 2

ER -