From acoustics to vocal tract time functions

Vikramjit Mitra, I. Yücel Özbek, Hosung Nam, Xinhui Zhou, Carol Y. Espy-Wilson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

14 Citations (Scopus)

Abstract

In this paper we present a technique for obtaining Vocal Tract (VT) time functions from the acoustic speech signal. Knowledge-based Acoustic Parameters (APs) are extracted from the speech signal and a pertinent subset is used to obtain the mapping between them and the VT time functions. Eight different vocal tract constriction variables consisting of five constriction degree variables, lip aperture (LA), tongue body (TBCD), tongue tip (TTCD), velum (VEL), and glottis (GLO); and three constriction location variables, lip protrusion (LP), tongue tip (TTCL), tongue body (TBCL) were considered in this study. The TAsk Dynamics Application model (TADA [1]) is used to create a synthetic speech dataset along with its corresponding VT time functions. We explore Support Vector Regression (SVR) followed by Kalman smoothing to achieve mapping between the APs and the VT time functions.
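
The smoothing stage mentioned at the end of the abstract can be illustrated with a small sketch. The abstract does not give the Kalman model, so everything here is an assumption: a scalar random-walk state model, illustrative noise variances, and the hypothetical names `kalman_smooth`, `simulate_noisy_track`, and `rmse`. The SVR stage is not reproduced; a noisy synthetic trajectory stands in for the frame-wise SVR estimates of a single VT time function.

```python
import math
import random


def kalman_smooth(observations, process_var=1e-3, obs_var=0.05):
    """Smooth a 1-D sequence with a random-walk Kalman filter and an RTS backward pass.

    process_var and obs_var are illustrative values, not taken from the paper.
    """
    n = len(observations)
    if n == 0:
        return []
    pred_mean, pred_var = [0.0] * n, [0.0] * n
    filt_mean, filt_var = [0.0] * n, [0.0] * n

    # Forward pass. Assumed model: x[t] = x[t-1] + w, z[t] = x[t] + v.
    mean, var = observations[0], 1.0  # loose prior centred on the first estimate
    for t, z in enumerate(observations):
        if t > 0:
            var = var + process_var  # predict: mean unchanged, variance grows
        pred_mean[t], pred_var[t] = mean, var
        gain = var / (var + obs_var)  # Kalman gain
        mean = mean + gain * (z - mean)  # update with the noisy estimate
        var = (1.0 - gain) * var
        filt_mean[t], filt_var[t] = mean, var

    # Backward (Rauch-Tung-Striebel) pass refines each step using later frames.
    smooth = filt_mean[:]
    for t in range(n - 2, -1, -1):
        g = filt_var[t] / pred_var[t + 1]
        smooth[t] = filt_mean[t] + g * (smooth[t + 1] - pred_mean[t + 1])
    return smooth


def simulate_noisy_track(n, noise_std, seed=0):
    """Return a slow sinusoid and a noisy copy standing in for raw SVR output."""
    rng = random.Random(seed)
    truth = [math.sin(2.0 * math.pi * t / 200.0) for t in range(n)]
    noisy = [x + rng.gauss(0.0, noise_std) for x in truth]
    return truth, noisy


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


if __name__ == "__main__":
    truth, noisy = simulate_noisy_track(400, 0.2)
    smoothed = kalman_smooth(noisy)
    print("RMSE before smoothing:", round(rmse(noisy, truth), 4))
    print("RMSE after smoothing: ", round(rmse(smoothed, truth), 4))
```

In this sketch each VT time function would be smoothed independently. A joint model over all eight variables, or a state model with velocity terms, would be an equally plausible reading of the abstract.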

Original language: English
Title of host publication: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009
Pages: 4497-4500
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.2009.4960629
Publication status: Published - 2009 Sep 23
Externally published: Yes
Event: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009 - Taipei, Taiwan, Province of China
Duration: 2009 Apr 19 to 2009 Apr 24

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009
Country: Taiwan, Province of China
City: Taipei
Period: 09/4/19 to 09/4/24

Fingerprint

Acoustics

Keywords

  • Acoustic-to-articulatory inversion
  • Speech inversion
  • Support vector regression
  • Vocal tract time functions

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Mitra, V., Özbek, I. Y., Nam, H., Zhou, X., & Espy-Wilson, C. Y. (2009). From acoustics to vocal tract time functions. In 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009 (pp. 4497-4500). [4960629] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2009.4960629

From acoustics to vocal tract time functions. / Mitra, Vikramjit; Özbek, I. Yücel; Nam, Hosung; Zhou, Xinhui; Espy-Wilson, Carol Y.

2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009. 2009. p. 4497-4500 4960629 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

Mitra, V, Özbek, IY, Nam, H, Zhou, X & Espy-Wilson, CY 2009, From acoustics to vocal tract time functions. in 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009., 4960629, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 4497-4500, 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, Taipei, Taiwan, Province of China, 09/4/19. https://doi.org/10.1109/ICASSP.2009.4960629
Mitra V, Özbek IY, Nam H, Zhou X, Espy-Wilson CY. From acoustics to vocal tract time functions. In 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009. 2009. p. 4497-4500. 4960629. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2009.4960629
Mitra, Vikramjit ; Özbek, I. Yücel ; Nam, Hosung ; Zhou, Xinhui ; Espy-Wilson, Carol Y. / From acoustics to vocal tract time functions. 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009. 2009. pp. 4497-4500 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).
@inproceedings{83d4f3a29f0b437ea778f0fa35754440,
title = "From acoustics to vocal tract time functions",
abstract = "In this paper we present a technique for obtaining Vocal Tract (VT) time functions from the acoustic speech signal. Knowledge-based Acoustic Parameters (APs) are extracted from the speech signal and a pertinent subset is used to obtain the mapping between them and the VT time functions. Eight different vocal tract constriction variables consisting of five constriction degree variables, lip aperture (LA), tongue body (TBCD), tongue tip (TTCD), velum (VEL), and glottis (GLO); and three constriction location variables, lip protrusion (LP), tongue tip (TTCL), tongue body (TBCL) were considered in this study. The TAsk Dynamics Application model (TADA [1]) is used to create a synthetic speech dataset along with its corresponding VT time functions. We explore Support Vector Regression (SVR) followed by Kalman smoothing to achieve mapping between the APs and the VT time functions.",
keywords = "Acoustic-to-articulatory inversion, Speech inversion, Support vector regression, Vocal tract time functions",
author = "Vikramjit Mitra and {\"O}zbek, {I. Y{\"u}cel} and Hosung Nam and Xinhui Zhou and Espy-Wilson, {Carol Y.}",
year = "2009",
month = "9",
day = "23",
doi = "10.1109/ICASSP.2009.4960629",
language = "English",
isbn = "9781424423545",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
pages = "4497--4500",
booktitle = "2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009",

}

TY - GEN

T1 - From acoustics to vocal tract time functions

AU - Mitra, Vikramjit

AU - Özbek, I. Yücel

AU - Nam, Hosung

AU - Zhou, Xinhui

AU - Espy-Wilson, Carol Y.

PY - 2009/9/23

Y1 - 2009/9/23

N2 - In this paper we present a technique for obtaining Vocal Tract (VT) time functions from the acoustic speech signal. Knowledge-based Acoustic Parameters (APs) are extracted from the speech signal and a pertinent subset is used to obtain the mapping between them and the VT time functions. Eight different vocal tract constriction variables consisting of five constriction degree variables, lip aperture (LA), tongue body (TBCD), tongue tip (TTCD), velum (VEL), and glottis (GLO); and three constriction location variables, lip protrusion (LP), tongue tip (TTCL), tongue body (TBCL) were considered in this study. The TAsk Dynamics Application model (TADA [1]) is used to create a synthetic speech dataset along with its corresponding VT time functions. We explore Support Vector Regression (SVR) followed by Kalman smoothing to achieve mapping between the APs and the VT time functions.

AB - In this paper we present a technique for obtaining Vocal Tract (VT) time functions from the acoustic speech signal. Knowledge-based Acoustic Parameters (APs) are extracted from the speech signal and a pertinent subset is used to obtain the mapping between them and the VT time functions. Eight different vocal tract constriction variables consisting of five constriction degree variables, lip aperture (LA), tongue body (TBCD), tongue tip (TTCD), velum (VEL), and glottis (GLO); and three constriction location variables, lip protrusion (LP), tongue tip (TTCL), tongue body (TBCL) were considered in this study. The TAsk Dynamics Application model (TADA [1]) is used to create a synthetic speech dataset along with its corresponding VT time functions. We explore Support Vector Regression (SVR) followed by Kalman smoothing to achieve mapping between the APs and the VT time functions.

KW - Acoustic-to-articulatory inversion

KW - Speech inversion

KW - Support vector regression

KW - Vocal tract time functions

UR - http://www.scopus.com/inward/record.url?scp=70349213974&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349213974&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2009.4960629

DO - 10.1109/ICASSP.2009.4960629

M3 - Conference contribution

AN - SCOPUS:70349213974

SN - 9781424423545

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 4497

EP - 4500

BT - 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009

ER -