Design of audio-visual interface for aiding driver's voice commands in automotive environment

Kihyeon Kim, Changwon Jeon, Junho Park, Seokyeong Jeong, David K. Han, Hanseok Ko

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter describes the information modeling and integration of an embedded audio-visual speech recognition system aimed at improving speech recognition in the noisy environment of an automobile. In particular, we employ lip-reading as an added feature for enhanced speech recognition. Lip motion features are extracted by active shape models, and the corresponding hidden Markov models are constructed for lip-reading. To realize efficient hidden Markov models, a tied-mixture technique is introduced for both the visual and the acoustic information. It keeps the model structure simple and small while maintaining suitable recognition performance. In the decoding process, the audio-visual information is integrated into the state output probabilities of the hidden Markov model as multistream features. Each stream is weighted according to the signal-to-noise ratio so that the visual information becomes more dominant in the adverse noisy environment of an automobile. Representative experimental results demonstrate that the audio-visual speech recognition system achieves promising performance in adverse noisy conditions, making it suitable for embedded devices.
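The SNR-dependent stream weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting function, so everything specific here is an assumption: the function names `audio_weight` and `combined_log_likelihood`, the linear ramp between 0 dB and 20 dB, and the constraint that the two stream weights sum to one. The sketch only shows the general multistream idea of combining per-stream state output log-likelihoods with exponents that shift toward the visual stream as the SNR drops; it is not the chapter's actual formulation.

```python
def audio_weight(snr_db, low_db=0.0, high_db=20.0):
    """Map an SNR estimate (dB) to an audio stream weight in [0, 1].

    Hypothetical mapping: a linear ramp clipped at low_db and high_db.
    The chapter's real SNR-to-weight mapping is not stated in the abstract.
    """
    if high_db <= low_db:
        raise ValueError("high_db must be greater than low_db")
    w = (snr_db - low_db) / (high_db - low_db)
    return min(1.0, max(0.0, w))


def combined_log_likelihood(log_b_audio, log_b_visual, snr_db):
    """Combine per-stream state output log-likelihoods for one HMM state.

    Assumes the weights sum to one, so in the probability domain this is
    b_audio ** lam_a * b_visual ** lam_v.
    """
    lam_a = audio_weight(snr_db)
    lam_v = 1.0 - lam_a  # visual stream dominates as the SNR falls
    return lam_a * log_b_audio + lam_v * log_b_visual
```

Under these assumptions, a frame at 10 dB SNR weights both streams equally, while a frame at or below 0 dB relies on the visual stream alone.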

Original language: English
Title of host publication: In-Vehicle Corpus and Signal Processing for Driver Behavior
Publisher: Springer US
Pages: 211-219
Number of pages: 9
ISBN (Print): 9780387795812
DOI: https://doi.org/10.1007/978-0-387-79582-9_17
Publication status: Published - 2009

Keywords

  • Active shape model
  • Audio-visual speech interface
  • Automatic speech recognition
  • Hybrid integration
  • Lip-reading
  • Mel-frequency cepstrum coefficients
  • Mouth model
  • Multistream features
  • SNR-dependent audio-visual information combination
  • Tied-mixture hidden Markov model

ASJC Scopus subject areas

  • Engineering(all)

Cite this

    Kim, K., Jeon, C., Park, J., Jeong, S., Han, D. K., & Ko, H. (2009). Design of audio-visual interface for aiding driver's voice commands in automotive environment. In In-Vehicle Corpus and Signal Processing for Driver Behavior (pp. 211-219). Springer US. https://doi.org/10.1007/978-0-387-79582-9_17