Achieving real-time lip-synch via SVM-based phoneme classification and lip shape refinement

Taeyoon Kim, Yongsung Kang, Hanseok Ko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

In this paper, we develop a real-time lip-synch system that drives a 2D avatar's lip motion in synch with incoming speech. To keep the system real-time, we bound the processing time with a merge-and-split procedure that performs coarse-to-fine phoneme classification. At each stage of phoneme classification, we apply a support vector machine (SVM) to limit the computational load while attaining the desired accuracy. The coarse-to-fine classification uses two stages of feature extraction: each speech frame is first classified into one of 3 lip-opening classes using MFCC features, and is then refined into a detailed lip shape using formant information. We implemented the system with 2D lip animation, which demonstrates the effectiveness of the proposed 2-stage procedure for the real-time lip-synch task.
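
The two-stage procedure described in the abstract can be sketched roughly as follows. This is only an illustration of the cascade structure, not the authors' implementation: the class names, the label names ("closed", "half", "open", "wide", "round"), and the toy feature values are all hypothetical, and a simple nearest-centroid classifier stands in for the SVMs so that the sketch needs only the Python standard library.

```python
from math import dist


class NearestCentroid:
    """Stand-in classifier (NOT an SVM): predicts the label of the closest class mean."""

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(x)
        # Mean of each feature column, per class.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda label: dist(x, self.centroids[label]))


class TwoStageLipClassifier:
    """Coarse stage on MFCC features, optional fine stage on formants."""

    def __init__(self, coarse, fine_by_class):
        self.coarse = coarse                # maps MFCC vector -> lip-opening class
        self.fine_by_class = fine_by_class  # opening class -> formant classifier

    def predict(self, mfcc, formants):
        opening = self.coarse.predict(mfcc)
        fine = self.fine_by_class.get(opening)
        # Classes without a fine-stage model keep the coarse label as the shape.
        shape = fine.predict(formants) if fine else opening
        return opening, shape


# Toy, made-up training data: 2-D "MFCC" vectors and (F1, F2) formants in Hz.
coarse = NearestCentroid().fit(
    [[0.0, 0.1], [0.1, 0.0], [1.0, 1.1], [1.1, 1.0], [2.0, 2.1], [2.1, 2.0]],
    ["closed", "closed", "half", "half", "open", "open"],
)
fine_open = NearestCentroid().fit(
    [[700, 1200], [750, 1100], [500, 900], [550, 850]],
    ["wide", "wide", "round", "round"],
)
model = TwoStageLipClassifier(coarse, {"open": fine_open})

print(model.predict([1.9, 2.0], [720, 1180]))  # ('open', 'wide')
print(model.predict([0.0, 0.0], [300, 2000]))  # ('closed', 'closed')
```

Because the fine stage runs only for the coarse class that was selected, each frame is scored by at most two small classifiers, which is one plausible reading of how the coarse-to-fine design bounds per-frame processing time.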

Original language: English
Title of host publication: Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 299-304
Number of pages: 6
ISBN (Print): 0769518346, 9780769518343
DOIs: https://doi.org/10.1109/ICMI.2002.1167010
Publication status: Published - 2002
Event: 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002 - Pittsburgh, United States
Duration: 2002 Oct 14 to 2002 Oct 16

Other

Other: 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002
Country: United States
City: Pittsburgh
Period: 02/10/14 to 02/10/16

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture

Cite this

Kim, T., Kang, Y., & Ko, H. (2002). Achieving real-time lip-synch via SVM-based phoneme classification and lip shape refinement. In Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002 (pp. 299-304). [1167010] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICMI.2002.1167010