Effective acoustic model clustering via decision-tree with supervised learning

Junho Park, Hanseok Ko

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

In large vocabulary speech recognition, context-dependent modeling is essential for improving both accuracy and speed. To cope with the sparse data problem that arises from the proliferation of context-dependent models, two kinds of clustering methods, data-driven and rule-based, have been extensively investigated. The inherent difficulty of applying data-driven approaches to unknown contexts has motivated the development of better rule-based clustering methods. This paper develops a hybrid approach that constructs a supervised decision rule operating on pre-clustered triphones. The scheme employs the C4.5 decision-tree learning algorithm to extract the attributes that best support clustering of the training data. Specifically, the data-driven method serves as the clustering algorithm, and its result serves as the learning target of the C4.5 algorithm. The proposed scheme provides an effective solution to the clustering error problem arising from unsupervised decision-tree learning and also enables successful clustering of multiple-mixture Gaussian state distributions. In speaker-independent, task-independent continuous speech recognition, the proposed method achieved a relative word error rate (WER) reduction of 3.93%.
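The two-stage idea in the abstract (cluster triphones from the data first, then learn a decision tree that predicts the resulting cluster labels from context attributes) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy triphones, the one-dimensional "acoustic mean" standing in for Gaussian mixture state distributions, the phonetic class names, and all function names are assumptions made for illustration. Only the gain-ratio split criterion of C4.5 is reproduced here; pruning and continuous attributes are omitted.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (left-context class, right-context class, acoustic mean).
# Real systems compare Gaussian mixture state distributions, not scalars.
TRIPHONES = [
    ("vowel", "nasal", 0.10), ("vowel", "stop", 0.15), ("vowel", "nasal", 0.12),
    ("nasal", "stop", 0.90), ("nasal", "vowel", 0.95),
    ("stop", "vowel", 0.50), ("stop", "nasal", 0.55), ("stop", "stop", 0.52),
]

def agglomerative(values, k):
    """Stage 1 (data-driven): merge the two closest centroids until k clusters remain."""
    clusters = [[i] for i in range(len(values))]
    centroid = lambda c: sum(values[i] for i in c) / len(c)
    while len(clusters) > k:
        pairs = ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: abs(centroid(clusters[p[0]]) - centroid(clusters[p[1]])))
        merged = clusters.pop(b)          # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(values)
    for cid, members in enumerate(clusters):
        for i in members:
            labels[i] = cid
    return labels

def entropy(items):
    n = len(items)
    return -sum(c / n * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(rows, labels, attr):
    """C4.5-style criterion: information gain divided by split information."""
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[row[attr]].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    split_info = entropy([row[attr] for row in rows])
    return (entropy(labels) - remainder) / split_info if split_info > 0 else 0.0

def build_tree(rows, labels, attrs):
    """Stage 2 (supervised): learn context questions that predict the cluster label."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: gain_ratio(rows, labels, a))
    if gain_ratio(rows, labels, best) <= 0:
        return majority
    node = {"attr": best, "default": majority, "children": {}}
    rest = [a for a in attrs if a != best]
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node["children"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rest)
    return node

def classify(tree, row):
    """Walk the tree; fall back to the node's majority cluster for unknown values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

rows = [(left, right) for left, right, _ in TRIPHONES]
labels = agglomerative([mean for _, _, mean in TRIPHONES], k=3)  # learning target
tree = build_tree(rows, labels, attrs=[0, 1])
unseen_cluster = classify(tree, ("nasal", "nasal"))  # context absent from training
```

In this toy setup the cluster labels come entirely from the acoustic values, while the tree sees only the context attributes. That separation is what lets a triphone whose context never occurred in training, such as the hypothetical `("nasal", "nasal")` above, still be assigned to an existing cluster.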

Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: Speech Communication
Volume: 46
Issue number: 1
DOIs
Publication status: Published - 2005 May

Keywords

  • Acoustic modeling
  • Decision-tree
  • Large vocabulary continuous speech recognition

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
