Linear spectral transformation for robust speech recognition using maximum mutual information

Donghyun Kim, Dongsuk Yook

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

This paper presents a transformation-based rapid adaptation technique for robust speech recognition using a linear spectral transformation (LST) and a maximum mutual information (MMI) criterion. Previously, a maximum likelihood linear spectral transformation (ML-LST) algorithm was proposed for fast adaptation in unknown environments. Since the MMI estimation method does not require evenly distributed training data and increases the a posteriori probability of the word sequences of the training data, we combine the linear spectral transformation method and the MMI estimation technique in order to achieve extremely rapid adaptation using only one word of adaptation data. The proposed algorithm, called MMI-LST, was implemented using the extended Baum-Welch algorithm and phonetic lattices, and evaluated on the TIMIT and FFMTIMIT corpora. It provides a relative reduction in the speech recognition error rate of 11.1% using only 0.25 s of adaptation data.
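
As a rough illustration of two quantities the abstract mentions, the sketch below computes a generic MMI objective (the log posterior of the reference transcription against a set of competing hypotheses, as might be read off a lattice) and a relative error-rate reduction. It is not the authors' MMI-LST implementation: the function names, the flat list of hypothesis scores, and all input values are assumptions made for illustration only.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mmi_objective(ref_log_score, competitor_log_scores):
    """Generic MMI criterion for one utterance (hypothetical helper).

    Each score is a joint log-score: log p(O | W) + log P(W).
    The objective is the log posterior of the reference word sequence,
    i.e. the reference score minus the log-sum over all hypotheses
    (the reference itself is included in the denominator).
    """
    denominator = log_sum_exp([ref_log_score] + list(competitor_log_scores))
    return ref_log_score - denominator


def relative_reduction(baseline_error, adapted_error):
    """Relative error-rate reduction: (baseline - adapted) / baseline."""
    return (baseline_error - adapted_error) / baseline_error


if __name__ == "__main__":
    # Made-up scores, not taken from the paper.
    print(mmi_objective(-120.0, [-123.5, -125.0]))
    print(relative_reduction(40.0, 36.0))
```

Raising the reference score relative to its competitors moves the objective toward zero, which is the sense in which MMI estimation increases the a posteriori probability of the reference word sequence.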

Original language: English
Pages (from-to): 496-499
Number of pages: 4
Journal: IEEE Signal Processing Letters
Volume: 14
Issue number: 7
DOI: 10.1109/LSP.2006.891337
Publication status: Published - 2007 Jul 1

Keywords

  • Linear spectral transformation
  • Maximum mutual information (MMI)
  • Rapid adaptation
  • Robust speech recognition

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing

Cite this

Linear spectral transformation for robust speech recognition using maximum mutual information. / Kim, Donghyun; Yook, Dongsuk.

In: IEEE Signal Processing Letters, Vol. 14, No. 7, 01.07.2007, p. 496-499.

@article{cca37a5157e646da8c79db987c7ac0b0,
title = "Linear spectral transformation for robust speech recognition using maximum mutual information",
abstract = "This paper presents a transformation-based rapid adaptation technique for robust speech recognition using a linear spectral transformation (LST) and a maximum mutual information (MMI) criterion. Previously, a maximum likelihood linear spectral transformation (ML-LST) algorithm was proposed for fast adaptation in unknown environments. Since the MMI estimation method does not require evenly distributed training data and increases the a posteriori probability of the word sequences of the training data, we combine the linear spectral transformation method and the MMI estimation technique in order to achieve extremely rapid adaptation using only one word of adaptation data. The proposed algorithm, called MMI-LST, was implemented using the extended Baum-Welch algorithm and phonetic lattices, and evaluated on the TIMIT and FFMTIMIT corpora. It provides a relative reduction in the speech recognition error rate of 11.1{\%} using only 0.25 s of adaptation data.",
keywords = "Linear spectral transformation, Maximum mutual information (MMI), Rapid adaptation, Robust speech recognition",
author = "Donghyun Kim and Dongsuk Yook",
year = "2007",
month = "7",
day = "1",
doi = "10.1109/LSP.2006.891337",
language = "English",
volume = "14",
pages = "496--499",
journal = "IEEE Signal Processing Letters",
issn = "1070-9908",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "7",

}

TY - JOUR

T1 - Linear spectral transformation for robust speech recognition using maximum mutual information

AU - Kim, Donghyun

AU - Yook, Dongsuk

PY - 2007/7/1

Y1 - 2007/7/1

N2 - This paper presents a transformation-based rapid adaptation technique for robust speech recognition using a linear spectral transformation (LST) and a maximum mutual information (MMI) criterion. Previously, a maximum likelihood linear spectral transformation (ML-LST) algorithm was proposed for fast adaptation in unknown environments. Since the MMI estimation method does not require evenly distributed training data and increases the a posteriori probability of the word sequences of the training data, we combine the linear spectral transformation method and the MMI estimation technique in order to achieve extremely rapid adaptation using only one word of adaptation data. The proposed algorithm, called MMI-LST, was implemented using the extended Baum-Welch algorithm and phonetic lattices, and evaluated on the TIMIT and FFMTIMIT corpora. It provides a relative reduction in the speech recognition error rate of 11.1% using only 0.25 s of adaptation data.

AB - This paper presents a transformation-based rapid adaptation technique for robust speech recognition using a linear spectral transformation (LST) and a maximum mutual information (MMI) criterion. Previously, a maximum likelihood linear spectral transformation (ML-LST) algorithm was proposed for fast adaptation in unknown environments. Since the MMI estimation method does not require evenly distributed training data and increases the a posteriori probability of the word sequences of the training data, we combine the linear spectral transformation method and the MMI estimation technique in order to achieve extremely rapid adaptation using only one word of adaptation data. The proposed algorithm, called MMI-LST, was implemented using the extended Baum-Welch algorithm and phonetic lattices, and evaluated on the TIMIT and FFMTIMIT corpora. It provides a relative reduction in the speech recognition error rate of 11.1% using only 0.25 s of adaptation data.

KW - Linear spectral transformation

KW - Maximum mutual information (MMI)

KW - Rapid adaptation

KW - Robust speech recognition

UR - http://www.scopus.com/inward/record.url?scp=34347395939&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34347395939&partnerID=8YFLogxK

U2 - 10.1109/LSP.2006.891337

DO - 10.1109/LSP.2006.891337

M3 - Article

VL - 14

SP - 496

EP - 499

JO - IEEE Signal Processing Letters

JF - IEEE Signal Processing Letters

SN - 1070-9908

IS - 7

ER -