Environment-independent mask estimation for missing-feature reconstruction

Wooil Kim, Richard M. Stern, Hanseok Ko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

In this paper, we propose an effective mask-estimation method for missing-feature reconstruction in order to achieve robust speech recognition in unknown noise environments. In previous work, it was found that training a model for mask estimation on speech corrupted by white noise did not provide environment-independent recognition accuracy. In this paper we describe a training method based on bands of colored noise that is more effective in reflecting spectral variations across neighboring frames and subbands. We also achieved further improvement in recognition accuracy by reconsidering frames that appeared to be unvoiced in the initial pitch analysis. Performance is evaluated using the Aurora 2.0 database in the presence of various types of noise maskers. Experimental results indicate that the proposed methods are effective in estimating masks for missing-feature reconstruction while remaining more independent of the noise conditions.
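For readers unfamiliar with the terminology, a "mask" in missing-feature methods is commonly a binary label per time-frequency cell that marks the cell as reliable (speech-dominated) or unreliable (noise-dominated). The sketch below is not the estimation method of the paper, which has to infer the mask without access to clean speech. It only illustrates the kind of reference ("oracle") mask that such an estimator tries to approximate. The function names, the toy subband powers, and the 0 dB threshold are all assumptions made for illustration.

```python
import math

def local_snr_db(speech_power, noise_power, floor=1e-12):
    """Local SNR in dB for one time-frequency cell (floor avoids log of zero)."""
    return 10.0 * math.log10(max(speech_power, floor) / max(noise_power, floor))

def oracle_mask(speech, noise, threshold_db=0.0):
    """Label each cell 1 (reliable) if its local SNR exceeds threshold_db, else 0.

    speech, noise: lists of frames, each frame a list of subband powers.
    The 0 dB default is an illustrative assumption, not a value from the paper.
    """
    return [
        [1 if local_snr_db(s, n) > threshold_db else 0
         for s, n in zip(speech_frame, noise_frame)]
        for speech_frame, noise_frame in zip(speech, noise)
    ]

# Toy example: 2 frames x 3 subbands, flat noise power of 1.0 in every cell.
speech = [[4.0, 0.5, 2.0], [0.1, 3.0, 1.0]]
noise = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
mask = oracle_mask(speech, noise)
print(mask)
```

Cells labeled 0 are the ones a missing-feature reconstruction stage would treat as unreliable and re-estimate from the reliable cells.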

Original language: English
Title of host publication: 9th European Conference on Speech Communication and Technology
Pages: 2637-2640
Number of pages: 4
Publication status: Published - 2005 Dec 1
Event: 9th European Conference on Speech Communication and Technology - Lisbon, Portugal
Duration: 2005 Sep 4 to 2005 Sep 8

Other

Other: 9th European Conference on Speech Communication and Technology
Country: Portugal
City: Lisbon
Period: 05/9/4 to 05/9/8

Fingerprint

Masks
White noise
Speech recognition
Acoustic noise

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Kim, W., Stern, R. M., & Ko, H. (2005). Environment-independent mask estimation for missing-feature reconstruction. In 9th European Conference on Speech Communication and Technology (pp. 2637-2640)

@inproceedings{5ddb41d8bcff40188f40aa2936857737,
title = "Environment-independent mask estimation for missing-feature reconstruction",
abstract = "In this paper, we propose an effective mask-estimation method for missing-feature reconstruction in order to achieve robust speech recognition in unknown noise environments. In previous work, it was found that training a model for mask estimation on speech corrupted by white noise did not provide environment-independent recognition accuracy. In this paper we describe a training method based on bands of colored noise that is more effective in reflecting spectral variations across neighboring frames and subbands. We also achieved further improvement in recognition accuracy by reconsidering frames that appeared to be unvoiced in the initial pitch analysis. Performance is evaluated using the Aurora 2.0 database in the presence of various types of noise maskers. Experimental results indicate that the proposed methods are effective in estimating masks for missing-feature reconstruction while remaining more independent of the noise conditions.",
author = "Wooil Kim and Stern, {Richard M.} and Hanseok Ko",
year = "2005",
month = "12",
day = "1",
language = "English",
pages = "2637--2640",
booktitle = "9th European Conference on Speech Communication and Technology",

}

TY - GEN

T1 - Environment-independent mask estimation for missing-feature reconstruction

AU - Kim, Wooil

AU - Stern, Richard M.

AU - Ko, Hanseok

PY - 2005/12/1

Y1 - 2005/12/1

AB - In this paper, we propose an effective mask-estimation method for missing-feature reconstruction in order to achieve robust speech recognition in unknown noise environments. In previous work, it was found that training a model for mask estimation on speech corrupted by white noise did not provide environment-independent recognition accuracy. In this paper we describe a training method based on bands of colored noise that is more effective in reflecting spectral variations across neighboring frames and subbands. We also achieved further improvement in recognition accuracy by reconsidering frames that appeared to be unvoiced in the initial pitch analysis. Performance is evaluated using the Aurora 2.0 database in the presence of various types of noise maskers. Experimental results indicate that the proposed methods are effective in estimating masks for missing-feature reconstruction while remaining more independent of the noise conditions.

UR - http://www.scopus.com/inward/record.url?scp=33745200501&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33745200501&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:33745200501

SP - 2637

EP - 2640

BT - 9th European Conference on Speech Communication and Technology

ER -