Space-time voice activity detection

Hyeopwoo Lee, Dongsuk Yook

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

When speech-based interfaces are used for small handheld devices such as cellular phones and personal digital assistants in mobile environments with unknown noises and surrounding talkers, all signals except the legitimate user's voice must be rejected as noise signals by the system. This paper proposes a new algorithm that detects the user's voice in spatial and temporal domains using directional and spectral information. It rejects undesirable signals that originate from noise sources or surrounding talkers. Experimental results indicate the proposed algorithm reduces the voice activity detection error rate by 34.3% relative to the conventional methods.
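The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of combining a directional cue with a temporal cue. It is not the authors' method. It assumes a hypothetical two-microphone setup, uses the cross-correlation lag between channels as the directional cue, and uses plain frame energy as a crude stand-in for the spectral information the paper mentions. All function names, thresholds, and parameters are illustrative assumptions.

```python
import random


def frame_energy(frame):
    """Mean squared amplitude of one frame (crude activity cue)."""
    return sum(x * x for x in frame) / len(frame)


def best_lag(left, right, max_lag):
    """Return the lag d in [-max_lag, max_lag] maximizing sum(left[i] * right[i + d])."""
    n = len(left)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        # Only use indices where both left[i] and right[i + lag] exist.
        for i in range(max(0, -lag), min(n, n - lag)):
            score += left[i] * right[i + lag]
        if score > best_score:
            best, best_score = lag, score
    return best


def space_time_vad(left, right, frame_len=256, energy_thresh=0.01,
                   target_lag=0, lag_tol=1, max_lag=8):
    """Flag a frame as user speech only if it is both active and on-target.

    Assumption: the legitimate user sits at a known direction, so the
    inter-channel lag should be close to target_lag (in samples).
    """
    decisions = []
    usable = min(len(left), len(right))
    for start in range(0, usable - frame_len + 1, frame_len):
        l = left[start:start + frame_len]
        r = right[start:start + frame_len]
        active = frame_energy(l) > energy_thresh
        on_target = abs(best_lag(l, r, max_lag) - target_lag) <= lag_tol
        decisions.append(active and on_target)
    return decisions


def demo():
    """Three synthetic frames: silence, an on-target source, an off-target source."""
    rng = random.Random(0)
    n, delay = 256, 5
    silence = [0.0] * n
    user = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    other = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # The off-target source reaches the right microphone 5 samples late.
    other_delayed = [0.0] * delay + other[:n - delay]
    left = silence + user + other
    right = silence + user + other_delayed
    return space_time_vad(left, right)


if __name__ == "__main__":
    print(demo())
```

In this sketch an energy-only detector would accept the third frame, since the interfering source is just as loud as the user. The lag check is what rejects it, which mirrors the abstract's point that surrounding talkers must be treated as noise.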

Original language: English
Pages (from-to): 1471-1476
Number of pages: 6
Journal: IEEE Transactions on Consumer Electronics
Volume: 55
Issue number: 3
DOI: 10.1109/TCE.2009.5278015
Publication status: Published - 2009 Oct 29

Fingerprint

Personal digital assistants
Error detection

Keywords

  • Microphone array
  • Sound source localization
  • Voice activity detection

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Media Technology

Cite this

Space-time voice activity detection. / Lee, Hyeopwoo; Yook, Dongsuk.

In: IEEE Transactions on Consumer Electronics, Vol. 55, No. 3, 29.10.2009, p. 1471-1476.

@article{e1e1e891f5ac42c5ac88b5048bc15660,
title = "Space-time voice activity detection",
abstract = "When speech-based interfaces are used for small handheld devices such as cellular phones and personal digital assistants in mobile environments with unknown noises and surrounding talkers, all signals except the legitimate user's voice must be rejected as noise signals by the system. This paper proposes a new algorithm that detects the user's voice in spatial and temporal domains using directional and spectral information. It rejects undesirable signals that originate from noise sources or surrounding talkers. Experimental results indicate the proposed algorithm reduces the voice activity detection error rate by 34.3{\%} relative to the conventional methods.",
keywords = "Microphone array, Sound source localization, Voice activity detection",
author = "Hyeopwoo Lee and Dongsuk Yook",
year = "2009",
month = "10",
day = "29",
doi = "10.1109/TCE.2009.5278015",
language = "English",
volume = "55",
pages = "1471--1476",
journal = "IEEE Transactions on Consumer Electronics",
issn = "0098-3063",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",

}
