Maximum likelihood Linear Dimension Reduction of heteroscedastic feature for robust Speaker Recognition

Suwon Shon, Seongkyu Mun, David K. Han, Hanseok Ko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

This paper analyzes heteroscedasticity in i-vectors for robust speaker recognition systems in forensics and surveillance. Linear Discriminant Analysis (LDA), a widely used linear dimension reduction technique, assumes that classes are homoscedastic, that is, that all classes share the same covariance. In this paper, it is assumed that general speech utterances contain both homoscedastic and heteroscedastic elements. We show the validity of this assumption through several analyses and also demonstrate that dimension reduction using principal components is feasible. To handle the presence of both heteroscedastic and homoscedastic elements effectively, we propose a fusion approach that applies both LDA and Heteroscedastic LDA (HLDA). Experiments are conducted to show the effectiveness of the approach and to compare it with other methods, using the telephone database of the National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) 2010 extended.
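
As a rough illustration of the homoscedasticity assumption mentioned in the abstract, the sketch below computes a classical LDA projection from a single pooled within-class scatter matrix and then compares per-class covariances on synthetic data. It is not the authors' implementation: the function names `lda_projection` and `class_covariances`, the 3-dimensional toy data, and the two synthetic "speakers" are all assumptions made for illustration. HLDA itself, which estimates the transform by maximum likelihood with class-specific covariances, is not implemented here.

```python
import numpy as np

def lda_projection(X, y, n_components):
    """Classical LDA directions from pooled within-class scatter (hypothetical helper)."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # pooled within-class scatter: one covariance shared by all classes
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Eigenvectors of inv(Sw) @ Sb, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

def class_covariances(X, y):
    """Per-class sample covariances, used to inspect heteroscedasticity (hypothetical helper)."""
    return {int(c): np.cov(X[y == c], rowvar=False) for c in np.unique(y)}

# Synthetic data (an assumption, not i-vectors): two classes with different covariances.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(200, 3))
X1 = rng.normal(size=(200, 3)) * np.array([3.0, 0.5, 1.0]) + np.array([4.0, 0.0, 0.0])
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

W = lda_projection(X, y, n_components=1)
covs = class_covariances(X, y)
# A large gap between class covariances signals that LDA's shared-covariance assumption is violated.
spread = np.linalg.norm(covs[0] - covs[1])
print("projection shape:", W.shape)
print("covariance gap (Frobenius norm):", spread)
```

When the per-class covariances differ substantially, as in this toy data, the pooled scatter matrix no longer describes any single class well, which is the situation a heteroscedastic method is meant to address.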

Original language: English
Title of host publication: AVSS 2015 - 12th IEEE International Conference on Advanced Video and Signal Based Surveillance
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Print): 9781467376327
DOI: https://doi.org/10.1109/AVSS.2015.7301784
Publication status: Published - 2015 Oct 19
Event: 12th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2015 - Karlsruhe, Germany
Duration: 2015 Aug 25 to 2015 Aug 28

Keywords

  • Algorithm design and analysis
  • Analytical models
  • Computational modeling
  • Speech
  • Speech processing
  • Switches
  • Transforms

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Electrical and Electronic Engineering
  • Communication

Cite this

Shon, S., Mun, S., Han, D. K., & Ko, H. (2015). Maximum likelihood Linear Dimension Reduction of heteroscedastic feature for robust Speaker Recognition. In AVSS 2015 - 12th IEEE International Conference on Advanced Video and Signal Based Surveillance [7301784]. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/AVSS.2015.7301784
