Autoencoder based domain adaptation for speaker recognition under insufficient channel information

Suwon Shon, Seongkyu Mun, Wooil Kim, Hanseok Ko

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

In real-life conditions, a mismatch between the development and test domains degrades speaker recognition performance. To address this issue, many researchers have explored domain adaptation approaches that use a matched in-domain dataset. However, adaptation is not effective if the dataset is insufficient to estimate the channel variability of the domain. In this paper, we explore the problem of performance degradation under such insufficient channel information. To exploit a limited in-domain dataset effectively, we propose an unsupervised domain adaptation approach, Autoencoder based Domain Adaptation (AEDA). The proposed approach combines an autoencoder with a denoising autoencoder to adapt a resource-rich development dataset to the test domain. The proposed technique is evaluated on the Domain Adaptation Challenge 13 experimental protocol, which is widely used in speaker recognition for domain-mismatched conditions. The results show significant improvements over the baselines and over results from prior studies.

Original language: English
Pages (from-to): 1014-1018
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2017-August
DOI: 10.21437/Interspeech.2017-49
Publication status: Published - 2017 Jan 1

Keywords

  • Autoencoder
  • Denoising autoencoder
  • Domain mismatch
  • Speaker recognition
  • Unsupervised domain adaptation

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

Autoencoder based domain adaptation for speaker recognition under insufficient channel information. / Shon, Suwon; Mun, Seongkyu; Kim, Wooil; Ko, Hanseok.

In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 2017-August, 01.01.2017, p. 1014-1018.

@article{a51354c03c0545838f056ed5b62d30e9,
title = "Autoencoder based domain adaptation for speaker recognition under insufficient channel information",
abstract = "In real-life conditions, a mismatch between the development and test domains degrades speaker recognition performance. To address this issue, many researchers have explored domain adaptation approaches that use a matched in-domain dataset. However, adaptation is not effective if the dataset is insufficient to estimate the channel variability of the domain. In this paper, we explore the problem of performance degradation under such insufficient channel information. To exploit a limited in-domain dataset effectively, we propose an unsupervised domain adaptation approach, Autoencoder based Domain Adaptation (AEDA). The proposed approach combines an autoencoder with a denoising autoencoder to adapt a resource-rich development dataset to the test domain. The proposed technique is evaluated on the Domain Adaptation Challenge 13 experimental protocol, which is widely used in speaker recognition for domain-mismatched conditions. The results show significant improvements over the baselines and over results from prior studies.",
keywords = "Autoencoder, Denoising autoencoder, Domain mismatch, Speaker recognition, Unsupervised domain adaptation",
author = "Suwon Shon and Seongkyu Mun and Wooil Kim and Hanseok Ko",
year = "2017",
month = "1",
day = "1",
doi = "10.21437/Interspeech.2017-49",
language = "English",
volume = "2017-August",
pages = "1014--1018",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",
}

TY - JOUR
T1 - Autoencoder based domain adaptation for speaker recognition under insufficient channel information
AU - Shon, Suwon
AU - Mun, Seongkyu
AU - Kim, Wooil
AU - Ko, Hanseok
PY - 2017/1/1
Y1 - 2017/1/1
AB - In real-life conditions, a mismatch between the development and test domains degrades speaker recognition performance. To address this issue, many researchers have explored domain adaptation approaches that use a matched in-domain dataset. However, adaptation is not effective if the dataset is insufficient to estimate the channel variability of the domain. In this paper, we explore the problem of performance degradation under such insufficient channel information. To exploit a limited in-domain dataset effectively, we propose an unsupervised domain adaptation approach, Autoencoder based Domain Adaptation (AEDA). The proposed approach combines an autoencoder with a denoising autoencoder to adapt a resource-rich development dataset to the test domain. The proposed technique is evaluated on the Domain Adaptation Challenge 13 experimental protocol, which is widely used in speaker recognition for domain-mismatched conditions. The results show significant improvements over the baselines and over results from prior studies.
KW - Autoencoder
KW - Denoising autoencoder
KW - Domain mismatch
KW - Speaker recognition
KW - Unsupervised domain adaptation
UR - http://www.scopus.com/inward/record.url?scp=85039154266&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85039154266&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2017-49
DO - 10.21437/Interspeech.2017-49
M3 - Article
VL - 2017-August
SP - 1014
EP - 1018
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SN - 2308-457X
ER -