Deep neural network bottleneck features for acoustic event recognition

Seongkyu Mun, Suwon Shon, Wooil Kim, Hanseok Ko

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Bottleneck features have been shown to improve the accuracy of speaker recognition, language identification, and automatic speech recognition. However, few works have focused on bottleneck features for acoustic event recognition. This paper proposes a novel acoustic event recognition framework using bottleneck features derived from a Deep Neural Network (DNN). In addition to conventional features (MFCC, Mel-spectrum, etc.), this paper employs rhythm, timbre, and spectrum-statistics features to effectively extract acoustic characteristics from audio signals. The effectiveness of the proposed method is demonstrated in experiments on a database of real-life recordings, and its robust performance is verified by comparison with conventional methods.
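To make the idea concrete, the following minimal sketch shows one common way such bottleneck features are obtained: a feedforward DNN is trained to classify acoustic events at the frame level, and the activations of a deliberately narrow hidden layer are then read out as compact features for a separate back-end classifier. The PyTorch framing, layer sizes, and 39-dimensional input (standing in for MFCC-plus-delta frames) are illustrative assumptions; the paper's actual topology and training recipe are not given in this abstract.

    import torch
    import torch.nn as nn

    class BottleneckDNN(nn.Module):
        """Feedforward DNN with a narrow bottleneck hidden layer.

        After the network is trained on a frame-level acoustic event
        classification task, the bottleneck activations are read out
        and used as compact features for a separate back-end
        classifier. All dimensions below are illustrative, not the
        paper's configuration.
        """

        def __init__(self, input_dim=39, hidden_dim=512,
                     bottleneck_dim=40, num_events=10):
            super().__init__()
            # Front half: wide hidden layers up to the bottleneck.
            self.encoder = nn.Sequential(
                nn.Linear(input_dim, hidden_dim), nn.Sigmoid(),
                nn.Linear(hidden_dim, hidden_dim), nn.Sigmoid(),
            )
            # Deliberately narrow layer whose activations become the features.
            self.bottleneck = nn.Linear(hidden_dim, bottleneck_dim)
            # Back half: only used while training the event classifier.
            self.classifier = nn.Sequential(
                nn.Sigmoid(),
                nn.Linear(bottleneck_dim, hidden_dim), nn.Sigmoid(),
                nn.Linear(hidden_dim, num_events),
            )

        def forward(self, x):
            # Full pass, used during supervised training (e.g., cross-entropy loss).
            return self.classifier(self.bottleneck(self.encoder(x)))

        def extract_bottleneck(self, x):
            # Feature-extraction pass: stop at the bottleneck layer.
            with torch.no_grad():
                return self.bottleneck(self.encoder(x))

    # Toy usage: 100 frames of 39-dim inputs (standing in for MFCCs
    # with deltas) are mapped to 40-dim bottleneck feature vectors.
    model = BottleneckDNN()
    frames = torch.randn(100, 39)
    bn_feats = model.extract_bottleneck(frames)
    print(bn_feats.shape)  # torch.Size([100, 40])

In a complete system, such bottleneck vectors would typically be normalized and combined with the conventional and rhythm/timbre/spectrum-statistics features before classification; the fusion scheme is likewise left unspecified by the abstract.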

Original language: English
Pages (from-to): 2954-2957
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 08-12-September-2016
DOI: 10.21437/Interspeech.2016-1112
Publication status: Published - 2016

Keywords

  • Acoustic event recognition
  • Bottleneck feature
  • Deep belief network
  • Deep neural network
  • Feature extraction

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

Deep neural network bottleneck features for acoustic event recognition. / Mun, Seongkyu; Shon, Suwon; Kim, Wooil; Ko, Hanseok.

In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 08-12-September-2016, 2016, p. 2954-2957.

Research output: Contribution to journal › Article

@article{441e9d50b2414fd9b768a3d0a482e94a,
title = "Deep neural network bottleneck features for acoustic event recognition",
abstract = "Bottleneck features have been shown to improve the accuracy of speaker recognition, language identification, and automatic speech recognition. However, few works have focused on bottleneck features for acoustic event recognition. This paper proposes a novel acoustic event recognition framework using bottleneck features derived from a Deep Neural Network (DNN). In addition to conventional features (MFCC, Mel-spectrum, etc.), this paper employs rhythm, timbre, and spectrum-statistics features to effectively extract acoustic characteristics from audio signals. The effectiveness of the proposed method is demonstrated in experiments on a database of real-life recordings, and its robust performance is verified by comparison with conventional methods.",
keywords = "Acoustic event recognition, Bottleneck feature, Deep belief network, Deep neural network, Feature extraction",
author = "Seongkyu Mun and Suwon Shon and Wooil Kim and Hanseok Ko",
year = "2016",
doi = "10.21437/Interspeech.2016-1112",
language = "English",
volume = "08-12-September-2016",
pages = "2954--2957",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",
}
