A time delay convolutional neural network for acoustic scene classification

Younglo Lee, Sangwook Park, Hanseok Ko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, the demand for more natural human–machine interaction through speech has been increasing. To meet this demand, it is becoming more important for a machine to understand the user's context. This paper proposes a novel neural network framework that can be applied to commercial smart devices with microphones to recognize acoustic contextual information. Our approach exploits the fact that an acoustic signal has stronger local connectivity along the time axis than along the frequency axis. Experimental results show that the proposed method outperforms two conventional approaches, Gaussian Mixture Models (GMMs) and the Multi-Layer Perceptron (MLP), by 8.6% and 7.8% respectively in overall accuracy.
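The paper's implementation is not reproduced in this record. As a minimal pure-Python sketch of the idea the abstract describes, a "time-delay" convolution can be written as a filter that spans the entire frequency axis but only a short temporal context, sliding along the time axis only (all names, shapes, and values below are illustrative assumptions, not the authors' code):

```python
def time_delay_conv(spectrogram, kernels):
    """Convolve along the time axis only.

    spectrogram: list of T frames, each a list of F spectral values.
    kernels: list of K filters, each a (context x F) weight grid, so every
             filter covers ALL frequency bins but only `context` frames.
    Returns (T - context + 1) output frames of K activations each.
    """
    context = len(kernels[0])
    n_frames = len(spectrogram)
    n_bins = len(spectrogram[0])
    output = []
    for t in range(n_frames - context + 1):
        frame = []
        for weights in kernels:
            # Weighted sum over the temporal context and all frequency bins.
            acc = sum(weights[c][f] * spectrogram[t + c][f]
                      for c in range(context)
                      for f in range(n_bins))
            frame.append(acc)
        output.append(frame)
    return output


# Toy example: 4 frames x 2 bins, one all-ones filter with a 2-frame context.
spec = [[1, 0], [0, 1], [1, 1], [2, 0]]
kernels = [[[1, 1], [1, 1]]]
print(time_delay_conv(spec, kernels))  # → [[2], [3], [4]]
```

Because the kernel never slides along frequency, the layer only assumes locality in time, which is the asymmetry the abstract motivates.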

Original language: English
Title of host publication: 2018 IEEE International Conference on Consumer Electronics, ICCE 2018
Editors: Saraju P. Mohanty, Hai Li, Peter Corcoran, Jong-Hyouk Lee, Anirban Sengupta
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-3
Number of pages: 3
Volume: 2018-January
ISBN (Electronic): 9781538630259
DOI: 10.1109/ICCE.2018.8326082
Publication status: Published - 2018 Mar 26
Event: 2018 IEEE International Conference on Consumer Electronics, ICCE 2018 - Las Vegas, United States
Duration: 2018 Jan 12 - 2018 Jan 14


Fingerprint

  • Time delay
  • Acoustics
  • Neural networks
  • Multilayer neural networks
  • Microphones

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Electrical and Electronic Engineering
  • Media Technology

Cite this

Lee, Y., Park, S., & Ko, H. (2018). A time delay convolutional neural network for acoustic scene classification. In S. P. Mohanty, H. Li, P. Corcoran, J-H. Lee, & A. Sengupta (Eds.), 2018 IEEE International Conference on Consumer Electronics, ICCE 2018 (Vol. 2018-January, pp. 1-3). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCE.2018.8326082
