Temporal attention based animal sound classification

Jungmin Kim, Younglo Lee, Donghyeon Kim, Hanseok Ko

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, to improve the classification accuracy of bird and amphibian acoustic sounds, we utilize a GLU (Gated Linear Unit) and self-attention, which encourage the network to extract important features from the data and to discriminate the relevant frames among all input sequences for further performance improvement. To use the acoustic data, we convert the 1-D acoustic signal into a log-Mel spectrogram. Undesirable components such as background noise in the log-Mel spectrogram are then reduced by the GLU, and the proposed temporal self-attention is applied to improve classification accuracy. The data consist of 6 species of birds and 8 species of amphibians, including endangered species, recorded in the natural environment. Our proposed method achieves an accuracy of 91 % on the bird data and 93 % on the amphibian data. Overall, this is an improvement of about 6 % ~ 7 % in accuracy compared to existing algorithms.
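A minimal sketch of the pipeline described in the abstract, assuming a PyTorch/torchaudio implementation: log-Mel front end, GLU-gated convolutions, temporal attention pooling, and a classifier. The layer sizes, kernel sizes, and exact attention form are illustrative assumptions, not the authors' published configuration.

```python
import torch
import torch.nn as nn
import torchaudio


class GLUConvBlock(nn.Module):
    """Convolution whose output is split and gated (Gated Linear Unit)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Produce 2 * out_ch channels; nn.GLU halves them via sigmoid gating.
        self.conv = nn.Conv2d(in_ch, 2 * out_ch, kernel_size=3, padding=1)
        self.glu = nn.GLU(dim=1)          # gate over the channel dimension
        self.pool = nn.MaxPool2d((2, 1))  # pool frequency, keep time frames

    def forward(self, x):                 # x: (batch, ch, mel, time)
        return self.pool(self.glu(self.conv(x)))


class TemporalAttentionClassifier(nn.Module):
    """Log-Mel spectrogram -> GLU conv blocks -> temporal attention -> classes."""
    def __init__(self, n_mels=64, n_classes=6, sample_rate=22050):
        super().__init__()
        # 1-D waveform -> log-Mel spectrogram front end.
        self.melspec = torchaudio.transforms.MelSpectrogram(
            sample_rate=sample_rate, n_fft=1024, hop_length=512, n_mels=n_mels)
        self.to_db = torchaudio.transforms.AmplitudeToDB()
        self.blocks = nn.Sequential(
            GLUConvBlock(1, 32), GLUConvBlock(32, 64), GLUConvBlock(64, 64))
        feat_dim = 64 * (n_mels // 8)     # channels * remaining mel bins
        # Temporal attention: score each frame, softmax over time, weighted sum.
        self.att_score = nn.Linear(feat_dim, 1)
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, wav):               # wav: (batch, samples)
        x = self.to_db(self.melspec(wav)).unsqueeze(1)   # (B, 1, mel, T)
        x = self.blocks(x)                               # (B, C, mel', T)
        b, c, f, t = x.shape
        x = x.permute(0, 3, 1, 2).reshape(b, t, c * f)   # (B, T, feat)
        w = torch.softmax(self.att_score(x), dim=1)      # weights over frames
        pooled = (w * x).sum(dim=1)                      # attention pooling
        return self.classifier(pooled)


model = TemporalAttentionClassifier()
logits = model(torch.randn(2, 22050))     # two 1-second dummy clips
print(logits.shape)                       # torch.Size([2, 6])
```

The gating step attenuates frames dominated by background noise before the attention layer reweights the remaining frames over time; the published network may differ in depth, pooling, and attention parameterization.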

Original language: English
Pages (from-to): 406-413
Number of pages: 8
Journal: Journal of the Acoustical Society of Korea
Volume: 39
Issue number: 5
DOIs
Publication status: Published - 2020

Keywords

  • Audio event classification
  • Convolution Neural Network (CNN)
  • Gated Linear Unit (GLU)
  • Self-attention

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Instrumentation
  • Applied Mathematics
  • Signal Processing
  • Speech and Hearing
