Temporal attention based animal sound classification

Jungmin Kim, Younglo Lee, Donghyeon Kim, Hanseok Ko

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


In this paper, to improve the classification accuracy of bird and amphibian sounds, we utilize a Gated Linear Unit (GLU) and self-attention, which encourage the network to extract important features from the data and to discriminate the relevant frames among all input sequences for further performance improvement. The 1-D acoustic data are first converted to a log-Mel spectrogram. Undesirable components of the log-Mel spectrogram, such as background noise, are then suppressed by the GLU, and the proposed temporal self-attention is applied to improve classification accuracy. The data consist of six species of birds and eight species of amphibians, including endangered species, recorded in the natural environment. The proposed method achieves an accuracy of 91 % on the bird data and 93 % on the amphibian data, an improvement of about 6 % to 7 % over existing algorithms.
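The pipeline described in the abstract (log-Mel features, GLU gating, temporal self-attention pooling) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation; the weight matrices and the scoring vector are hypothetical placeholders for parameters that would be learned during training:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, W, V, b, c):
    """Gated Linear Unit: (xW + b) * sigmoid(xV + c).
    The sigmoid gate can learn to attenuate noisy components."""
    return (x @ W + b) * sigmoid(x @ V + c)

def temporal_self_attention(h, w):
    """Score each time frame, softmax over time, and pool.
    h: (T, D) frame-level features; w: (D,) scoring vector."""
    scores = h @ w
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()          # attention weights sum to 1
    return weights @ h                # (D,) clip-level embedding

# Toy example: T=100 log-Mel frames with D=64 Mel bins (hypothetical sizes)
T, D = 100, 64
logmel = rng.standard_normal((T, D))
W = rng.standard_normal((D, D)) * 0.1
V = rng.standard_normal((D, D)) * 0.1
b, c = np.zeros(D), np.zeros(D)

gated = glu(logmel, W, V, b, c)       # (T, D) gated (denoised) features
embedding = temporal_self_attention(gated, rng.standard_normal(D))
print(embedding.shape)
```

In a full model, the clip-level embedding would then be fed to a classifier over the 6 bird or 8 amphibian classes; here the attention simply replaces uniform temporal averaging with a weighting that emphasizes the most relevant frames.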

Original language: English
Pages (from-to): 406-413
Number of pages: 8
Journal: Journal of the Acoustical Society of Korea
Issue number: 5
Publication status: Published - 2020


Keywords

  • Audio event classification
  • Convolution Neural Network (CNN)
  • Gated Linear Unit (GLU)
  • Self-attention

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Instrumentation
  • Applied Mathematics
  • Signal Processing
  • Speech and Hearing


