S2I-BIRD: Sound-to-image generation of bird species using generative adversarial networks

Joo Yong Shim, Joongheon Kim, Jong Kook Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Generating images from sound is a challenging task. This paper proposes a novel deep learning model that generates bird images from the corresponding sound information. The proposed model includes a sound encoder that extracts suitable feature representations from audio recordings; it then generates bird images that correspond to the birds' calls using conditional generative adversarial networks (cGANs) with auxiliary classifiers. We demonstrate that our model produces better image generation results than other state-of-the-art methods in a similar context.

Original language: English
Title of host publication: Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2226-2232
Number of pages: 7
ISBN (Electronic): 9781728188089
Publication status: Published - 2020
Event: 25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Milan, Italy
Duration: 2021 Jan 10 to 2021 Jan 15

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (Print): 1051-4651

Conference

Conference: 25th International Conference on Pattern Recognition, ICPR 2020
Country/Territory: Italy
City: Virtual, Milan
Period: 21/1/10 to 21/1/15

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
