A perceptual evaluation of generative adversarial network real-time synthesized drum sounds in a virtual environment

Minwook Chang, Youngwon Ryan Kim, Jeonghyun Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Conventional methods for real-time sound effects in 3D graphical and virtual environments have relied on preparing all the needed samples ahead of time and simply replaying them as needed, or on parametrically modifying a basic set of samples using physically based techniques such as spring-damper simulation and modal analysis/synthesis. In this work, we propose applying the generative adversarial network (GAN) approach to this problem: only one generator is trained, and it produces the needed sounds quickly with perceptually indistinguishable quality. With the conventional methods, by contrast, separate and approximate models would be needed to handle different material properties and contact types and to maintain real-time performance. We demonstrate our claim by training a GAN (more specifically, WaveGAN) on the sounds of different drums and synthesizing the sounds on the fly for a virtual drum-playing environment. The perceptual test revealed that the subjects could neither discern the synthesized sounds from the ground truth nor perceive any noticeable delay after the corresponding physical event.

Original language: English
Title of host publication: Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 144-148
Number of pages: 5
ISBN (Electronic): 9781538692691
DOIs: https://doi.org/10.1109/AIVR.2018.00030
Publication status: Published - 2019 Jan 15
Event: 1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018 - Taichung, Taiwan, Province of China
Duration: 2018 Dec 10 to 2018 Dec 12

Publication series

Name: Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018

Conference

Conference: 1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018
Country: Taiwan, Province of China
City: Taichung
Period: 18/12/10 to 18/12/12

Fingerprint

Virtual reality
Acoustic waves
Modal analysis
Materials properties

Keywords

  • Generation of immersive environments and virtual worlds
  • Machine learning for multimodal interaction
  • Multimodal interaction and experiences in VR/AR

ASJC Scopus subject areas

  • Computer Science Applications
  • Artificial Intelligence
  • Media Technology

Cite this

Chang, M., Kim, Y. R., & Kim, J. (2019). A perceptual evaluation of generative adversarial network real-time synthesized drum sounds in a virtual environment. In Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018 (pp. 144-148). [8613649] (Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/AIVR.2018.00030

@inproceedings{6b3d73e9ef4b4bfba03ca1422688b133,
title = "A perceptual evaluation of generative adversarial network real-time synthesized drum sounds in a virtual environment",
keywords = "Generation of immersive environments and virtual worlds, Machine learning for multimodal interaction, Multimodal interaction and experiences in VR/AR",
author = "Minwook Chang and Kim, {Youngwon Ryan} and Jeonghyun Kim",
year = "2019",
month = "1",
day = "15",
doi = "10.1109/AIVR.2018.00030",
language = "English",
series = "Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "144--148",
booktitle = "Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018",

}

TY - GEN

T1 - A perceptual evaluation of generative adversarial network real-time synthesized drum sounds in a virtual environment

AU - Chang, Minwook

AU - Kim, Youngwon Ryan

AU - Kim, Jeonghyun

PY - 2019/1/15

Y1 - 2019/1/15

KW - Generation of immersive environments and virtual worlds

KW - Machine learning for multimodal interaction

KW - Multimodal interaction and experiences in VR/AR

UR - http://www.scopus.com/inward/record.url?scp=85062185396&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062185396&partnerID=8YFLogxK

U2 - 10.1109/AIVR.2018.00030

DO - 10.1109/AIVR.2018.00030

M3 - Conference contribution

AN - SCOPUS:85062185396

T3 - Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018

SP - 144

EP - 148

BT - Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -