A survey on parallel training algorithms for deep neural networks

Dongsuk Yook, Hyowon Lee, In Chul Yoo

Research output: Contribution to journal › Article › peer-review

Abstract

Training Deep Neural Networks (DNNs) typically requires a large amount of data, which makes parallel training necessary in practice. The Stochastic Gradient Descent (SGD) algorithm is one of the most widely used methods for training DNNs. However, because SGD is an inherently sequential process, parallelizing it requires some form of approximation. In this paper, we review various efforts to parallelize the SGD algorithm and analyze their computational overhead, communication overhead, and the effects of the approximations.
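
The sequential dependency mentioned in the abstract can be illustrated with a small sketch. The example below is not taken from the paper. It assumes a toy one-dimensional least-squares problem and one common scheme, synchronous data parallelism with gradient averaging. All function names (`grad`, `sequential_sgd`, `parallel_sgd_step`) are hypothetical, and the workers are simulated in a plain loop rather than run concurrently.

```python
# Illustrative sketch only: sequential SGD versus one synchronous
# data-parallel step on a toy problem. Names are hypothetical.

def grad(w, shard):
    """Mean gradient of 0.5 * (w*x - y)**2 over a list of (x, y) pairs."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def sequential_sgd(w, data, lr):
    """Plain SGD: each update starts from the weight left by the previous one."""
    for x, y in data:
        w -= lr * grad(w, [(x, y)])
    return w

def parallel_sgd_step(w, batch, num_workers, lr):
    """One synchronous step: every worker computes a gradient on its own
    shard from the same w, the gradients are averaged (the communication
    step), and a single update is applied.
    Assumes len(batch) is divisible by num_workers."""
    size = len(batch) // num_workers
    shards = [batch[i * size:(i + 1) * size] for i in range(num_workers)]
    grads = [grad(w, s) for s in shards]  # would run concurrently in practice
    return w - lr * sum(grads) / num_workers

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
w_seq = sequential_sgd(0.0, data, lr=0.05)
w_par = parallel_sgd_step(0.0, data, num_workers=2, lr=0.05)
```

With equal-sized shards, the averaged gradient equals the gradient over the whole batch, so `w_par` matches a single large-batch update. It does not match `w_seq`, which applied four dependent updates over the same data. That gap is one simple instance of the approximation a parallel scheme introduces, and the averaging step is where communication overhead would arise.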

Original language: English
Pages (from-to): 505-514
Number of pages: 10
Journal: Journal of the Acoustical Society of Korea
Volume: 39
Issue number: 6
Publication status: Published - 2020

Keywords

  • Deep learning
  • Deep Neural Network (DNN)
  • Parallel processing
  • Stochastic Gradient Descent (SGD)

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Instrumentation
  • Applied Mathematics
  • Signal Processing
  • Speech and Hearing
