Deep neural network predicts emotional responses of the human brain from functional magnetic resonance imaging

Hyun Chul Kim, Peter A. Bandettini, Jong-Hwan Lee

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

An artificial neural network with multiple hidden layers (known as a deep neural network, or DNN) was employed as a predictive model (DNNp) for the first time to predict emotional responses using whole-brain functional magnetic resonance imaging (fMRI) data from individual subjects. During fMRI data acquisition, 10 healthy participants listened to 80 International Affective Digital Sound stimuli and rated their own emotions generated by each sound stimulus in terms of the arousal, dominance, and valence dimensions. The whole-brain spatial patterns from a general linear model (i.e., beta-valued maps) for each sound stimulus and the emotional response ratings were used as the input and output for the DNNp, respectively. Based on a nested five-fold cross-validation scheme, the paired input and output data were divided into training (three-fold), validation (one-fold), and test (one-fold) data. The DNNp was trained and optimized using the training and validation data and was tested using the test data. Pearson's correlation coefficients between the rated and predicted emotional responses from our DNNp model with weight sparsity optimization (mean ± standard error 0.52 ± 0.02 for arousal, 0.51 ± 0.03 for dominance, and 0.51 ± 0.03 for valence, with an input denoising level of 0.3 and a mini-batch size of 1) were significantly greater than those of DNN models with conventional regularization schemes including elastic net regularization (0.15 ± 0.05, 0.15 ± 0.06, and 0.21 ± 0.04 for arousal, dominance, and valence, respectively), those of shallow models including logistic regression (0.11 ± 0.04, 0.10 ± 0.05, and 0.17 ± 0.04 for arousal, dominance, and valence, respectively; average of logistic regression and sparse logistic regression), and those of support vector machine-based predictive models (SVMps; 0.12 ± 0.06, 0.06 ± 0.06, and 0.10 ± 0.06 for arousal, dominance, and valence, respectively; average of linear and non-linear SVMps). This difference was confirmed to be significant with a Bonferroni-corrected p-value of less than 0.001 from a one-way analysis of variance (ANOVA) and a subsequent paired t-test. The weights of the trained DNNps were interpreted, and input patterns that maximized or minimized the output of the DNNps (i.e., the emotional responses) were estimated. Based on a binary classification of each emotion category (e.g., high arousal vs. low arousal), the error rates for the DNNp (31.2% ± 1.3% for arousal, 29.0% ± 1.7% for dominance, and 28.6% ± 3.0% for valence) were significantly lower than those for the linear SVMp (44.7% ± 2.0%, 50.7% ± 1.7%, and 47.4% ± 1.9% for arousal, dominance, and valence, respectively) and the non-linear SVMp (48.8% ± 2.3%, 52.2% ± 1.9%, and 46.4% ± 1.3% for arousal, dominance, and valence, respectively), as confirmed by a Bonferroni-corrected p < 0.001 from the one-way ANOVA. Our study demonstrates that the DNNp model is able to reveal neuronal circuitry associated with human emotional processing, including structures in the limbic and paralimbic areas, which include the amygdala, prefrontal areas, anterior cingulate cortex, insula, and caudate. Our DNNp model was also able to use activation patterns in these structures to predict and classify emotional responses to stimuli.
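The data split and the correlation metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `nested_fold_splits` and `pearson_r`, the random shuffling, and the way the validation fold rotates inside each test fold are all assumptions made for illustration. Only the fold proportions (three training folds, one validation fold, one test fold) and the use of Pearson's correlation come from the abstract.

```python
import math
import random


def nested_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train, validation, test) index lists.

    Assumed scheme: the outer loop holds out one fold for testing, and the
    inner loop rotates the validation fold over the four remaining folds,
    leaving three folds for training.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Stride slicing gives n_folds folds of (nearly) equal size.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for test_id in range(n_folds):
        for val_id in range(n_folds):
            if val_id == test_id:
                continue
            train = [i for k, fold in enumerate(folds)
                     if k not in (test_id, val_id) for i in fold]
            yield train, folds[val_id], folds[test_id]


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# 80 samples, matching the 80 sound stimuli rated by each participant.
splits = list(nested_fold_splits(80))
train, val, test = splits[0]
print(len(splits), len(train), len(val), len(test))  # prints: 20 48 16 16
```

In such a setup, a model would be fitted on `train`, tuned on `val`, and scored on `test` by calling `pearson_r` on the rated and predicted responses.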

Original language: English
Pages (from-to): 607-627
Number of pages: 21
Journal: NeuroImage
Volume: 186
DOI: 10.1016/j.neuroimage.2018.10.054
Publication status: Published - 2019 Feb 1

Fingerprint

Arousal
Magnetic Resonance Imaging
Brain
Logistic Models
Analysis of Variance
Emotions
Weights and Measures
Gyrus Cinguli
Amygdala
Linear Models
Healthy Volunteers

Keywords

  • Deep learning
  • Deep neural network
  • Emotion
  • fMRI
  • Machine learning
  • Prediction
  • Regression
  • Support vector machine

ASJC Scopus subject areas

  • Neurology
  • Cognitive Neuroscience

Cite this

Deep neural network predicts emotional responses of the human brain from functional magnetic resonance imaging. / Kim, Hyun Chul; Bandettini, Peter A.; Lee, Jong-Hwan.

In: NeuroImage, Vol. 186, 01.02.2019, p. 607-627.

@article{37f1e07706c74d149e7f0f70f4104cdf,
title = "Deep neural network predicts emotional responses of the human brain from functional magnetic resonance imaging",
keywords = "Deep learning, Deep neural network, Emotion, fMRI, Machine learning, Prediction, Regression, Support vector machine",
author = "Kim, {Hyun Chul} and Bandettini, {Peter A.} and Lee, Jong-Hwan",
year = "2019",
month = "2",
day = "1",
doi = "10.1016/j.neuroimage.2018.10.054",
language = "English",
volume = "186",
pages = "607--627",
journal = "NeuroImage",
issn = "1053-8119",
publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Deep neural network predicts emotional responses of the human brain from functional magnetic resonance imaging

AU - Kim, Hyun Chul

AU - Bandettini, Peter A.

AU - Lee, Jong-Hwan

PY - 2019/2/1

Y1 - 2019/2/1

KW - Deep learning

KW - Deep neural network

KW - Emotion

KW - fMRI

KW - Machine learning

KW - Prediction

KW - Regression

KW - Support vector machine

UR - http://www.scopus.com/inward/record.url?scp=85057456558&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057456558&partnerID=8YFLogxK

U2 - 10.1016/j.neuroimage.2018.10.054

DO - 10.1016/j.neuroimage.2018.10.054

M3 - Article

C2 - 30366076

AN - SCOPUS:85057456558

VL - 186

SP - 607

EP - 627

JO - NeuroImage

JF - NeuroImage

SN - 1053-8119

ER -