Unmasking Clever Hans predictors and assessing what machines really learn

Sebastian Lapuschkin, Stephan Wäldchen, Alexander Binder, Grégoire Montavon, Wojciech Samek, Klaus Muller

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly intelligent behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem-solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.

Original language: English
Article number: 1096
Journal: Nature Communications
Volume: 10
Issue number: 1
DOIs: https://doi.org/10.1038/s41467-019-08987-4
Publication status: Published - 2019 Dec 1

Fingerprint

machine learning
Learning systems
problem solving
predictions
Spectrum analysis
Computer vision
intelligence
Artificial Intelligence
games
evaluation

ASJC Scopus subject areas

  • Chemistry(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Physics and Astronomy(all)

Cite this

Lapuschkin, S., Wäldchen, S., Binder, A., Montavon, G., Samek, W., & Muller, K. (2019). Unmasking Clever Hans predictors and assessing what machines really learn. Nature Communications, 10(1), [1096]. https://doi.org/10.1038/s41467-019-08987-4

@article{9da1198a392248fbaea50716bbeb3177,
title = "Unmasking Clever Hans predictors and assessing what machines really learn",
abstract = "Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly intelligent behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.",
author = "Sebastian Lapuschkin and Stephan W{\"a}ldchen and Alexander Binder and Gr{\'e}goire Montavon and Wojciech Samek and Klaus Muller",
year = "2019",
month = "12",
day = "1",
doi = "10.1038/s41467-019-08987-4",
language = "English",
volume = "10",
journal = "Nature Communications",
issn = "2041-1723",
publisher = "Nature Publishing Group",
number = "1",

}

TY - JOUR

T1 - Unmasking Clever Hans predictors and assessing what machines really learn

AU - Lapuschkin, Sebastian

AU - Wäldchen, Stephan

AU - Binder, Alexander

AU - Montavon, Grégoire

AU - Samek, Wojciech

AU - Muller, Klaus

PY - 2019/12/1

Y1 - 2019/12/1

N2 - Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly intelligent behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.

UR - http://www.scopus.com/inward/record.url?scp=85062765505&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062765505&partnerID=8YFLogxK

U2 - 10.1038/s41467-019-08987-4

DO - 10.1038/s41467-019-08987-4

M3 - Article

C2 - 30858366

AN - SCOPUS:85062765505

VL - 10

JO - Nature Communications

JF - Nature Communications

SN - 2041-1723

IS - 1

M1 - 1096

ER -