Why Does a Hilbertian Metric Work Efficiently in Online Learning with Kernels?

Masahiro Yukawa, Klaus Müller

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

The autocorrelation matrix of the kernelized input vector is well approximated by the squared Gram matrix scaled down by the dictionary size, provided that the input covariance matrix in the feature space is well approximated by its sample estimate over the dictionary elements. This observation leads to two fundamental insights into online learning with kernels. First, the eigenvalue spread of the autocorrelation matrix relevant to the hyperplane projection along affine subspace (HYPASS) algorithm is approximately the square root of that for the kernel normalized least mean square (KNLMS) algorithm, which clarifies why the use of a Hilbertian metric yields fast convergence. Second, for efficient function estimation, the dictionary should in general be constructed to reflect the distribution of the input vector, so that the condition above is satisfied. The theoretical results are supported by computer experiments.
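
As a quick, hedged illustration of both insights, the following minimal Python sketch (not from the paper; the Gaussian kernel, its width sigma, the uniform input distribution, and the sizes r and n are all illustrative assumptions) draws a dictionary from the same distribution as the input stream, forms the sample autocorrelation matrix R of the kernelized input vector, and compares it against the scaled squared Gram matrix K^2/r; it then compares the eigenvalue spread (condition number) of K, the matrix relevant under the Hilbertian metric, with that of K^2/r, the matrix relevant for the kernel NLMS update.

# Minimal numerical sketch (illustrative assumptions: Gaussian kernel,
# width sigma, uniform inputs on [-1, 1]^d, dictionary size r, sample size n).
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 2, 20, 20000    # input dimension, dictionary size, number of inputs
sigma = 0.4               # assumed Gaussian-kernel width

def gauss_kernel(X, Y):
    # Pairwise Gaussian kernel: k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

D = rng.uniform(-1.0, 1.0, size=(r, d))   # dictionary drawn from the input law
X = rng.uniform(-1.0, 1.0, size=(n, d))   # input stream from the same law

K = gauss_kernel(D, D)      # Gram matrix of the dictionary
Phi = gauss_kernel(X, D)    # each row is a kernelized input vector kappa(x)
R = Phi.T @ Phi / n         # sample autocorrelation matrix of kappa(x)

# Insight 1 (approximation): R should be close to K^2 / r when the
# dictionary reflects the input distribution.
err = np.linalg.norm(R - K @ K / r) / np.linalg.norm(R)
print(f"relative error ||R - K^2/r||_F / ||R||_F : {err:.3f}")

# Insight 2 (eigenvalue spread): the matrix relevant under the Hilbertian
# metric behaves like K, while the one relevant for kernel NLMS behaves
# like K^2/r, so the former's condition number is about the square root
# of the latter's.
print(f"cond(K)              : {np.linalg.cond(K):.3e}")
print(f"sqrt(cond(K^2 / r))  : {np.sqrt(np.linalg.cond(K @ K / r)):.3e}")

Note that cond(c K^2) = cond(K)^2 holds exactly for any symmetric positive-definite K and scalar c > 0, so the square-root relation between the two eigenvalue spreads follows immediately once R ≈ K^2/r is accepted; the sketch above only probes how good that approximation is when the dictionary is drawn from the input distribution.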

Original language: English
Article number: 7536151
Pages (from-to): 1424-1428
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 23
Issue number: 10
DOI: 10.1109/LSP.2016.2598615
Publication status: Published - 2016 Oct 1

Keywords

  • Kernel adaptive filter
  • online learning
  • reproducing kernel Hilbert space (RKHS)

ASJC Scopus subject areas

  • Signal Processing
  • Applied Mathematics
  • Electrical and Electronic Engineering

Cite this

Why Does a Hilbertian Metric Work Efficiently in Online Learning with Kernels? / Yukawa, Masahiro; Müller, Klaus.

In: IEEE Signal Processing Letters, Vol. 23, No. 10, 7536151, 01.10.2016, p. 1424-1428.

@article{ba7fe8ac97de418584c6994ba661aa1c,
title = "Why Does a Hilbertian Metric Work Efficiently in Online Learning with Kernels?",
keywords = "Kernel adaptive filter, online learning, reproducing kernel Hilbert space (RKHS)",
author = "Masahiro Yukawa and Klaus M{\"u}ller",
year = "2016",
month = "10",
day = "1",
doi = "10.1109/LSP.2016.2598615",
language = "English",
volume = "23",
pages = "1424--1428",
journal = "IEEE Signal Processing Letters",
issn = "1070-9908",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "10",
}
