Approximate tree kernels

Konrad Rieck, Tammo Krueger, Ulf Brefeld, Klaus Müller

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

Convolution kernels for trees provide simple means for learning with tree-structured data. The computation time of tree kernels is quadratic in the size of the trees, since all pairs of nodes need to be compared. Thus, large parse trees, obtained from HTML documents or structured network data, render convolution kernels inapplicable. In this article, we propose an effective approximation technique for parse tree kernels. The approximate tree kernels (ATKs) limit kernel computation to a sparse subset of relevant subtrees and discard redundant structures, such that training and testing of kernel-based learning methods are significantly accelerated. We devise linear programming approaches for identifying such subsets for supervised and unsupervised learning tasks, respectively. Empirically, the approximate tree kernels attain run-time improvements up to three orders of magnitude while preserving the predictive accuracy of regular tree kernels. For unsupervised tasks, the approximate tree kernels even lead to more accurate predictions by identifying relevant dimensions in feature space.
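To make the idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation: a convolution tree kernel that sums a recursive subtree count over all node pairs, plus an optional restriction of the outer sum to nodes whose labels lie in a chosen set, which mimics the ATK idea of limiting computation to a sparse subset of relevant subtrees. The names Node, subtree_count, all_nodes and tree_kernel are invented for illustration, the counting recursion is a simplified Collins-Duffy-style variant, and the label subset is supplied by hand here rather than selected by the linear programs described in the paper.

class Node:
    """Labelled, ordered tree node (hypothetical minimal structure)."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def subtree_count(x, z, lam=1.0):
    """Recursively count shared subtree fragments rooted at x and z
    (simplified Collins-Duffy-style recursion; lam is a decay factor)."""
    if x.label != z.label or len(x.children) != len(z.children):
        return 0.0
    if not x.children:  # matching leaves
        return lam
    prod = lam
    for cx, cz in zip(x.children, z.children):
        prod *= 1.0 + subtree_count(cx, cz, lam)
    return prod


def all_nodes(tree):
    """Yield all nodes of a tree in pre-order."""
    stack = [tree]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children))


def tree_kernel(t1, t2, selected_labels=None, lam=1.0):
    """Convolution tree kernel: sum subtree_count over all node pairs.
    Passing selected_labels restricts the outer sum to nodes whose label
    lies in that set, mimicking the sparse-subset idea behind ATKs."""
    total = 0.0
    for x in all_nodes(t1):
        if selected_labels is not None and x.label not in selected_labels:
            continue
        for z in all_nodes(t2):
            total += subtree_count(x, z, lam)
    return total


# Toy parse trees:  (S (NP a) (VP b))  vs  (S (NP a) (VP c))
t1 = Node("S", [Node("NP", [Node("a")]), Node("VP", [Node("b")])])
t2 = Node("S", [Node("NP", [Node("a")]), Node("VP", [Node("c")])])
print(tree_kernel(t1, t2))                          # exact kernel: all node pairs
print(tree_kernel(t1, t2, selected_labels={"NP"}))  # ATK-style restricted sum

In the unrestricted call every node pair is compared, which is what makes the exact kernel quadratic in tree size; shrinking the outer sum to a small set of selected labels is, in spirit, where the reported run-time improvements come from.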

Original language: English
Pages (from-to): 555-580
Number of pages: 26
Journal: Journal of Machine Learning Research
Volume: 11
Publication status: Published - 2010 Feb 1
Externally published: Yes

Keywords

  • Approximation
  • Convolution kernels
  • Kernel methods
  • Tree kernels

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Cite this

Rieck, K., Krueger, T., Brefeld, U., & Müller, K. (2010). Approximate tree kernels. Journal of Machine Learning Research, 11, 555-580.
