XQStream++: Fast tuple extraction algorithm for streaming XML data

Byung Gul Ryu, Jongwoo Ha, Sang-Geun Lee

Research output: Contribution to journal › Article

Abstract

Tuple extraction from streaming XML should be cost-effective for real-time query evaluation. StreamTX has recently shown good performance in both running time and memory usage when supporting tuple extraction queries over streaming XML. However, we empirically observe that StreamTX incurs unnecessary computational overhead because it builds on TwigStack, an XML query processing algorithm originally developed for stored XML. In this paper, we first design a non-recursive algorithm, XQStream, to avoid the inefficient recursive calls of StreamTX. We then extend the basic XQStream with two novel schemes: (1) the relational pointer, which efficiently evaluates the structural relationships between elements, and (2) pattern reuse, which reduces redundant path evaluations during pattern matching. The performance evaluation on various datasets yields two new empirical findings. First, XQStream++, which incorporates the relational pointer and pattern reuse into XQStream, significantly outperforms the state-of-the-art algorithms in running time while using a small, nearly constant amount of memory. Second, the most recently released XQuery engines outperform StreamTX in running time.
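
The abstract does not spell out the algorithms, so the following is not XQStream or XQStream++. It is only a minimal sketch of the general idea the abstract names: extracting tuples from an XML stream in a single pass, with an explicit stack of open elements in place of recursion. The function name `extract_tuples`, its parameters `anchor` and `fields`, and the sample document are hypothetical, and the sketch assumes that anchor elements do not nest and that field names differ from the anchor name.

```python
import io
import xml.etree.ElementTree as ET


def extract_tuples(source, anchor, fields):
    """Yield one tuple per `anchor` element (hypothetical helper).

    Each tuple holds the text of the direct children named in `fields`,
    or None where a child is missing.
    """
    stack = []      # tags of currently open elements; stands in for recursion
    current = None  # partially built tuple for the open anchor element
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            if elem.tag == anchor:
                current = {}
        else:
            stack.pop()
            # After the pop, the top of the stack is this element's parent.
            is_field = elem.tag in fields and stack and stack[-1] == anchor
            if current is not None and is_field:
                # Keep only the first occurrence of each field.
                current.setdefault(elem.tag, (elem.text or "").strip())
            elif elem.tag == anchor:
                yield tuple(current.get(name) for name in fields)
                current = None
                elem.clear()  # release the finished element's children and text


sample = (
    b"<lib>"
    b"<book><title>A</title><author>X</author></book>"
    b"<book><title>B</title></book>"
    b"</lib>"
)
rows = list(extract_tuples(io.BytesIO(sample), "book", ("title", "author")))
```

The parent-child check here is a plain comparison against the top of the stack. The relational pointer and pattern reuse schemes described in the abstract are not modelled.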

Original language: English
Pages (from-to): 311-326
Number of pages: 16
Journal: Information Sciences
Volume: 314
DOIs: 10.1016/j.ins.2014.06.041
Publication status: Published - 2015 Sep 1


Keywords

  • Pattern reuse
  • Relational pointer
  • Streaming XML
  • Tuple extraction

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management

Cite this

XQStream++: Fast tuple extraction algorithm for streaming XML data. / Ryu, Byung Gul; Ha, Jongwoo; Lee, Sang-Geun.

In: Information Sciences, Vol. 314, 01.09.2015, p. 311-326.


@article{7cfc8b51051e4904830773bfaad3ddd0,
title = "XQStream++: Fast tuple extraction algorithm for streaming XML data",
keywords = "Pattern reuse, Relational pointer, Streaming XML, Tuple extraction",
author = "Ryu, {Byung Gul} and Jongwoo Ha and Sang-Geun Lee",
year = "2015",
month = "9",
day = "1",
doi = "10.1016/j.ins.2014.06.041",
language = "English",
volume = "314",
pages = "311--326",
journal = "Information Sciences",
issn = "0020-0255",
publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - XQStream++

T2 - Fast tuple extraction algorithm for streaming XML data

AU - Ryu, Byung Gul

AU - Ha, Jongwoo

AU - Lee, Sang-Geun

PY - 2015/9/1

Y1 - 2015/9/1


KW - Pattern reuse

KW - Relational pointer

KW - Streaming XML

KW - Tuple extraction

UR - http://www.scopus.com/inward/record.url?scp=84929289987&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84929289987&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2014.06.041

DO - 10.1016/j.ins.2014.06.041

M3 - Article

AN - SCOPUS:84929289987

VL - 314

SP - 311

EP - 326

JO - Information Sciences

JF - Information Sciences

SN - 0020-0255

ER -