TY - JOUR
T1 - XQStream++
T2 - Fast tuple extraction algorithm for streaming XML data
AU - Ryu, Byung Gul
AU - Ha, Jongwoo
AU - Lee, Sang-Geun
N1 - Funding Information:
This work was supported in part by the Basic Science Research Program and the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (Nos. NRF-2013R1A1A2061163 and 2012M3C4A7033344).
Publisher Copyright:
© 2014 Elsevier Inc. All rights reserved.
PY - 2015/9/1
Y1 - 2015/9/1
N2 - Tuple extraction from streaming XML should be cost-effective to support real-time query evaluation. StreamTX has recently shown good performance in both running time and memory usage for tuple extraction queries over streaming XML. However, we empirically observe that StreamTX incurs unnecessary computational overhead because it builds on TwigStack, an XML query processing algorithm originally developed for stored XML. In this paper, we first design XQStream, a non-recursive algorithm that avoids the inefficient recursive calls of StreamTX. We then extend the basic XQStream with two novel schemes: (1) the relational pointer, which efficiently evaluates the structural relationships between elements, and (2) pattern reuse, which reduces redundant path evaluations during pattern matching. Performance evaluation on various datasets yields two new empirical findings. First, XQStream++, which incorporates the relational pointer and pattern reuse schemes into XQStream, significantly outperforms the state-of-the-art algorithms in running time while keeping memory usage small and nearly constant. Second, the most recently released XQuery engines outperform StreamTX in running time.
AB - Tuple extraction from streaming XML should be cost-effective to support real-time query evaluation. StreamTX has recently shown good performance in both running time and memory usage for tuple extraction queries over streaming XML. However, we empirically observe that StreamTX incurs unnecessary computational overhead because it builds on TwigStack, an XML query processing algorithm originally developed for stored XML. In this paper, we first design XQStream, a non-recursive algorithm that avoids the inefficient recursive calls of StreamTX. We then extend the basic XQStream with two novel schemes: (1) the relational pointer, which efficiently evaluates the structural relationships between elements, and (2) pattern reuse, which reduces redundant path evaluations during pattern matching. Performance evaluation on various datasets yields two new empirical findings. First, XQStream++, which incorporates the relational pointer and pattern reuse schemes into XQStream, significantly outperforms the state-of-the-art algorithms in running time while keeping memory usage small and nearly constant. Second, the most recently released XQuery engines outperform StreamTX in running time.
KW - Pattern reuse
KW - Relational pointer
KW - Streaming XML
KW - Tuple extraction
UR - http://www.scopus.com/inward/record.url?scp=84929289987&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2014.06.041
DO - 10.1016/j.ins.2014.06.041
M3 - Article
AN - SCOPUS:84929289987
SN - 0020-0255
VL - 314
SP - 311
EP - 326
JO - Information Sciences
JF - Information Sciences
ER -