Evaluating window joins over unbounded streams

Jaewoo Kang, Jeffrey F. Naughton, Stratis D. Viglas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

218 Citations (Scopus)

Abstract

We investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected performance of these algorithms. Using the cost model, we propose strategies for maximizing the efficiency of processing joins in three scenarios. First, we consider the case where one stream is much faster than the other. We show that asymmetric combinations of join algorithms (e.g., hash join on one input, nested-loops join on the other) can outperform symmetric join algorithm implementations. Second, we investigate the case where system resources are insufficient to keep up with the input streams. We show that we can maximize the number of join result tuples produced in this case by properly allocating computing resources across the two input streams. Finally, we investigate strategies for maximizing the number of result tuples produced when memory is limited, and show that proper memory allocation across the two input streams can result in significantly lower resource usage and/or more result tuples produced.
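As a rough illustration of the asymmetric combination mentioned in the abstract, the following Python sketch joins two streams over a time-based sliding window, keeping one window as a hash index and the other as a plain queue that is scanned nested-loops style. It is not the authors' implementation: the names `HashWindow`, `ScanWindow`, and `window_join`, the event tuple layout, and the choice of which stream gets which structure are all assumptions made for illustration. The abstract indicates that this choice should follow from the cost model and the relative stream rates, which this sketch does not model.

```python
from collections import defaultdict, deque


class HashWindow:
    """Sliding window indexed by join key; probed with a hash lookup."""

    def __init__(self, span):
        self.span = span
        self.order = deque()             # (ts, key, value) in arrival order
        self.index = defaultdict(deque)  # key -> deque of (ts, value)

    def insert(self, ts, key, value):
        self.order.append((ts, key, value))
        self.index[key].append((ts, value))

    def expire(self, now):
        # Drop tuples that have fallen out of the window.
        while self.order and self.order[0][0] <= now - self.span:
            _, key, _ = self.order.popleft()
            bucket = self.index[key]
            bucket.popleft()  # oldest tuple overall is also oldest for its key
            if not bucket:
                del self.index[key]

    def probe(self, key):
        return [value for _, value in self.index.get(key, ())]


class ScanWindow:
    """Sliding window kept as a plain queue; probed with a full scan."""

    def __init__(self, span):
        self.span = span
        self.order = deque()  # (ts, key, value) in arrival order

    def insert(self, ts, key, value):
        self.order.append((ts, key, value))

    def expire(self, now):
        while self.order and self.order[0][0] <= now - self.span:
            self.order.popleft()

    def probe(self, key):
        return [value for _, k, value in self.order if k == key]


def window_join(events, span):
    """Join streams 'A' and 'B' on key within a window of length `span`.

    `events` yields (ts, stream, key, value) with non-decreasing ts.
    Assumption: A's window is hash-indexed, B's window is scanned.
    """
    windows = {"A": HashWindow(span), "B": ScanWindow(span)}
    results = []
    for ts, stream, key, value in events:
        other = "B" if stream == "A" else "A"
        for window in windows.values():
            window.expire(ts)
        # Probe the opposite window, then store the new tuple in its own.
        for match in windows[other].probe(key):
            results.append((value, match) if stream == "A" else (match, value))
        windows[stream].insert(ts, key, value)
    return results
```

Each arriving tuple first expires stale entries, then probes the opposite window, and only then is stored in its own window, so every result pair is emitted once, by whichever tuple arrived later. Tuples arriving on B pay a hash lookup against A's window, while tuples arriving on A pay a linear scan of B's window.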

Original language: English
Title of host publication: Proceedings - International Conference on Data Engineering
Editors: U. Dayal, K. Ramamritham, T.M. Vijayaraman
Pages: 341-352
Number of pages: 12
DOI: https://doi.org/10.1109/ICDE.2003.1260804
Publication status: Published - 2003 Dec 1
Externally published: Yes
Event: Nineteenth International Conference on Data Engineering - Bangalore, India
Duration: 2003 Mar 5 → 2003 Mar 8

Other

Other: Nineteenth International Conference on Data Engineering
Country: India
City: Bangalore
Period: 03/3/5 → 03/3/8

Fingerprint

Storage allocation (computer)
Costs
Data storage equipment
Processing

ASJC Scopus subject areas

  • Software
  • Engineering(all)
  • Engineering (miscellaneous)

Cite this

Kang, J., Naughton, J. F., & Viglas, S. D. (2003). Evaluating window joins over unbounded streams. In U. Dayal, K. Ramamritham, & T. M. Vijayaraman (Eds.), Proceedings - International Conference on Data Engineering (pp. 341-352) https://doi.org/10.1109/ICDE.2003.1260804

Evaluating window joins over unbounded streams. / Kang, Jaewoo; Naughton, Jeffrey F.; Viglas, Stratis D.

Proceedings - International Conference on Data Engineering. ed. / U. Dayal; K. Ramamritham; T.M. Vijayaraman. 2003. p. 341-352.

Kang, J, Naughton, JF & Viglas, SD 2003, Evaluating window joins over unbounded streams. in U Dayal, K Ramamritham & TM Vijayaraman (eds), Proceedings - International Conference on Data Engineering. pp. 341-352, Nineteenth International Conference on Data Engineering, Bangalore, India, 03/3/5. https://doi.org/10.1109/ICDE.2003.1260804
Kang J, Naughton JF, Viglas SD. Evaluating window joins over unbounded streams. In Dayal U, Ramamritham K, Vijayaraman TM, editors, Proceedings - International Conference on Data Engineering. 2003. p. 341-352 https://doi.org/10.1109/ICDE.2003.1260804
Kang, Jaewoo ; Naughton, Jeffrey F. ; Viglas, Stratis D. / Evaluating window joins over unbounded streams. Proceedings - International Conference on Data Engineering. editor / U. Dayal ; K. Ramamritham ; T.M. Vijayaraman. 2003. pp. 341-352
@inproceedings{382fc8d3ae8642b3a8da64dcadeb8028,
title = "Evaluating window joins over unbounded streams",
abstract = "We investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected performance of these algorithms. Using the cost model, we propose strategies for maximizing the efficiency of processing joins in three scenarios. First, we consider the case where one stream is much faster than the other. We show that asymmetric combinations of join algorithms (e.g., hash join on one input, nested-loops join on the other) can outperform symmetric join algorithm implementations. Second, we investigate the case where system resources are insufficient to keep up with the input streams. We show that we can maximize the number of join result tuples produced in this case by properly allocating computing resources across the two input streams. Finally, we investigate strategies for maximizing the number of result tuples produced when memory is limited, and show that proper memory allocation across the two input streams can result in significantly lower resource usage and/or more result tuples produced.",
author = "Jaewoo Kang and Naughton, {Jeffrey F.} and Viglas, {Stratis D.}",
year = "2003",
month = "12",
day = "1",
doi = "10.1109/ICDE.2003.1260804",
language = "English",
pages = "341--352",
editor = "U. Dayal and K. Ramamritham and T.M. Vijayaraman",
booktitle = "Proceedings - International Conference on Data Engineering",

}

TY - GEN
T1 - Evaluating window joins over unbounded streams
AU - Kang, Jaewoo
AU - Naughton, Jeffrey F.
AU - Viglas, Stratis D.
PY - 2003/12/1
Y1 - 2003/12/1
N2 - We investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected performance of these algorithms. Using the cost model, we propose strategies for maximizing the efficiency of processing joins in three scenarios. First, we consider the case where one stream is much faster than the other. We show that asymmetric combinations of join algorithms (e.g., hash join on one input, nested-loops join on the other) can outperform symmetric join algorithm implementations. Second, we investigate the case where system resources are insufficient to keep up with the input streams. We show that we can maximize the number of join result tuples produced in this case by properly allocating computing resources across the two input streams. Finally, we investigate strategies for maximizing the number of result tuples produced when memory is limited, and show that proper memory allocation across the two input streams can result in significantly lower resource usage and/or more result tuples produced.
AB - We investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected performance of these algorithms. Using the cost model, we propose strategies for maximizing the efficiency of processing joins in three scenarios. First, we consider the case where one stream is much faster than the other. We show that asymmetric combinations of join algorithms (e.g., hash join on one input, nested-loops join on the other) can outperform symmetric join algorithm implementations. Second, we investigate the case where system resources are insufficient to keep up with the input streams. We show that we can maximize the number of join result tuples produced in this case by properly allocating computing resources across the two input streams. Finally, we investigate strategies for maximizing the number of result tuples produced when memory is limited, and show that proper memory allocation across the two input streams can result in significantly lower resource usage and/or more result tuples produced.
UR - http://www.scopus.com/inward/record.url?scp=0344065582&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0344065582&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2003.1260804
DO - 10.1109/ICDE.2003.1260804
M3 - Conference contribution
AN - SCOPUS:0344065582
SP - 341
EP - 352
BT - Proceedings - International Conference on Data Engineering
A2 - Dayal, U.
A2 - Ramamritham, K.
A2 - Vijayaraman, T.M.
ER -