### Abstract

We investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected performance of these algorithms. Using the cost model, we propose strategies for maximizing the efficiency of processing joins in three scenarios. First, we consider the case where one stream is much faster than the other. We show that asymmetric combinations of join algorithms (e.g., hash join on one input, nested-loops join on the other) can outperform symmetric join algorithm implementations. Second, we investigate the case where system resources are insufficient to keep up with the input streams. We show that we can maximize the number of join result tuples produced in this case by properly allocating computing resources across the two input streams. Finally, we investigate strategies for maximizing the number of result tuples produced when memory is limited, and show that proper memory allocation across the two input streams can result in significantly lower resource usage and/or more result tuples produced.
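As a rough illustration of the asymmetric combination mentioned in the abstract, the following is a minimal sketch of a time-based sliding window join in which one window is probed through a hash index and the other by a nested-loops scan. It is not the authors' implementation: the class and function names (`ListWindow`, `HashWindow`, `window_join`), the tuple layout, and the expiry rule (a tuple leaves the window once it is `span` or more time units old) are all assumptions made for the example. The abstract does not say which stream should get which algorithm, so the assignment below is arbitrary.

```python
from collections import defaultdict, deque


class ListWindow:
    """Hypothetical window that is probed by a nested-loops scan."""

    def __init__(self, span):
        self.span = span
        self.items = deque()  # (timestamp, key, payload) in arrival order

    def expire(self, now):
        # Drop tuples that are `span` or more time units old.
        while self.items and self.items[0][0] <= now - self.span:
            self.items.popleft()

    def insert(self, ts, key, payload):
        self.items.append((ts, key, payload))

    def probe(self, key):
        # Nested-loops probe: scan every tuple in the window.
        return [p for (_, k, p) in self.items if k == key]


class HashWindow(ListWindow):
    """Hypothetical window that is probed through a hash index on the key."""

    def __init__(self, span):
        super().__init__(span)
        self.index = defaultdict(deque)  # key -> payloads in arrival order

    def expire(self, now):
        while self.items and self.items[0][0] <= now - self.span:
            _, key, _ = self.items.popleft()
            bucket = self.index[key]
            bucket.popleft()  # oldest entry for this key is the expired one
            if not bucket:
                del self.index[key]

    def insert(self, ts, key, payload):
        super().insert(ts, key, payload)
        self.index[key].append(payload)

    def probe(self, key):
        # Hash probe: look only at tuples with a matching key.
        return list(self.index.get(key, ()))


def window_join(events, window_a, window_b):
    """Join two streams; `events` holds (ts, stream, key, payload) in
    timestamp order, where stream is "A" or "B"."""
    results = []
    for ts, stream, key, payload in events:
        own, other = (window_a, window_b) if stream == "A" else (window_b, window_a)
        other.expire(ts)
        own.expire(ts)
        for match in other.probe(key):
            results.append((payload, match) if stream == "A" else (match, payload))
        own.insert(ts, key, payload)
    return results


events = [
    (1, "A", "x", "a1"),
    (2, "B", "x", "b1"),
    (4, "A", "x", "a2"),
    (8, "B", "x", "b2"),
]
# Stream A's window is hash-indexed; stream B's window is scanned.
results = window_join(events, HashWindow(5), ListWindow(5))
print(results)  # [('a1', 'b1'), ('a2', 'b1'), ('a2', 'b2')]
```

Both window types return the same matches; they differ only in probe cost, which is the kind of trade-off a cost model would weigh when one stream arrives much faster than the other.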

Original language | English |
---|---|
Title of host publication | Proceedings - International Conference on Data Engineering |
Editors | U. Dayal, K. Ramamritham, T.M. Vijayaraman |
Pages | 341-352 |
Number of pages | 12 |
DOIs | https://doi.org/10.1109/ICDE.2003.1260804 |
Publication status | Published - 2003 Dec 1 |
Externally published | Yes |
Event | Nineteenth International Conference on Data Engineering - Bangalore, India. Duration: 2003 Mar 5 → 2003 Mar 8 |

### Other

Other | Nineteenth International Conference on Data Engineering |
---|---|
Country | India |
City | Bangalore |
Period | 2003 Mar 5 → 2003 Mar 8 |

### ASJC Scopus subject areas

- Software
- Engineering (all)
- Engineering (miscellaneous)

### Cite this

*Proceedings - International Conference on Data Engineering* (pp. 341-352). https://doi.org/10.1109/ICDE.2003.1260804

**Evaluating window joins over unbounded streams.** / Kang, Jaewoo; Naughton, Jeffrey F.; Viglas, Stratis D.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

*Proceedings - International Conference on Data Engineering.* pp. 341-352, Nineteenth International Conference on Data Engineering, Bangalore, India, 2003 Mar 5. https://doi.org/10.1109/ICDE.2003.1260804

TY - GEN

T1 - Evaluating window joins over unbounded streams

AU - Kang, Jaewoo

AU - Naughton, Jeffrey F.

AU - Viglas, Stratis D.

PY - 2003/12/1

Y1 - 2003/12/1

AB - We investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected performance of these algorithms. Using the cost model, we propose strategies for maximizing the efficiency of processing joins in three scenarios. First, we consider the case where one stream is much faster than the other. We show that asymmetric combinations of join algorithms (e.g., hash join on one input, nested-loops join on the other) can outperform symmetric join algorithm implementations. Second, we investigate the case where system resources are insufficient to keep up with the input streams. We show that we can maximize the number of join result tuples produced in this case by properly allocating computing resources across the two input streams. Finally, we investigate strategies for maximizing the number of result tuples produced when memory is limited, and show that proper memory allocation across the two input streams can result in significantly lower resource usage and/or more result tuples produced.

UR - http://www.scopus.com/inward/record.url?scp=0344065582&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0344065582&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2003.1260804

DO - 10.1109/ICDE.2003.1260804

M3 - Conference contribution

AN - SCOPUS:0344065582

SP - 341

EP - 352

BT - Proceedings - International Conference on Data Engineering

A2 - Dayal, U.

A2 - Ramamritham, K.

A2 - Vijayaraman, T.M.

ER -