Low-latency Estimates for Window-Aggregate Queries over Data Streams

Obtaining low-latency results from window-aggregate queries can be critical to certain data-stream processing applications. Because a data stream management system (DSMS) has no control over incoming data (typically because of delays and bursts in data arrival), it cannot deliver timely results for a window-aggregate query over a data stream while also guaranteeing the results' accuracy. In this thesis, I propose a technique, which I term prodding, to obtain early result estimates for window-aggregate queries over data streams. The early estimates are obtained in addition to the regular query results. The proposed technique aims to maximize the contribution to a result-estimate computation from all the stateful operators across a multi-level query plan. I evaluate the benefits of prodding using real-world and generated data streams with different patterns in data arrival and data values. I conclude that, in various DSMS applications, prodding can generate low-latency estimates of window-aggregate query results. The main factors affecting the degree of inaccuracy in such estimates are the aggregate function used in a query, the patterns in arrivals and values of stream data, and the aggressiveness with which estimates are demanded. The utility of the estimates obtained using prodding should be optimized by tuning the aggressiveness of result-estimate demands to the specific latency and accuracy needs of a business, considering any available knowledge about patterns in the incoming data.

Identifier: oai:union.ndltd.org:pdx.edu/oai:pdxscholar.library.pdx.edu:open_access_etds-1160
Date: 01 January 2011
Creators: Bhat, Amit
Publisher: PDXScholar
Source Sets: Portland State University
Detected Language: English
Type: text
Format: application/pdf
Source: Dissertations and Theses
