Return to search

Real-time data stream clustering over sliding windows

In many applications, e.g. urban traffic monitoring, stock trading, and industrial sensor data monitoring, clustering algorithms are applied on data streams in real-time to find current patterns. Here, sliding windows are commonly used as they capture concept drift. Real-time clustering over sliding windows is early detection of continuously evolving clusters as soon as they occur in the stream, which requires efficient maintenance of cluster memberships that change as windows slide. Data stream management systems (DSMSs) provide high-level query languages for searching and analyzing streaming data. In this thesis we extend a DSMS with a real-time data stream clustering framework called Generic 2-phase Continuous Summarization framework (G2CS).  G2CS modularizes data stream clustering by taking as input clustering algorithms which are expressed in terms of a number of functions and indexing structures. G2CS supports real-time clustering by efficient window sliding mechanism and algorithm transparent indexing. A particular challenge for real-time detection of a high number of rapidly evolving clusters is efficiency of window slides for clustering algorithms where deletion of expired data is not supported, e.g. BIRCH. To that end, G2CS includes a novel window maintenance mechanism called Sliding Binary Merge (SBM). To further improve real-time sliding performance, G2CS uses generation-based multi-dimensional indexing where indexing structures suitable for the clustering algorithms can be plugged-in.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-302799
Date January 2016
CreatorsBadiozamany, Sobhan
PublisherUppsala universitet, Avdelningen för datalogi, Uppsala universitet, Datalogi, Uppsala
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeDoctoral thesis, comprehensive summary, info:eu-repo/semantics/doctoralThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess
RelationDigital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, 1651-6214 ; 1431

Page generated in 0.0026 seconds