Return to search

Time Series Similarity Search in Distributed Key-Value Data Stores Using R-Trees

Time series data are sequences of data points collected at certain time intervals. The advance in mobile and sensor technologies has led to rapid growth in the available amount of time series data. The ability to search large time series data sets can be extremely useful in many applications. In healthcare, a system monitoring vital signals can perform a search against the past data and identify possible health threatening conditions. In engineering, a system can analyze performances of complicated equipment and identify possible failure situations or needs of maintenance based on historical data.
Existing search methods for time series data are limited in many ways. Systems utilizing memory-bound or disk-bound indexes are restricted by the resources of a single machine or hard drive. Systems that do not use indexes must search through the entire database whenever a search is requested.
The proposed system uses multidimensional index in the distributed storage environment to break the bound of one physical machine and allow for high data scalability. Utilizing an index allows the system to locate the patterns similar to the query without having to examine the entire dataset, which can significantly reduce the amount of computing resources required. The system uses an Apache HBase distributed key-value database to store the index and time series data across a cluster of machines. Evaluations were conducted to examine the system’s performance using synthesized data up to 30 million data points. The evaluation results showed that, despite some drawbacks inherited from an R-tree data structure, the system can efficiently search and retrieve patterns in large time series datasets.

Identiferoai:union.ndltd.org:unf.edu/oai:digitalcommons.unf.edu:etd-1595
Date01 January 2015
CreatorsCharapko, Aleksey
PublisherUNF Digital Commons
Source SetsUniversity of North Florida
Detected LanguageEnglish
Typetext
Formatapplication/pdf
SourceUNF Theses and Dissertations

Page generated in 0.0021 seconds