Forecasting Large-scale Time Series Data

The forecasting of time series data is an integral component of management, planning, and decision making in many domains. The prediction of electricity demand and supply in the energy domain and of sales figures in market research are just two of the many application scenarios that require accurate forecasts. Many of these domains are influenced by the Big Data trend, which also affects time series forecasting. Data sets consist of thousands of temporally fine-grained time series that have to be predicted in reasonable time. The time series may suffer from noisy behavior and missing values, which makes modeling them especially hard. Nonetheless, accurate predictions are required. Furthermore, data sets from different domains exhibit different characteristics, so forecast techniques have to be flexible and adaptable to these characteristics.

Long-established forecast techniques like ARIMA and Exponential Smoothing do not fulfill these new requirements. Most traditional models represent only one individual time series. This makes the prediction of thousands of time series very time consuming, as an equally large number of models has to be created. Furthermore, these models do not incorporate additional data sources and are therefore not capable of compensating for missing measurements or noisy behavior of individual time series.

In this thesis, we introduce CSAR (Cross-Sectional AutoRegression Model), a new forecast technique designed to address the new requirements for forecasting large-scale time series data. It is based on the novel concept of cross-sectional forecasting, which assumes that time series from the same domain follow a similar behavior and represents many time series with one common model. CSAR combines this new approach with the modeling concept of ARIMA to make the model adaptable to the various properties of data sets from different domains. Furthermore, we introduce auto.CSAR, which helps to configure the model and to choose the right model components for a specific data set and forecast task.
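The idea of representing many time series with one common model can be illustrated with a minimal sketch. This is not the CSAR model from the thesis, whose components are not specified in this abstract. It is a pooled first-order autoregression without intercept, chosen here purely as an assumption for illustration, and the function names `fit_pooled_ar1` and `forecast_next` are hypothetical. A single coefficient is estimated by least squares from the observation pairs of all series, and pairs that touch a missing value are skipped.

```python
import math


def fit_pooled_ar1(series_list):
    """Estimate one AR(1) coefficient shared by all series.

    Illustrative assumption, not the thesis's CSAR: least squares without
    intercept over all consecutive pairs (y[t-1], y[t]) of every series.
    Missing values are encoded as NaN; pairs containing one are skipped.
    """
    num = 0.0
    den = 0.0
    for y in series_list:
        for prev, cur in zip(y, y[1:]):
            if math.isnan(prev) or math.isnan(cur):
                continue  # this pair cannot contribute to the estimate
            num += prev * cur
            den += prev * prev
    if den == 0.0:
        raise ValueError("no usable observation pairs")
    return num / den


def forecast_next(y, phi):
    """One-step-ahead forecast; assumes the last value of y is observed."""
    return phi * y[-1]


if __name__ == "__main__":
    nan = float("nan")
    data = [
        [8.0, 4.0, 2.0, 1.0],
        [10.0, 5.0, nan, 1.25, 0.625],  # one missing measurement
        [3.0, 1.5],                      # very short history
    ]
    phi = fit_pooled_ar1(data)
    for y in data:
        print(forecast_next(y, phi))
```

Because the coefficient is fitted once on the pooled data, only one model has to be estimated regardless of the number of series, and a series with a gap or a very short history is still forecast with a coefficient learned from all the others.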

With CSAR, we present a new forecast technique that is suited to the prediction of large-scale time series data. By representing many time series with one model, large data sets can be predicted in a short time. Furthermore, using data from many time series in one model helps to compensate for missing values and noisy behavior of individual series. The evaluation on three real-world data sets shows that CSAR outperforms long-established forecast techniques in both accuracy and execution time. Finally, auto.CSAR provides a way to apply CSAR to new data sets without requiring the user to have extensive knowledge of the new forecast technique and its configuration.

Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:32313
Date: 03 December 2018
Creators: Hartmann, Claudio
Contributors: Lehner, Wolfgang; Günnemann, Stephan; Technische Universität Dresden
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: English
Detected Language: English
Type: doc-type:doctoralThesis, info:eu-repo/semantics/doctoralThesis, doc-type:Text
Rights: info:eu-repo/semantics/openAccess