11 |
Sample-Based Forecasting Exploiting Hierarchical Time Series
Fischer, Ulrike; Rosenthal, Frank; Lehner, Wolfgang, 16 September 2022
Time series forecasting is challenging because sophisticated forecast models are computationally expensive to build. Recent research has addressed the integration of forecasting inside a DBMS. One main benefit is that models can be created once and then used repeatedly to answer forecast queries. Forecast queries are often submitted at higher aggregation levels, e.g., forecasts of sales over all locations. There are two ways to answer such a query. First, we can aggregate all base time series (sales in Austria, sales in Belgium, ...) and create a single model for the aggregate time series. Second, we can create models for all base time series and aggregate the base forecast values. The second option might lead to higher accuracy, but it is usually too expensive because of the large number of base time series. However, we do not actually need all base models to achieve high accuracy; a sample of base models is enough. With this approach, we still achieve better accuracy than an aggregate model, very similar to that of using all models, but we need to create and maintain fewer models in the database. We further improve this approach for the case in which new actual values of the base time series arrive at different points in time. With each new actual value, we can refine the aggregate forecast and eventually converge towards the real actual value. Our experimental evaluation, using several real-world data sets, shows a high accuracy of our approaches and a fast convergence towards the optimal value with increasing sample sizes and an increasing number of actual values, respectively.
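The three ways of answering an aggregate forecast query described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the moving-average stand-in model, and in particular the scaling of the sampled sum by its historical share of the total are all assumptions made for illustration. Note that with a linear stand-in model such as this one, the aggregate model and the sum of all base models coincide; the accuracy differences discussed in the abstract only appear with more sophisticated models.

```python
import random

def naive_forecast(series, window=3):
    """Stand-in forecast model (assumption): mean of the last `window` values."""
    tail = series[-window:]
    return sum(tail) / len(tail)

def aggregate_model_forecast(base_series):
    """Option 1: aggregate all base series, then build one model."""
    length = len(next(iter(base_series.values())))
    total = [sum(s[t] for s in base_series.values()) for t in range(length)]
    return naive_forecast(total)

def all_base_models_forecast(base_series):
    """Option 2: one model per base series, then sum the base forecasts."""
    return sum(naive_forecast(s) for s in base_series.values())

def sample_based_forecast(base_series, sample_size, rng):
    """Sample variant: model only a sample of base series and scale up.

    Scaling by the sample's share of the historical total is an
    assumption of this sketch, not a method taken from the paper.
    """
    names = rng.sample(sorted(base_series), sample_size)
    sampled_forecast = sum(naive_forecast(base_series[n]) for n in names)
    sampled_history = sum(sum(base_series[n]) for n in names)
    total_history = sum(sum(s) for s in base_series.values())
    return sampled_forecast * total_history / sampled_history

# Hypothetical sales data per location.
sales = {
    "Austria": [10, 12, 11, 13],
    "Belgium": [20, 19, 21, 22],
    "Switzerland": [5, 6, 5, 7],
}
rng = random.Random(0)
print(aggregate_model_forecast(sales))
print(all_base_models_forecast(sales))
print(sample_based_forecast(sales, 2, rng))
```

In this sketch only `sample_size` models have to be built and maintained, which is the cost saving the abstract refers to; when the sample covers every base series, the result equals the sum of all base forecasts.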
|
12 |
A System Architecture for the Monitoring of Continuous Phenomena by Sensor Data Streams
Lorkowski, Peter, 15 March 2019
The monitoring of continuous phenomena such as temperature, air pollution, precipitation, and soil moisture is of growing importance. Decreasing costs for sensors and associated infrastructure increase the availability of observational data. These data can only rarely be used directly for analysis; they need to be interpolated to cover a region in space and/or time without gaps. The objective of monitoring in a broader sense is therefore to provide data about the observed phenomenon in such an enhanced form. Notwithstanding the improvements in information and communication technology, monitoring always has to function under limited resources, namely: number of sensors, number of observations, computational capacity, time, data bandwidth, and storage space. To best exploit those limited resources, a monitoring system needs to strive for efficiency concerning sampling, hardware, algorithms, parameters, and storage formats. In that regard, this work proposes and evaluates solutions for several problems associated with the monitoring of continuous phenomena. Synthetic random fields can serve as reference models on which monitoring can be simulated and exactly evaluated. For this purpose, a generator is introduced that can create such fields with arbitrary dynamism and resolution. For efficient sampling, an estimator for the minimum density of observations is derived from the extension and dynamism of the observed field. In order to adapt the interpolation to the given observations, a generic algorithm for the fitting of kriging parameters is set out. A sequential model merging algorithm based on the kriging variance is introduced to mitigate large workloads and also to support subsequent and seamless updates of real-time models by new observations. For efficient storage utilization, a compression method is suggested. It is designed for the specific structure of field observations and supports progressive decompression.
The unlimited diversity of possible configurations of the features above calls for an integrated approach for systematic variation and evaluation. A generic tool for organizing and manipulating configurational elements in arbitrarily complex hierarchical structures is proposed. Besides the root mean square error (RMSE) as the crucial quality indicator, the computational workload is also quantified in a manner that allows an analytical estimation of execution time for different parallel environments. In summary, a powerful framework for the monitoring of continuous phenomena is outlined. With its tools for systematic variation and evaluation, it supports continuous efficiency improvement.
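The idea of simulating monitoring on a synthetic reference field and scoring the interpolation by RMSE can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field function is a hypothetical smooth surface rather than the generator from the thesis, and inverse distance weighting is swapped in for kriging to keep the example short. All names and parameters are invented for this sketch.

```python
import math
import random

def synthetic_field(x, y):
    """Hypothetical smooth reference field (not the thesis generator)."""
    return math.sin(x) + math.cos(0.5 * y)

def idw(observations, x, y, power=2.0):
    """Inverse distance weighting, used here in place of kriging."""
    num = 0.0
    den = 0.0
    for ox, oy, value in observations:
        d2 = (x - ox) ** 2 + (y - oy) ** 2
        if d2 == 0.0:
            return value  # query point coincides with an observation
        w = 1.0 / d2 ** (power / 2.0)
        num += w * value
        den += w
    return num / den

def rmse(predicted, actual):
    """Root mean square error between two equally long sequences."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

def evaluate(n_obs, rng, grid=20, extent=6.0):
    """Sample the field at random positions, interpolate on a grid, return RMSE."""
    obs = []
    for _ in range(n_obs):
        x, y = rng.uniform(0, extent), rng.uniform(0, extent)
        obs.append((x, y, synthetic_field(x, y)))
    step = extent / (grid - 1)
    points = [(i * step, j * step) for i in range(grid) for j in range(grid)]
    predicted = [idw(obs, x, y) for x, y in points]
    actual = [synthetic_field(x, y) for x, y in points]
    return rmse(predicted, actual)

rng = random.Random(42)
for n in (10, 50, 200):
    print(n, evaluate(n, rng))
```

Because the reference field is known everywhere, the error can be computed exactly at every grid point, which is what makes systematic variation of parameters such as the number of observations possible.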
|