Forecasting With Feature-Based Time Series Clustering

Time series prediction plays a pivotal role in areas such as finance, weather forecasting, and traffic analysis. In this study, time series of historical sales data from a packaging manufacturer are used to investigate the effect that clustering such data has on forecasting performance. An experiment is carried out in which the time series are first clustered using two separate approaches: k-means and Self-Organizing Map (SOM). The clustering is feature-based, meaning that characteristics extracted from the time series, rather than the raw time series, are used to compute similarity. Then, a set of Long Short-Term Memory models (LSTMs) is trained: one trained on the entire dataset (global model), separate models trained on each of the clusters (cluster-based models), and finally a number of models trained on individual time series that are proportionally sampled from the clusters (single models). By evaluating the LSTMs on Mean Absolute Error (MAE) and Mean Squared Error (MSE), we assess their consistency and predictive potential. The results reveal a trade-off between the consistency and predictive performance of the models. The global LSTM model exhibits more stable performance across all predictions, showcasing its ability to capture the overall patterns in the data. However, the cluster-based LSTM models demonstrate potential for improved predictive performance within specific clusters, albeit with higher variability. This suggests that certain clusters possess distinct characteristics that allow for better predictions within those subsets of the data. Finally, the single LSTM models trained on individual time series show even wider spreads of scores. The analysis suggests that the availability of training data plays a crucial role in the robustness (i.e., the ability to consistently produce similar results) of the forecasting models, with the global model benefiting from a larger training set.
The higher variability in performance seen for the models trained with smaller training sets indicates that certain time series may be easier or harder to predict. It seems that the noise that comes with a larger training set can be either beneficial or detrimental to the predictive performance of the forecasting model on any individual time series, depending on the characteristics of that particular sample. Further analysis is required to investigate the factors contributing to the varying performance within each cluster. Exploring feature scores associated with poorly performing clusters and identifying the key features that contribute to better performance in certain clusters could provide valuable insights. Understanding these factors might aid in developing tailored strategies for cluster-specific prediction tasks.
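
As a rough illustration of the feature-based clustering step and the two error metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the thesis's implementation: the four features (mean, standard deviation, trend slope, lag-1 autocorrelation), the farthest-first initialisation, the function names, and the synthetic series are all assumptions made for this example. The SOM variant and the LSTM models are not shown.

```python
import math
from statistics import mean, pstdev


def extract_features(series):
    """Summarise one series as [mean, std, trend slope, lag-1 autocorrelation].

    This feature set is an illustrative assumption, not the thesis's.
    """
    n = len(series)
    mu = mean(series)
    # Least-squares slope of the values against the time index 0..n-1.
    t_mu = (n - 1) / 2
    slope = sum((t - t_mu) * (x - mu) for t, x in enumerate(series)) / sum(
        (t - t_mu) ** 2 for t in range(n)
    )
    # Lag-1 autocorrelation; taken as 0 for a constant series.
    var = sum((x - mu) ** 2 for x in series)
    lagged = sum((a - mu) * (b - mu) for a, b in zip(series, series[1:]))
    return [mu, pstdev(series), slope, lagged / var if var else 0.0]


def standardise(rows):
    """Z-score each feature column so no single feature dominates distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-first initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        if new == labels:
            break  # assignments stable: converged
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Synthetic stand-in data (not the thesis's sales data).
trending = [[float(t + offset) for t in range(20)] for offset in (0, 1, 2)]
oscillating = [[10 + amp * (-1) ** t for t in range(20)] for amp in (1.0, 1.5, 2.0)]
features = standardise([extract_features(s) for s in trending + oscillating])
labels = kmeans(features, k=2)
print(labels)
print(mae([1, 2, 3], [1, 2, 5]), mse([1, 2, 3], [1, 2, 5]))
```

Clustering on extracted features rather than raw values means series of different lengths or scales can still be compared. In a setup like the one described, the resulting labels would decide which series are pooled to train each cluster-based model.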

http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-61877

demand forecasting

time series prediction

time series clustering

Computer Sciences

Datavetenskap (datalogi)

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:hj-61877
Date	January 2023
Creators	Tingström, Conrad; Åkerblom Svensson, Johan
Publisher	Jönköping University, Tekniska Högskolan
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
