Time series forecasting has become a common problem in day-to-day applications, and various machine learning algorithms have been developed to tackle it. Finding the model that forecasts best on a given dataset can be time-consuming, because multiple algorithms and hyperparameter configurations must be examined. This problem can be addressed with automated machine learning, an approach that automates all steps required to develop a machine learning model, including selecting the best algorithm and hyperparameter configuration. This study develops an automated machine learning pipeline focused on finding the best forecasting model for a given dataset. This includes choosing forecasting algorithms that cover a wide range of tasks and identifying the best method for finding the best model among these algorithms. Finally, the pipeline is tested on a variety of datasets to evaluate its performance on time series data with different characteristics.

Abstract
List of Figures
List of Tables
List of Abbreviations
List of Symbols
1. Introduction
2. Theoretical Background
2.1. Machine Learning
2.2. Automated Machine Learning
2.3. Hyperparameter Optimization
2.3.1. Model-Free Methods
2.3.2. Bayesian Optimization
3. Time Series Forecasting Algorithms
3.1. Time Series Data
3.2. Baselines
3.2.1. Naive Forecast
3.2.2. Moving Average
3.3. Linear Regression
3.4. Autoregression
3.5. SARIMAX
3.6. XGBoost
3.7. LSTM Neural Network
4. Automated Machine Learning Pipeline
4.1. Data Preparation
4.2. Model Selection
4.3. Hyperparameter Optimization Method
4.3.1. Sequential Model-Based Algorithm Configuration
4.3.2. Tree-structured Parzen Estimator
4.3.3. Comparison of Bayesian Optimization Hyperparameter Optimization Methods
4.4. Pipeline Structure
5. Testing on external Datasets
5.1. Beijing PM2.5 Pollution
5.2. Perrin Freres Monthly Champagne Sales
6. Testing on internal Datasets
6.1. Deutsche Telekom Call Count
6.1.1. Comparison of Bayesian Optimization and Random Search
6.2. Deutsche Telekom Call Setup Time
7. Conclusion
Bibliography
A. Details Search Space
B. Pipeline Results - Predictions
C. Pipeline Results - Configurations
D. Pipeline Results - Experiment Details
E. Deutsche Telekom Data Usage Permissions
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:78945 |
Date | 26 April 2022 |
Creators | Rosenberger, Daniel |
Contributors | Hochschule für Technik, Wirtschaft und Kultur |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/acceptedVersion, doc-type:masterThesis, info:eu-repo/semantics/masterThesis, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |