Auto-scaling Prediction using Machine Learning Algorithms: Analysing Performance and Feature Correlation

Despite its drawbacks, Covid-19 has highlighted the significance of cloud computing. The great majority of enterprises and organisations have shifted to a hybrid mode that lets users and workers access their work environment from any location. This has allowed businesses to save on on-premises costs by moving their operations to the cloud. As a result, it has become essential to allocate resources effectively, especially through predictive auto-scaling. Although many algorithms have been studied for predictive auto-scaling, further analysis and validation are needed. The primary objective of this thesis is to implement machine-learning algorithms for predicting auto-scaling and to compare their performance on common grounds. The secondary objective is to find connections amongst features within the dataset and evaluate their correlation coefficients. The methodology adopted for this thesis is experimentation. Experimentation was selected so that the auto-scaling algorithms can be tested in practical situations and their results compared, using the selected metrics, to identify the best algorithm. The experiment can also help determine whether the algorithms operate as predicted. Metrics such as Accuracy, F1-Score, Precision, Recall, Training Time and Root Mean Square Error (RMSE) are calculated for the chosen algorithms: Random Forest (RF), Logistic Regression, Support Vector Machine and Naive Bayes Classifier. The correlation coefficients of the features in the data are also measured, which helped increase the accuracy of the machine-learning model. In conclusion, the features related to our target variable (CPU usage, p95_scaling) often had high correlation coefficients compared to other features. The relationships between these variables could potentially be influenced by other variables that are unrelated to the target variable. The experimentation also shows that the optimal algorithm for determining how cloud resources should be scaled is the Random Forest Classifier.
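As an illustration of the metrics named in the abstract, the following is a minimal sketch of how Accuracy, Precision, Recall, F1-Score, RMSE and a Pearson correlation coefficient can be computed for a binary "scale / do not scale" prediction. It is not the thesis's code: the function names, the label encoding (1 = scale) and the toy values are assumptions made only for this example, and the abstract does not state which correlation coefficient was used.

```python
import math

def classification_metrics(y_true, y_pred, positive=1):
    """Hypothetical helper: metrics for binary labels (1 = scale, 0 = do not scale)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # RMSE computed on the numeric label values themselves.
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "rmse": rmse}

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Toy values, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)

cpu_usage = [10.0, 20.0, 30.0, 40.0]   # hypothetical feature column
p95_scaling = [1.0, 2.0, 3.0, 4.0]     # hypothetical target column
r = pearson(cpu_usage, p95_scaling)
```

In a comparison such as the one described, the same metric function would be applied to the predictions of each classifier on a shared test split, so that the results are comparable on common grounds.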

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:bth-25500
Date: January 2023
Creators: Ahmed, Syed Saif; Arepalli, Harshini Devi
Publisher: Blekinge Tekniska Högskola, Institutionen för datavetenskap
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess