Return to search

Machine learning to detect anomalies in datacenter

This thesis investigates the possibility of using anomaly detection on performance data of virtual servers in a datacenter to detect malfunctioning servers. Using anomaly detection can potentially reduce the time a server is malfunctioning, as the server can be detected and checked before the error has a significant impact. Several approaches and methods were applied and evaluated on one virtual server: the K-nearest neighbor algorithm, the support-vector machine, the K-means clustering algorithm, self-organizing maps, CPU-memory usage ratio using a Gaussian model, and time series analysis using neural network and linear regression. The evaluation and comparison of the methods were mainly based on reported errors during the time period they were tested. The better the detected anomalies matched the reported errors the higher score they received. It turned out that anomalies in performance data could be linked to real errors in the server to some extent. This enables the possibility of using anomaly detection on performance data as a way to detect malfunctioning servers. The most simple method, looking at the ratio between memory usage and CPU, was the most successful one, detecting most errors. However the anomalies were often detected just after the error had been reported. Support vector machine were more successful at detecting anomalies before they were reported. The proportion of anomalies played a big role however and K-nearest neighbor received higher score when having a higher proportion of anomalies.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-388616
Date January 2019
CreatorsLindh, Filip
PublisherUppsala universitet, Avdelningen för systemteknik
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess
RelationUPTEC F, 1401-5757 ; 19040

Page generated in 0.0018 seconds