Global ETD Search

Evaluating machine learning strategies for classification of large-scale Kubernetes cluster logs

Kubernetes is a free, open-source container orchestration system for deploying and managing Docker containers that host microservices. Its cluster logs are extremely helpful in determining the root cause of a failure. However, as systems become more complex, locating failures becomes more difficult and time-consuming. This study aims to identify classification algorithms that classify the given log data accurately while requiring fewer computational resources. Because the data set is large, we begin with expert-based feature selection to reduce its size. TF-IDF feature extraction is then performed, and finally we compare five classification algorithms (SVM, KNN, random forest, gradient boosting, and MLP) using several metrics. The results show that random forest produces good accuracy while requiring fewer computational resources than the other algorithms.
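
The TF-IDF step named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the log lines are invented for illustration, the `tfidf` helper is a hypothetical name, and the weighting uses the plain textbook form (term frequency times log of inverse document frequency), which may differ from the variant the author used.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting (textbook form, not confirmed by the thesis):
    tf = term count / tokens in the document, idf = ln(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return vectors

# Hypothetical log messages, made up for this example.
logs = [
    "pod failed to start image pull error",
    "pod started successfully",
    "node not ready kubelet stopped posting status",
]
vectors = tfidf(logs)
```

In the second message, "started" receives a higher weight than "pod" because "pod" also occurs in another message and is therefore less distinctive. Library implementations such as scikit-learn's `TfidfVectorizer` apply a smoothed, normalized variant by default, so their values would differ from this sketch.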

http://urn.kb.se/resolve?urn=urn:nbn:se:bth-23934

Kubernetes logs

feature selection

feature extraction

multi-class classification

computational cost

Computer Sciences

Datavetenskap (datalogi)

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:bth-23934
Date	January 2022
Creators	Sarika, Pawan
Publisher	Blekinge Tekniska Högskola, Institutionen för datavetenskap
Source Sets	DiVA Archive at Upsalla University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
