System log files are filled with logged events, status codes, and other messages. By analyzing the log files, the systems current state can be determined, and find out if something during its execution went wrong. Log file analysis has been studied for some time now, where recent studies have shown state-of-the-art performance using machine learning techniques. In this thesis, document classification solutions were tested on log files in order to classify regular system runs versus abnormal system runs. To solve this task, supervised and unsupervised learning methods were combined. Doc2Vec was used to extract document features, and Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) based architectures on the classification task. With the use of the machine learning models and preprocessing techniques the tested models yielded an f1-score and accuracy above 95% when classifying log files.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-166641 |
Date | January 2020 |
Creators | Olin, Per |
Publisher | Linköpings universitet, Institutionen för datavetenskap |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0013 seconds