Return to search

Clustering and Anomaly detection using Medical Enterprise system Logs (CAMEL) / Klustring av och anomalidetektering på systemloggar

Research on automated anomaly detection in complex systems by using log files has been on an upswing with the introduction of new deep-learning natural language processing methods. However, manually identifying and labelling anomalous logs is time-consuming, error-prone, and labor-intensive. This thesis instead uses an existing state-of-the-art method which learns from PU data as a baseline and evaluates three extensions to it. The first extension provides insight into the performance of the choice of word em-beddings on the downstream task. The second extension applies a re-labelling strategy to reduce problems from pseudo-labelling. The final extension removes the need for pseudo-labelling by applying a state-of-the-art loss function from the field of PU learning. The findings show that FastText and GloVe embeddings are viable options, with FastText providing faster training times but mixed results in terms of performance. It is shown that several of the methods studied in this thesis suffer from sporadically poor performances on one of the datasets studied. Finally, it is shown that using modified risk functions from the field of PU learning provides new state-of-the-art performances on the datasets considered in this thesis.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-201631
Date January 2023
CreatorsAhlinder, Henrik, Kylesten, Tiger
PublisherLinköpings universitet, Artificiell intelligens och integrerade datorsystem
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0029 seconds