Global ETD Search

Clustering and Anomaly detection using Medical Enterprise system Logs (CAMEL) / Klustring av och anomalidetektering på systemloggar

Research on automated anomaly detection in complex systems using log files has been on an upswing with the introduction of new deep-learning natural language processing methods. However, manually identifying and labelling anomalous logs is time-consuming, error-prone, and labor-intensive. This thesis instead uses an existing state-of-the-art method that learns from positive-unlabelled (PU) data as a baseline and evaluates three extensions to it. The first extension provides insight into how the choice of word embeddings affects performance on the downstream task. The second extension applies a re-labelling strategy to reduce problems caused by pseudo-labelling. The final extension removes the need for pseudo-labelling by applying a state-of-the-art loss function from the field of PU learning. The findings show that FastText and GloVe embeddings are viable options, with FastText providing faster training times but mixed results in terms of performance. Several of the methods studied in this thesis are shown to suffer from sporadically poor performance on one of the datasets studied. Finally, using modified risk functions from the field of PU learning is shown to achieve new state-of-the-art performance on the datasets considered in this thesis.
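The abstract does not name the PU loss function used in the final extension. As a purely illustrative sketch of what a modified risk function in PU learning can look like, the following computes a non-negative PU risk estimate from classifier scores. The choice of the non-negative estimator, the sigmoid surrogate loss, and the names `sigmoid_loss`, `nn_pu_risk`, and `prior` are assumptions made for this example and are not taken from the thesis.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss l(z, y) = 1 / (1 + exp(y * z)), with y in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk estimate (illustrative, not the thesis's method).

    pos_scores: classifier scores for the labelled (positive) samples
    unl_scores: classifier scores for the unlabelled samples
    prior:      assumed fraction of positives among the unlabelled data
    """
    # Risk of labelled positives when treated as positive.
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    # Risk of labelled positives when treated as negative.
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    # Risk of unlabelled samples when treated as negative.
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])

    # Estimated risk on the hidden negatives; it can dip below zero when the
    # model overfits, so it is clamped at zero.
    negative_part = risk_unl_as_neg - prior * risk_pos_as_neg
    return prior * risk_pos_as_pos + max(0.0, negative_part)
```

Because the risk is estimated directly from labelled and unlabelled scores, a loss of this kind needs no pseudo-labels for the unlabelled samples, only an estimate of the class prior.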

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-201631

Natural Language processing

NLP

Anomaly detection

log anomaly detection

Positive-Unlabelled learning

Positive Unlabelled learning

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-201631
Date	January 2023
Creators	Ahlinder, Henrik, Kylesten, Tiger
Publisher	Linköpings universitet, Artificiell intelligens och integrerade datorsystem
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
