This thesis evaluates the performance of different machine learning approaches to log classification based on a dataset derived from simulating intrusive behavior towards an enterprise web application. The first experiment consists of performing attacks towards the web app in correlation with the logs to create a labeled dataset. The second experiment consists of one unsupervised model based on a variational autoencoder and four super- vised models based on both conventional feature-engineering techniques with deep neural networks and embedding-based feature techniques followed by long-short-term memory architectures and convolutional neural networks. With this dataset, the embedding-based approaches performed much better than the conventional one. The autoencoder did not perform well compared to the supervised models. To conclude, embedding-based ap- proaches show promise even on datasets with different characteristics compared to natural language.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-184768 |
Date | January 2022 |
Creators | Malmfors, Fredrik |
Publisher | Linköpings universitet, Databas och informationsteknik |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0031 seconds