
Robustness Studies and Training Set Analysis for HIDS

To enhance protection against cyberattacks, significant research is directed towards anomaly-based host intrusion detection systems (HIDS), which appear particularly suited to detecting zero-day attacks. This thesis addresses two problems in HIDS training sets that are often neglected in other publications: unclean and incomplete data. First, using the Leipzig Intrusion Detection - Data Set (LID-DS), a methodology to measure HIDS robustness against contaminated training data is presented. Furthermore, three baseline HIDS approaches (STIDE, SCG, and SOM) are evaluated, and robustness improvements are proposed for them. The results indicate that the baselines are not robust if the training and test data contain identical attacks. However, the suggested modifications, particularly the removal of anomalous threads from the training set, can enhance robustness significantly. For the problem of incomplete training data, the thesis leverages machine learning models to predict a training set’s suitability, quantified by either data drift measures or STIDE performance. The thesis then presents rules, extracted from the best models, for assessing the suitability of new training data. Both issues are of practical significance, which the results underscore for contaminated training data in particular, so further research is essential. This involves examining the robustness of other HIDS algorithms, refining the proposed robustness improvements, and validating the suitability rules on other datasets, preferably real-world data.
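
The contamination problem described in the abstract can be illustrated with a minimal STIDE-style sketch. This is not the thesis's implementation: the system call names, the n-gram length, and the helper functions are assumptions chosen for illustration, and the score is a simple whole-trace mismatch ratio rather than the sliding mismatch window commonly used with STIDE. It shows why a detector trained on data that already contains an attack may fail to flag the same attack at test time.

```python
from typing import List, Sequence, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(trace: Sequence[str], n: int) -> List[NGram]:
    """Return all contiguous system-call n-grams of a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def train_stide(traces: Sequence[Sequence[str]], n: int = 3) -> Set[NGram]:
    """Build the 'normal' database: every n-gram seen during training."""
    normal: Set[NGram] = set()
    for trace in traces:
        normal.update(ngrams(trace, n))
    return normal


def mismatch_ratio(trace: Sequence[str], normal: Set[NGram], n: int = 3) -> float:
    """Fraction of the trace's n-grams that are absent from the database."""
    grams = ngrams(trace, n)
    if not grams:
        return 0.0
    return sum(g not in normal for g in grams) / len(grams)


# Hypothetical traces; these are not taken from LID-DS.
benign = [
    ["open", "read", "close", "open", "read", "close"],
    ["open", "read", "write", "close"],
]
attack = ["open", "mmap", "execve", "socket", "connect"]

clean_db = train_stide(benign)
contaminated_db = train_stide(benign + [attack])  # attack leaked into training

clean_score = mismatch_ratio(attack, clean_db)
contaminated_score = mismatch_ratio(attack, contaminated_db)
print(clean_score, contaminated_score)
```

With the clean database every n-gram of the attack trace is unseen, whereas with the contaminated database the same trace looks entirely normal. Removing the anomalous trace from the training set before calling `train_stide` restores the first behavior, which is the intuition behind the training set cleaning mentioned in the abstract.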

Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:93624
Date: 09 September 2024
Creators: Helmrich, Daniel
Contributors: Universität Leipzig
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: English
Detected Language: English
Type: info:eu-repo/semantics/publishedVersion, doc-type:masterThesis, info:eu-repo/semantics/masterThesis, doc-type:Text
Rights: info:eu-repo/semantics/openAccess
