Machine Learning and Knowledge-Based Integrated Intrusion Detection Schemes

As computer technology advances, files and data are increasingly stored on computers and exchanged over networks. A computer is physically closed to anyone but its user, which makes it hard for others to steal data through direct physical access. Computer networks, on the other hand, can be exploited by attackers to gain access to user accounts and steal sensitive data. Researchers are therefore concentrating their efforts on preventing network attacks and ensuring data security. An Intrusion Detection System (IDS) relies on network traffic and host logs to detect and protect against network threats. Such systems, however, require extensive data analysis and rapid response, which places a heavy burden on network administrators. Advances in artificial intelligence (AI) allow computers to take over difficult and time-consuming data processing tasks, resulting in more intelligent network attack protection techniques and timely alerts of suspected network attacks. The SCVIC-APT-2021 dataset, which is specific to Advanced Persistent Threat (APT) attacks, is generated to serve as a benchmark for APT detection. A Virtual Private Network (VPN) connects two network domains to form the basic network environment for creating the dataset. A Kali Linux machine acts as the attacker, launching multiple rounds of APT attacks and compromising the two network domains from the external network. The generated dataset contains six APT stages, each of which includes different attack techniques. Following that, a knowledge-based machine learning model is proposed to detect APT attacks on the SCVIC-APT-2021 dataset. Compared to the supervised baseline model, the macro average F1-score increases by 11.01% and reaches 81.92%. NSL-KDD and UNSW-NB15 are then used as benchmarks to verify the performance of the proposed model. The weighted average F1-score reaches 76.42% on NSL-KDD and 79.20% on UNSW-NB15.
Since some network attacks leave host-based information such as system logs on network devices, a detection scheme that integrates network-based and host-based features is used to boost the network attack detection capabilities of the IDS. The raw data of CSE-CIC-IDS2018 is used to create the SCVIC-CIDS-2021 dataset, which includes both network-based and host-based features. To ensure precise classification results, SCVIC-CIDS-2021 is labelled with the attack techniques. Because the features in the resulting dataset are high-dimensional, an Autoencoder (AE) and a Gated Recurrent Unit (GRU) are employed to reduce the dimensionality of the network-based and host-based features, respectively. Finally, the data points are classified using the knowledge-based PKI and PKI Difference (PKID) models. Of the two, the PKID model performs better, with a macro average F1-score of 96.60%, which is 7.62% higher than the result obtained using only network-based features.
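
The abstract reports both macro average and weighted average F1-scores. As a minimal sketch of how the two averages differ (this is not the thesis's evaluation code; the function names and the toy labels are assumptions made for illustration), the following computes each from scratch:

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's share of true samples."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Hypothetical, imbalanced toy data: 8 benign flows and 2 attack flows,
# with one attack flow misclassified as benign.
y_true = ["benign"] * 8 + ["attack"] * 2
y_pred = ["benign"] * 8 + ["benign", "attack"]

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.4f}, weighted F1 = {weighted:.4f}")
```

On imbalanced data such as this toy example, the macro average is pulled down more by errors on the rare class than the weighted average is, so the two figures are not directly comparable across datasets.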

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/43763
Date: 06 July 2022
Creators: Shen, Yu
Contributors: Kantarci, Burak; Mouftah, Hussein
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d’Ottawa
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf