211 |
ANOMALY DETECTION USING MACHINE LEARNING FOR INTRUSION DETECTION
Vaishnavi Rudraraju (18431880) 02 May 2024
<p dir="ltr">This thesis examines machine learning approaches for anomaly detection in network security, particularly focusing on intrusion detection using TCP and UDP protocols. It uses logistic regression models to effectively distinguish between normal and abnormal network actions, demonstrating a strong ability to detect possible security concerns. The study uses the UNSW-NB15 dataset for model validation, allowing a thorough evaluation of the models' capacity to detect anomalies in real-world network scenarios. The UNSW-NB15 dataset is a comprehensive network attack dataset frequently used in research to evaluate intrusion detection systems and anomaly detection algorithms because of its realistic attack scenarios and various network activities.</p><p dir="ltr">Further investigation is carried out using a Multi-Task Neural Network built for binary and multi-class classification tasks. This method allows for the in-depth study of network data, making it easier to identify potential threats. The model is fine-tuned during successive training epochs, focusing on validation measures to ensure its generalizability. The thesis also applied early stopping mechanisms to enhance the ML model, which helps optimize the training process, reduces the risk of overfitting, and improves the model's performance on new, unseen data.</p><p dir="ltr">This thesis also uses blockchain technology to track model performance indicators, a novel strategy that improves data integrity and reliability. This blockchain-based logging system keeps an immutable record of the models' performance over time, which helps to build a transparent and verifiable anomaly detection framework.</p><p dir="ltr">In summation, this research enhances Machine Learning approaches for network anomaly detection. It proposes scalable and effective approaches for early detection and mitigation of network intrusions, ultimately improving the security posture of network systems.</p>
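The blockchain-based logging idea can be illustrated with a hash-chained, append-only record of performance metrics. This is only a minimal sketch under stated assumptions: the abstract does not say which blockchain platform or record layout the thesis uses, so the function names (`append_metrics`, `verify_chain`) and the metric values below are hypothetical placeholders, not results from the thesis.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder hash for the first record


def block_hash(index, metrics, prev_hash):
    """Hash a record together with the hash of the record before it."""
    payload = json.dumps(
        {"index": index, "metrics": metrics, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_metrics(chain, metrics):
    """Append a metrics record that is linked to the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "metrics": metrics,
        "prev_hash": prev_hash,
        "hash": block_hash(index, metrics, prev_hash),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Return True only if every record still matches its stored hash and link."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["metrics"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical per-epoch validation metrics (illustrative values only).
chain = []
append_metrics(chain, {"epoch": 1, "val_accuracy": 0.91})
append_metrics(chain, {"epoch": 2, "val_accuracy": 0.93})
```

Editing any earlier record changes its hash, so `verify_chain` fails from that point on. A real blockchain adds distributed consensus on top of this linking, which the sketch leaves out.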
|
212 |
Anomaly Detection Through System and Program Behavior Modeling
Xu, Kui 15 December 2014
Various vulnerabilities in software applications become easy targets for attackers. The trend constantly observed in the evolution of advanced modern exploits is their growing sophistication in stealthy attacks. Code-reuse attacks such as return-oriented programming allow intruders to execute malicious instruction sequences on a victim machine without injecting external code. Successful exploitation leads to hijacked applications or the download of malicious software (drive-by download attack), which usually happens without the users' notice or permission.
In this dissertation, we address the problem of host-based system anomaly detection, specifically by predicting expected behaviors of programs and detecting run-time deviations and anomalies. We first introduce an approach for detecting the drive-by download attack, which is one of the major vectors for malware infection. Our tool enforces the dependencies between user actions and system events, such as file-system access and process execution. It can be used to provide real-time protection of a personal computer, as well as for diagnosing and evaluating untrusted websites for forensic purposes. We perform extensive experimental evaluation, including a user study with 21 participants, thousands of legitimate websites (for testing false alarms), 84 malicious websites in the wild, as well as lab-reproduced exploits. Our solution demonstrates a usable host-based framework for controlling and enforcing the access of system resources.
Secondly, we present a new anomaly-based detection technique that probabilistically models and learns a program's control flows for high-precision behavioral reasoning and monitoring. Existing solutions suffer from either incomplete behavioral modeling (for dynamic models) or overestimating the likelihood of call occurrences (for static models).
We introduce a new probabilistic anomaly detection method for modeling program behaviors. What sets it apart is the ability to quantify the static control flow in programs and to integrate the control flow information into probabilistic machine learning algorithms. The advantage of our technique is significantly improved detection accuracy. We observed an 11- to 28-fold improvement in detection accuracy compared to the state-of-the-art HMM-based anomaly models. We further integrate context information into our detection model, which achieves both strong flow-sensitivity and context-sensitivity. Our context-sensitive approach gives, on average, more than a 10-fold improvement for system call monitoring, and an improvement of 3 orders of magnitude for library call monitoring, over existing regular HMM methods.
Evaluated with a large number of program traces and real-world exploits, our findings confirm that the probabilistic modeling of program dependences provides a significant source of behavior information for building high-precision models for real-time system monitoring. Abnormal traces (obtained through reproducing exploits and synthesized abnormal traces) can be well distinguished from normal traces by our model. / Ph. D.
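To make the idea of probabilistic scoring of call sequences concrete, here is a deliberately simplified sketch. It is not the dissertation's model: it swaps the HMM for a first-order Markov chain over observed call transitions, uses no static control-flow information, and all names and the toy traces are assumptions made for illustration.

```python
import math
from collections import defaultdict


def train_transitions(traces, alpha=1.0):
    """Estimate smoothed call-to-call transition probabilities from normal traces."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for trace in traces:
        vocab.update(trace)
        for current_call, next_call in zip(trace, trace[1:]):
            counts[current_call][next_call] += 1
    vocab_size = len(vocab) + 1  # one extra slot for calls never seen in training

    def prob(current_call, next_call):
        row = counts.get(current_call, {})
        total = sum(row.values())
        # Additive (Laplace) smoothing keeps unseen transitions above zero.
        return (row.get(next_call, 0) + alpha) / (total + alpha * vocab_size)

    return prob


def avg_log_likelihood(prob, trace):
    """Average log-probability per transition; lower means less expected."""
    pairs = list(zip(trace, trace[1:]))
    if not pairs:
        return 0.0
    return sum(math.log(prob(a, b)) for a, b in pairs) / len(pairs)


# Hypothetical system call traces.
normal_traces = [["open", "read", "write", "close"]] * 20
prob = train_transitions(normal_traces)
normal_score = avg_log_likelihood(prob, ["open", "read", "write", "close"])
odd_score = avg_log_likelihood(prob, ["open", "execve", "socket", "close"])
```

A trace whose transitions were rarely or never seen in training receives a lower average log-likelihood, and thresholding that score gives a basic detector. An HMM adds hidden states on top of this, and the dissertation further informs the model with static control flow.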
|
213 |
A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics
Khaghani, Farnaz 05 1900
Traffic accidents are of concern for traffic safety; 1.25 million deaths are reported each year. Hence, it is crucial to have access to real-time data and rapidly detect or predict accidents. Predicting the occurrence of a highway car accident accurately any significant length of time into the future is not feasible since the vast majority of crashes occur due to unpredictable human negligence and/or error. However, rapid traffic incident detection could reduce incident-related congestion and secondary crashes, alleviate the waste of vehicles’ fuel and passengers’ time, and provide appropriate information for emergency response and field operation. While the focus of most previously proposed techniques is predicting the number of accidents in a certain region, the problem of predicting the accident occurrence or fast detection of the accident has been little studied. To address this gap, we propose a deep learning approach and build a deep neural network model based on long short-term memory (LSTM). We apply it to forecast the expected speed values on freeway links and identify the anomalies as potential accident occurrences. Several detailed features such as weather, traffic speed, and traffic flow of upstream and downstream points are extracted from big datasets. We assess the proposed approach on a traffic dataset from Sacramento, California. The experimental results demonstrate the potential of the proposed approach in identifying the anomalies in speed value and matching them with accidents in the same area. We show that this approach can handle a high rate of rapid accident detection and be implemented in real-time travelers’ information or emergency management systems. / M.S. / Rapid traffic accident detection/prediction is essential for scaling down non-recurrent congestion caused by traffic accidents, avoiding secondary accidents, and accelerating emergency system responses.
In this study, we propose a framework that uses large-scale historical traffic speed and traffic flow data along with the relevant weather information to obtain robust traffic patterns. The predicted traffic patterns can be coupled with the real traffic data to detect anomalous behavior that often results in traffic incidents on the roadways. Our framework consists of two major steps. First, we estimate the speed values of traffic at each point based on the historical speed and flow values of locations before and after each point on the roadway. Second, we compare the estimated values with the actual ones and flag those that differ significantly as anomalies. The anomaly points are the potential points and times at which an accident occurs and causes a change in the normal behavior of the roadways. Our study shows the potential of the approach in detecting accidents, with promising performance in detecting the accident occurrence at a time close to the actual time of occurrence.
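The second step of the framework, comparing estimated speeds with actual ones, can be sketched as a residual test. The sketch assumes the predicted speeds already come from some forecasting model (the LSTM itself is not reproduced), and the `k = 3` cut-off, the function names, and the speed values are hypothetical choices rather than details from the thesis.

```python
from statistics import mean, stdev


def fit_residual_stats(predicted, observed):
    """Mean and standard deviation of (observed - predicted) on normal traffic."""
    residuals = [o - p for p, o in zip(predicted, observed)]
    return mean(residuals), stdev(residuals)


def flag_anomalies(predicted, observed, mu, sigma, k=3.0):
    """Indices where the residual lies more than k standard deviations from mu."""
    flagged = []
    for t, (p, o) in enumerate(zip(predicted, observed)):
        if abs((o - p) - mu) > k * sigma:
            flagged.append(t)
    return flagged


# Hypothetical speeds in mph on one freeway link.
calib_pred = [60, 61, 59, 60, 62, 60, 61, 59]
calib_obs = [61, 60, 60, 59, 62, 61, 60, 60]
mu, sigma = fit_residual_stats(calib_pred, calib_obs)

new_pred = [60, 61, 60, 59]
new_obs = [59, 60, 35, 58]  # the sudden drop at index 2 mimics an incident
flagged = flag_anomalies(new_pred, new_obs, mu, sigma)
```

Calibrating the residual statistics on incident-free data keeps a single large deviation from inflating the spread it is later compared against.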
|
214 |
Modified Kernel Principal Component Analysis and Autoencoder Approaches to Unsupervised Anomaly Detection
Merrill, Nicholas Swede 01 June 2020
Unsupervised anomaly detection is the task of identifying examples that differ from the normal or expected pattern without the use of labeled training data. Our research addresses shortcomings in two existing anomaly detection algorithms, Kernel Principal Component Analysis (KPCA) and Autoencoders (AE), and proposes novel solutions to improve the performance of both in the unsupervised setting. Anomaly detection has several useful applications, such as intrusion detection, fault monitoring, and vision processing. More specifically, anomaly detection can be used in autonomous driving to identify obscured signage or to monitor intersections.
Kernel techniques are desirable because of their ability to model highly non-linear patterns, but they are limited in the unsupervised setting due to their sensitivity to parameter choices and the absence of a validation step. Additionally, conventional KPCA suffers from a quadratic time and memory complexity in the construction of the Gram matrix and a cubic time complexity in its eigendecomposition. The problem of tuning the Gaussian kernel parameter, $\sigma$, is solved using mini-batch stochastic gradient descent (SGD) optimization of a loss function that maximizes the dispersion of the kernel matrix entries. Secondly, the computational time is greatly reduced, while still maintaining high accuracy, by using an ensemble of small skeleton models and combining their scores.
The performance of traditional machine learning approaches to anomaly detection plateaus as the volume and complexity of data increase. Deep anomaly detection (DAD) involves the application of multilayer artificial neural networks to identify anomalous examples. AEs are fundamental to most DAD approaches. Conventional AEs rely on the assumption that a trained network will learn to reconstruct normal examples better than anomalous ones. In practice, however, given sufficient capacity and training time, an AE will generalize to reconstruct even very rare examples. Three methods are introduced to more reliably train AEs for unsupervised anomaly detection: Cumulative Error Scoring (CES) leverages the entire history of training errors to minimize the importance of early stopping, and Percentile Loss (PL) training aims to prevent anomalous examples from contributing to parameter updates. Lastly, early stopping via knee detection aims to limit the risk of overtraining.
Ultimately, the two proposed methods of this research, Unsupervised Ensemble KPCA (UE-KPCA) and the Modified Training and Scoring AE (MTS-AE), demonstrate improved detection performance and reliability compared to many baseline algorithms across a number of benchmark datasets. / Master of Science / Anomaly detection is the task of identifying examples that differ from the normal or expected pattern. The challenge of unsupervised anomaly detection is distinguishing normal and anomalous data without the use of labeled examples to demonstrate their differences. This thesis addresses shortcomings in two anomaly detection algorithms, Kernel Principal Component Analysis (KPCA) and Autoencoders (AE), and proposes new solutions to apply them in the unsupervised setting. Ultimately, the two modified methods, Unsupervised Ensemble KPCA (UE-KPCA) and the Modified Training and Scoring AE (MTS-AE), demonstrate improved detection performance and reliability compared to many baseline algorithms across a number of benchmark datasets.
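The stated aim of Percentile Loss training, keeping anomalous examples from contributing to parameter updates, can be illustrated by masking the highest-loss examples in a batch. This is a guess at the mechanism, not the thesis's formulation: the nearest-rank percentile, the function names, and the loss values are all assumptions made for the sketch.

```python
import math


def percentile_mask(losses, percentile=90.0):
    """Mark examples whose loss is at or below the given percentile of the batch."""
    if not losses:
        return []
    ranked = sorted(losses)
    # Nearest-rank percentile: the smallest loss with at least `percentile`
    # percent of the batch at or below it.
    rank = max(1, math.ceil(len(ranked) * percentile / 100))
    cutoff = ranked[rank - 1]
    return [loss <= cutoff for loss in losses]


def masked_mean_loss(losses, percentile=90.0):
    """Batch loss averaged over the kept (low-loss) examples only."""
    mask = percentile_mask(losses, percentile)
    kept = [loss for loss, keep in zip(losses, mask) if keep]
    return sum(kept) / len(kept)


# Hypothetical per-example reconstruction errors; the last one looks anomalous.
batch_losses = [0.1, 0.2, 0.15, 0.12, 5.0]
mask = percentile_mask(batch_losses, percentile=80)
loss_for_update = masked_mean_loss(batch_losses, percentile=80)
```

In an actual autoencoder training loop the mask would be applied to the per-example loss before backpropagation, so that the masked examples produce no gradient.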
|
215 |
<b>Development of a Time-Series Forecasting Model for Detecting Anomalies in Nuclear Reactor Data</b>
Zachery Thomas Dahm (18422343) 22 April 2024
<p dir="ltr">Anomaly detection systems identify abnormal behaviors and can increase the uptime, safety, and profitability of an industrial system. This research investigates the development of an AI model for detecting anomalies in nuclear reactors. An LSTM network is used to predict the value of a key reactor signal, and the predictions are then compared with the measured values to determine whether the data are abnormal. The predictive AI model was trained using regular operation data from the nuclear reactor at Purdue University, PUR-1. The experiment shows that the model can accurately track reactor neutron counts during normal operation, with an average error of less than 5% when predicting five seconds into the future. It also shows that the model reacts to abnormal inputs, with average errors above 50% when fed data that simulate a false data injection cyberattack. The framework of using prediction error to identify anomalies is investigated, and a false positive rate of 0.2% is achieved on the normal evaluation dataset while still identifying the abnormal data as anomalous.</p>
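One way to connect a prediction-error threshold to a false positive rate is to calibrate the threshold on errors from normal operation. The sketch below assumes an empirical-quantile rule, which the abstract does not confirm; the function names and all error values are hypothetical and are not measurements from PUR-1.

```python
import math


def threshold_for_fpr(normal_errors, target_fpr):
    """Pick a threshold that at most target_fpr of the normal errors exceed."""
    ranked = sorted(normal_errors)
    n = len(ranked)
    # Number of normal samples that may lie above the threshold.
    allowed = min(n - 1, math.floor(target_fpr * n))
    return ranked[n - allowed - 1]


def false_positive_rate(normal_errors, threshold):
    """Fraction of normal samples that would be flagged as anomalous."""
    return sum(e > threshold for e in normal_errors) / len(normal_errors)


# Hypothetical relative prediction errors (illustrative values only).
normal_errors = [i / 100 for i in range(1, 101)]
attack_errors = [1.5, 2.0]

threshold = threshold_for_fpr(normal_errors, target_fpr=0.05)
fpr = false_positive_rate(normal_errors, threshold)
attacks_flagged = all(e > threshold for e in attack_errors)
```

The false positive rate should be reported on held-out normal data, since the rate measured on the calibration set is optimistic by construction.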
|
216 |
Utilizing GAN and Sequence Based LSTMs on Post-RF Metadata for Near Real Time Analysis
Barnes-Cook, Blake Alexander 17 January 2023
Wireless anomaly detection is a mature field with several unique solutions. This thesis aims to describe a novel way of detecting wireless anomalies using methods based on metadata analysis. The metadata is processed and analyzed by an LSTM-based autoencoder and an LSTM-based feature analyzer to produce a wide range of anomaly scores. The anomaly scores are then uploaded to an Elasticsearch database and analyzed to identify any anomalous fluctuations. An associated tool can also automatically download live data, train, test, and upload results to the Elasticsearch database. The overall method described is in sharp contrast to the more established solution of analyzing raw data from a Software Defined Radio, and has the potential to be scaled much more efficiently. / Master of Science / Wireless communications are a major part of our world. Detecting unusual changes in the wireless spectrum is therefore a high priority in maintaining networks and more. This paper describes a method that allows centralized processing of wireless network output, allowing monitoring of several areas simultaneously. This is in sharp contrast to other methods, which generally must be located near the area being monitored. In addition, this implementation has the capability to be scaled more efficiently, as the hardware required to monitor is less costly than the hardware required to process wireless data.
|
217 |
Robustness Studies and Training Set Analysis for HIDS
Helmrich, Daniel 09 September 2024
To enhance the protection against cyberattacks, significant research is directed towards anomaly-based host intrusion detection systems (HIDS), which appear particularly suited for detecting zero-day attacks. This thesis addresses two problems in HIDS training sets that are often neglected in other publications: unclean and incomplete data. First, using the Leipzig Intrusion Detection - Data Set (LID-DS), a methodology to measure HIDS robustness against contaminated training data is presented. Furthermore, three baseline HIDS approaches (STIDE, SCG, and SOM) are evaluated, and robustness improvements are proposed for them. The results indicate that the baselines are not robust if test and training data share identical attacks. However, the suggested modifications, particularly the removal of anomalous threads from the training set, can enhance robustness significantly. For the problem of incomplete training data, the thesis leverages machine learning models to predict a training set’s suitability, quantified by either data drift measures or the STIDE performance. The thesis then presents rules, extracted from the best models, for assessing the suitability of new training data. Given the practical significance of both issues, which the results underscore for contaminated training data, further research is essential. This involves examining the robustness of other HIDS algorithms, refining the proposed robustness improvements, and validating the suitability rules on other datasets, preferably real-world data.
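STIDE (sequence time-delay embedding) stores every fixed-length window of system calls seen in training and counts test windows that are missing from that store. The sketch below is a minimal version under assumptions: the window length, the function names, and the toy traces are hypothetical, and it omits refinements such as locality frames and the thread handling used with LID-DS.

```python
def ngrams(trace, n):
    """All contiguous windows of length n in a system call trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def train_stide(traces, n=3):
    """Collect every window that occurs in the training traces."""
    normal = set()
    for trace in traces:
        normal.update(ngrams(trace, n))
    return normal


def mismatch_ratio(normal, trace, n=3):
    """Fraction of windows in the trace that were never seen in training."""
    windows = ngrams(trace, n)
    if not windows:
        return 0.0
    return sum(w not in normal for w in windows) / len(windows)


# Hypothetical traces.
clean = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
]
attack = ["open", "mmap", "execve", "close"]

clean_model = train_stide(clean)
contaminated_model = train_stide(clean + [attack])

score_clean_model = mismatch_ratio(clean_model, attack)
score_contaminated_model = mismatch_ratio(contaminated_model, attack)
```

With clean training data every window of the attack trace is a mismatch. Once the same attack trace is part of the training set, its windows count as normal and the score drops to zero, which illustrates the contamination problem the thesis measures.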
|
218 |
Securing Cloud Containers through Intrusion Detection and Remediation
Abed, Amr Sayed Omar 29 August 2017
Linux containers are gaining increasing traction in both individual and industrial use. As these containers get integrated into mission-critical systems, real-time detection of malicious cyber attacks becomes a critical operational requirement. However, little research has been conducted in this area.
This research introduces an anomaly-based intrusion detection and remediation system for container-based clouds. The introduced system monitors system calls between the container and the host server to passively detect malfeasance against applications running in cloud containers.
We started by applying a basic memory-based machine learning technique to model the container behavior.
The same technique was also extended to learn the behavior of a distributed application running in a number of cloud-based containers. In addition to monitoring the behavior of each container independently, the system used prior knowledge for a more informed detection system.
We then studied the feasibility and effectiveness of applying a more sophisticated deep learning technique to the same problem. We used a recurrent neural network to model the container behavior.
We evaluated the system using a typical web application hosted in two containers, one for the front-end web server, and one for the back-end database server. The system has shown promising results for both of the machine learning techniques used.
Finally, we describe a number of incident handling and remediation techniques to be applied upon attack detection. / Ph. D. / Cloud computing plays an important role in our daily lives today. Most of the online services and applications we use are hosted in a cloud environment. Examples include email, cloud storage, online booking systems, and many websites. Typically, a cloud environment would host many of those applications on a single host to maximize efficiency and minimize overhead. To achieve that, cloud service providers, such as Amazon Web Services and Google Cloud Platform, rely on virtual encapsulation environments, such as virtual machines and containers, to encapsulate and isolate applications from other applications running in the cloud.
One major concern usually raised when discussing cloud applications is the security of the application and the privacy of the data it handles, e.g. the files stored by the end users on their cloud storage. In addition to firewalls and traditional security measures that attempt to prevent an attack from affecting the application, intrusion detection systems (IDS) are usually used to detect when an application is affected by a successful attack that managed to escape the firewall. Many intrusion detection systems have been introduced to cloud applications using virtual machines, but almost none has been introduced to applications running in containers.
In this dissertation, we introduce an intrusion detection system to be deployed by cloud service providers to container-based cloud environments. The system uses machine learning techniques to learn the behavior of the application running in the container and detect when the behavior changes as an indication of a potential attack. Upon detection of the attack, the system applies one of three defense mechanisms to restore the running application to a safe state.
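The abstract does not name the memory-based technique used to model container behavior. One possibility, assumed here purely for illustration, is to memorize order-insensitive "bags" of system calls seen in sliding windows and report the fraction of unseen bags at run time. The window size, the function names, and the traces are hypothetical.

```python
from collections import Counter


def bag(window):
    """Order-insensitive summary of a window: sorted (call, count) pairs."""
    return tuple(sorted(Counter(window).items()))


def windows(trace, size):
    """All contiguous windows of the given size in a trace."""
    return [trace[i:i + size] for i in range(len(trace) - size + 1)]


def learn_bags(trace, size=4):
    """Memorize every bag observed while the container behaves normally."""
    return {bag(w) for w in windows(trace, size)}


def unseen_fraction(known, trace, size=4):
    """Fraction of windows whose bag was never observed during learning."""
    bags = [bag(w) for w in windows(trace, size)]
    if not bags:
        return 0.0
    return sum(b not in known for b in bags) / len(bags)


# Hypothetical system calls between a web-server container and its host.
normal_trace = ["accept", "read", "write", "close"] * 5
known = learn_bags(normal_trace)

reordered = ["read", "accept", "close", "write"]
suspicious = ["accept", "read", "execve", "close"]
```

Because a bag ignores call order, `reordered` matches the learned behavior while `suspicious` does not. An alarm would be raised when the unseen fraction over a monitoring interval exceeds a chosen threshold.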
|
219 |
Machine Learning-Assisted Log Analysis for Uncovering Anomalies
Rurling, Samuel January 2024
Logs, which are semi-structured records of system runtime information, contain many valuable insights. By looking at the logs, developers and operators can analyse their system’s behavior. This is especially necessary when something in the system goes wrong, as nonconforming logs may indicate a root cause. With the growing complexity and size of IT systems, however, millions of logs are generated hourly. Reviewing them manually can therefore become an all-consuming task. A potential solution to aid in log analysis is machine learning. Because machine learning algorithms can learn automatically from experience, they can be trained to analyse logs automatically. In this thesis, machine learning is used to perform anomaly detection, which is the discovery of so-called nonconforming logs. An experiment is created in which four feature extraction methods (that is, four ways of creating data representations from the logs) are tested in combination with three machine learning models: LogCluster, PCA and SVM. Additionally, a neural network architecture called an LSTM network, which can craft its own features and analyse them, is explored. The results show that the LSTM performed the best in terms of precision, recall and F1-score, followed by SVM, LogCluster and PCA, in combination with a feature extraction method using word embeddings.
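As an illustration of what a feature extraction method for logs can look like, the sketch below turns raw messages into templates and counts template occurrences per session. Event counting is a common choice in log anomaly detection, but the abstract does not say whether it is one of the four methods tested; the masking rules, the function names, and the log lines are assumptions.

```python
import re
from collections import Counter


def to_template(message):
    """Replace variable parts (hex values, numbers) with a placeholder."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<*>", message)
    return re.sub(r"\d+", "<*>", message)


def event_count_vectors(sessions):
    """Map each session to a vector of template counts in a fixed template order."""
    templates = sorted(
        {to_template(m) for messages in sessions.values() for m in messages}
    )
    vectors = {}
    for session_id, messages in sessions.items():
        counts = Counter(to_template(m) for m in messages)
        vectors[session_id] = [counts.get(t, 0) for t in templates]
    return templates, vectors


# Hypothetical log sessions.
sessions = {
    "s1": ["Received block 17 of size 4096", "Received block 18 of size 512"],
    "s2": ["Received block 19 of size 64", "Failed to write block 19"],
}
templates, vectors = event_count_vectors(sessions)
```

The resulting count vectors can be passed to models such as PCA or an SVM, whereas a sequence model such as an LSTM would consume the ordered template sequence instead.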
|
220 |
Enhancing Computational Efficiency in Anomaly Detection with a Cascaded Machine Learning Model
Yu, Teng-Sung January 2024
This thesis presents and evaluates a new cascading machine learning model framework for anomaly detection, which is essential for modern industrial applications where computing efficiency is crucial. Traditional deep learning algorithms are frequently difficult to deploy effectively in edge computing due to limited processing power and memory. This study addresses the challenge by creating a cascading model framework that strategically combines lightweight and more complex models to improve the efficiency of inference while maintaining the accuracy of detection. We propose a cascading model framework consisting of a One-Class Support Vector Machine (OCSVM) for rapid initial anomaly detection and a Variational Autoencoder (VAE) for more precise prediction in uncertain cases. The cascading technique between the OCSVM and VAE enables the system to handle regular data instances efficiently, while assigning more complex analyses only when required. This framework was tested in real-world scenarios, including anomaly detection in the air pressure system of the automotive industry, as well as on the MNIST dataset. These tests demonstrate the framework's practical applicability and effectiveness across diverse settings, underscoring its potential for broad implementation in industrial applications.
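The cascading control flow can be sketched independently of the concrete models. In the sketch below the two callables are stand-ins rather than a real OCSVM or VAE, and the uncertainty band (`low`, `high`), the function names, and the sample values are assumptions; the abstract does not say how the thesis decides that a case is uncertain.

```python
def cascade_predict(x, fast_score, slow_predict, low, high):
    """Return (is_anomaly, stage), escalating only when the fast score is uncertain."""
    score = fast_score(x)
    if score <= low:
        return False, "fast"  # confidently normal: the cheap model decides
    if score >= high:
        return True, "fast"  # confidently anomalous: the cheap model decides
    return slow_predict(x), "slow"  # uncertain band: defer to the heavier model


def fast_score(x):
    """Stand-in for a lightweight anomaly score (not an actual OCSVM)."""
    return abs(x)


def slow_predict(x):
    """Stand-in for a heavier, more precise model (not an actual VAE)."""
    return abs(x) > 1.5


samples = [0.2, 1.2, 1.6, 5.0]
results = [cascade_predict(x, fast_score, slow_predict, low=1.0, high=2.0) for x in samples]
```

The computational saving depends on how many inputs fall outside the uncertainty band, since only the remainder reaches the heavier model.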
|