161 |
LSTM Networks for Detection and Classification of Anomalies in Raw Sensor Data
Verner, Alexander, 01 January 2019
In order to ensure the validity of sensor data, it must be thoroughly analyzed for various types of anomalies. Traditional machine learning methods for anomaly detection in sensor data are based on domain-specific feature engineering. A typical approach is to use domain knowledge to analyze the sensor data and manually create statistics-based features, which are then used to train machine learning models to detect and classify the anomalies. Although this methodology is used in practice, it has a significant drawback: feature extraction is usually labor-intensive and requires considerable effort from domain experts.
An alternative approach is to use deep learning algorithms. Research has shown that modern deep neural networks are very effective at automatically extracting abstract features from raw data in classification tasks. Long short-term memory networks, or LSTMs for short, are a special kind of recurrent neural network capable of learning long-term dependencies. These networks have proved especially effective in the classification of raw time-series data in various domains. This dissertation systematically investigates the effectiveness of the LSTM model for anomaly detection and classification in raw time-series sensor data.
As a proof of concept, this work used time-series data from sensors that measure blood glucose levels. A large number of time-series sequences were created from a genuine medical diabetes dataset. Anomalous series were constructed by six methods that interspersed patterns of common anomaly types in the data. An LSTM network model was trained with k-fold cross-validation on both anomalous and valid series to classify raw time-series sequences into one of seven classes: non-anomalous, and one class for each of the six anomaly types.
As a control, the detection and classification accuracy of the LSTM was compared to that of four traditional machine learning classifiers: support vector machines, random forests, naive Bayes, and shallow neural networks. The performance of all the classifiers was evaluated on nine metrics: precision, recall, and the F1-score, each measured from the micro, macro, and weighted perspectives.
While the traditional models were trained on feature vectors derived from the raw data based on knowledge of common anomaly sources, the LSTM was trained on the raw time-series data itself. Experimental results indicate that the LSTM performed comparably to the best traditional classifiers, achieving 99% on all nine metrics. The model requires no labor-intensive feature engineering, and the fine-tuning of its architecture and hyper-parameters can be performed in a fully automated way. This study therefore finds LSTM networks to be an effective solution for anomaly detection and classification in sensor data.
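As a rough illustration of the setup this abstract describes, the sketch below builds a seven-class LSTM classifier over raw time-series windows in Keras. The sequence length, layer sizes, training settings, and the random stand-in data are all assumptions made for the sake of a runnable example; none of them come from the dissertation.

```python
# A minimal sketch, assuming fixed-length univariate windows; sequence
# length, layer sizes, and training settings are illustrative, not taken
# from the dissertation.
import numpy as np
from tensorflow import keras

SEQ_LEN = 60        # assumed window length (e.g., 60 glucose readings)
NUM_CLASSES = 7     # non-anomalous + six anomaly types

model = keras.Sequential([
    keras.layers.Input(shape=(SEQ_LEN, 1)),   # raw series, no hand-made features
    keras.layers.LSTM(64),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data; the study used sequences derived from a genuine
# diabetes dataset with synthetic anomalies interspersed.
X = np.random.rand(1000, SEQ_LEN, 1)
y = np.random.randint(0, NUM_CLASSES, size=1000)
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```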
|
162 |
Anomaly detection in rolling element bearings via two-dimensional Symbolic Aggregate Approximation
Harris, Bradley William, 26 May 2013
Symbolic dynamics is an area of current interest in anomaly detection, especially in mechanical systems. It reduces the overall dimensionality of system responses while maintaining a high level of robustness to noise. Rolling element bearings are particularly common mechanical components for which anomaly detection is of high importance: harsh operating conditions and manufacturing imperfections increase vibration, inherently reducing component life and increasing downtime and costly repairs. This thesis presents a novel way to detect vibrational anomalies in bearings through Symbolic Aggregate Approximation (SAX) in the two-dimensional time-frequency domain. SAX reduces computational requirements by partitioning high-dimensional sensor data into discrete states. This analysis particularly suits bearing vibration data in the time-frequency domain, as the distribution of the data does not change greatly between normal and faulty conditions.
In experiments with synthetically generated ground truth, two-dimensional SAX in conjunction with Markov model feature extraction successfully detects anomalies (> 99%) using short time spans (< 0.1 seconds) of data in the time-frequency domain with low false alarm rates (< 8%). Analysis of real-world datasets validates its advantage over the commonly used one-dimensional symbolic analysis, detecting 100% of experimental anomalous vibration with zero false alarms across all fault types using less than 1 second of data as the basis of 'normality'. Two-dimensional SAX also demonstrates the ability to detect anomalies in predictive monitoring environments earlier than previous methods, even at low signal-to-noise ratios. / Master of Science
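For readers unfamiliar with SAX, the following is a minimal one-dimensional sketch of the technique: z-normalization, piecewise aggregate approximation (PAA), then discretization against equiprobable Gaussian breakpoints. The thesis applies the idea in two dimensions over a time-frequency representation, which this toy example does not reproduce; the segment count and alphabet size are arbitrary choices.

```python
# A minimal one-dimensional SAX sketch: z-normalize, PAA, then map to
# symbols via Gaussian breakpoints. Parameters are arbitrary here.
import numpy as np
from scipy.stats import norm

def sax(series, n_segments=8, alphabet_size=4):
    x = (series - series.mean()) / (series.std() + 1e-12)   # z-normalize
    paa = x.reshape(n_segments, -1).mean(axis=1)            # length must divide evenly
    # Breakpoints that split the standard normal into equiprobable regions
    breakpoints = norm.ppf(np.linspace(0, 1, alphabet_size + 1)[1:-1])
    symbols = np.searchsorted(breakpoints, paa)             # 0 .. alphabet_size-1
    return "".join(chr(ord("a") + s) for s in symbols)

signal = np.sin(np.linspace(0, 4 * np.pi, 64)) + 0.1 * np.random.randn(64)
print(sax(signal))   # a discrete 'word' such as 'bcddcbaa'
```

Anomaly detection then amounts to comparing the symbol statistics (e.g., Markov transition frequencies, as in the thesis) of a test window against those of the 'normal' baseline.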
|
163 |
Enhancing System Reliability using Abstraction and Efficient Logical Computation / 抽象化技術と高速な論理演算を利用したシステムの高信頼化
Kutsuna, Takuro, 24 September 2015
Kyoto University / 0048 / New system, doctoral program / Doctor of Informatics / Kou No. 19335 / Joho-haku No. 587 / Shinsei||Jo||102 (University Library) / 32337 / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Akihiro Yamamoto, Professor Hisashi Kashima, Professor Atsushi Igarashi / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
|
164 |
Toward a Hardware-assisted Online Intrusion Detection System Based on Deep Learning Algorithms for Resource-Limited Embedded Systems
Al Rawashdeh, Khaled, 02 October 2018
No description available.
|
165 |
Hierarchical Anomaly Detection for Time Series Data
Sperl, Ryan E., 07 June 2020
No description available.
|
166 |
Application of Autoencoder Ensembles in Anomaly and Intrusion Detection using Time-Based Analysis
Mathur, Nitin O., January 2020
No description available.
|
167 |
Anomaly Detection in Log Files Using Machine Learning Techniques
Mandagondi, Lakshmi Geethanjali, January 2021
Context: Log files are produced in most larger computer systems today. They contain highly valuable information about the behavior of the system, and they are therefore consulted fairly often to analyze behavioral aspects of the system. Because of the very high number of log entries produced in some systems, however, it is extremely difficult to seek out relevant information in these files. Computer-based log analysis techniques are therefore indispensable for finding relevant data in log files.
Objectives: The main problem is to find important events in log files. Events in the test suite such as connection errors or disruptions are not considered abnormal; rather, the events that cause system interruption must be considered abnormal. The goal is to use machine learning techniques to "learn" the "expected" behavior of a particular test suite, meaning the system must learn to distinguish, based on previous sequences, between a log file that contains an anomaly and one that does not.
Methods: Various algorithms are implemented and compared to existing algorithms based on their performance. The algorithms are executed on a parsed set of labeled log files and are evaluated in an experiment by analyzing the anomalous events contained in the log files. The algorithms used were Local Outlier Factor, Random Forest, and Term Frequency-Inverse Document Frequency. We then use clustering with K-Means and PCA to gain valuable insights from the data by observing groups of data points to find the anomalous events.
Results: The results show that the Term Frequency-Inverse Document Frequency method works better at finding the anomalous events in the data than the other two approaches, as demonstrated by the experiment discussed in detail.
Conclusions: The results will help developers find anomalous events without manually inspecting the log file row by row. The model identifies the events that behave differently from the rest of the events in the log and cause the system to be interrupted.
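A minimal sketch of the TF-IDF-based idea described above, assuming scikit-learn: log lines are vectorized with TF-IDF and scored with Local Outlier Factor. The example log lines and parameters are invented for illustration and are not taken from the thesis.

```python
# A hedged sketch: TF-IDF vectorization of log lines plus Local Outlier
# Factor scoring. The log lines and parameters are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import LocalOutlierFactor

logs = [
    "connection established to host A",
    "connection established to host B",
    "connection closed by host A",
    "connection established to host C",
    "kernel panic: unable to mount root fs",   # the event that interrupts the system
]

X = TfidfVectorizer().fit_transform(logs)
lof = LocalOutlierFactor(n_neighbors=2)        # tiny n_neighbors for a tiny demo
labels = lof.fit_predict(X.toarray())          # -1 marks outliers
for line, label in zip(logs, labels):
    print("ANOMALY" if label == -1 else "normal ", line)
```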
|
168 |
Causal AI for Outlier Detection: Using causality to single out suspicious transactions and identifying anomalies
Virding, Olle; Leoson, Love, January 2023
The purpose of this thesis was to construct a program capable of detecting outliers, that is, data points that do not follow the trends found within a dataset, by using Causal AI. Outlier detection has a very wide range of uses, since the notion of an outlier can be adjusted to fit different types of problems, so the program can be applied in different manners to achieve diverse beneficial results. In this thesis the program was used to detect suspicious transactions, which can eliminate unnecessary or wrongful purchases and thereby contribute to economic growth. The Causal AI was implemented in Python with the DoWhy package and used to determine and evaluate causal relationships between input parameters in the dataset in which outliers were to be detected. Outliers were then identified by comparing the values of the data points to the established causal relations; data points that did not follow the causal flow were labeled as outliers. The result was a causally reasoning machine learning model capable of detecting outliers as well as explaining why a data point was labeled as an outlier. The performance was deemed satisfactory, since the results followed reasonable causal thinking and similar results were achieved with different training data. The model turned out to be very flexible, with a wider range of uses than originally anticipated. Being able to replicate causal thinking with a machine learning model, combined with the model's flexibility, yields a program with such a wide area of use that many different problems can be automated. One example is applying the program to verify that a sustainability policy is being followed, thereby contributing to sustainable development.
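As a sketch of how such a causal outlier detector can be assembled with DoWhy, the example below fits a structural causal model on synthetic transaction data and attributes an anomalous amount to its causal parents via DoWhy's gcm module. The graph, column names, and data are invented; the calls follow DoWhy's documented gcm interface but may differ across versions, and this is not necessarily the thesis's exact pipeline.

```python
# A sketch under stated assumptions: invented graph, columns, and data;
# gcm calls per DoWhy's documented interface (version-dependent).
import networkx as nx
import numpy as np
import pandas as pd
from dowhy import gcm

rng = np.random.default_rng(0)
quantity = rng.integers(1, 10, size=500)
unit_price = rng.normal(100, 10, size=500)
amount = quantity * unit_price + rng.normal(0, 5, size=500)   # true mechanism
data = pd.DataFrame({"quantity": quantity, "unit_price": unit_price,
                     "amount": amount})

# Assumed causal graph: quantity and unit_price jointly cause amount.
graph = nx.DiGraph([("quantity", "amount"), ("unit_price", "amount")])
model = gcm.InvertibleStructuralCausalModel(graph)
gcm.auto.assign_causal_mechanisms(model, data)
gcm.fit(model, data)

# A transaction whose amount does not follow the learned causal relations.
suspicious = pd.DataFrame({"quantity": [2], "unit_price": [100.0],
                           "amount": [5000.0]})
attributions = gcm.attribute_anomalies(model, target_node="amount",
                                       anomaly_samples=suspicious)
print(attributions)   # per-node contributions explaining the outlier
```

The attribution scores are what give the approach its explanatory quality: they point at which causal parent (or the node itself) is responsible for the deviation, rather than merely flagging the row.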
|
169 |
W2R: an ensemble Anomaly detection model inspired by language models for web application firewalls security
Wang, Zelong; AnilKumar, Athira, January 2023
Nowadays, web application attacks have increased tremendously due to the large number of users and applications. Industries are therefore paying more attention to web application firewalls (WAFs) and to improving their security; a WAF acts as a shield between the application and the internet by filtering and monitoring the HTTP traffic. Most works focus on either traditional feature extraction or deep methods that require no feature extraction. We noticed that a combination of an unsupervised language model and a classic dimension reduction method is less explored for this problem. Inspired by this gap, we propose a new unsupervised anomaly detection model with better results than the existing state-of-the-art model for anomaly detection in WAF security. This paper explores WAF security through the following structure: 1) feature extraction from HTTP traffic packets using NLP (natural language processing) methods such as word2vec and BERT; 2) dimension reduction by PCA and autoencoder; and 3) anomaly detection using different techniques, including OCSVM, isolation forest, LOF, and combinations of these algorithms, to explore how these methods affect the results. We used the CSIC 2010 and ECML/PKDD 2007 datasets, and the model achieves better results.
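A simplified sketch of the detection stage: request strings are embedded (here with character-level TF-IDF as a lightweight stand-in for word2vec/BERT), reduced with PCA, and scored by an ensemble of OCSVM, isolation forest, and LOF combined by majority vote. All parameters and the toy requests are illustrative assumptions, not the paper's configuration.

```python
# A simplified sketch: char-level TF-IDF stands in for word2vec/BERT,
# PCA reduces dimensionality, and three detectors vote by majority.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

requests = [f"GET /item?id={i} HTTP/1.1" for i in range(95)] + \
           ["GET /item?id=1 OR 1=1 -- HTTP/1.1"] * 5   # injection-style traffic

X = TfidfVectorizer(analyzer="char", ngram_range=(2, 3)).fit_transform(requests)
Z = PCA(n_components=5).fit_transform(X.toarray())

votes = np.vstack([
    OneClassSVM(nu=0.1).fit(Z).predict(Z),
    IsolationForest(contamination=0.1, random_state=0).fit(Z).predict(Z),
    LocalOutlierFactor(contamination=0.1).fit_predict(Z),
])
is_anomaly = (votes == -1).sum(axis=0) >= 2    # majority of three detectors
print(is_anomaly[-5:])                          # the crafted requests flag as True
```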
|
170 |
Low Rank and Sparse Representation for Hyperspectral Imagery Analysis
Sumarsono, Alex Hendro, 11 December 2015
This dissertation develops new techniques employing low-rank and sparse representation approaches to improve the performance of state-of-the-art algorithms in hyperspectral image analysis. The contributions of this dissertation are outlined as follows.
1) Low-rank and sparse representation approaches, i.e., low-rank representation (LRR) and low-rank subspace representation (LRSR), are proposed for hyperspectral image analysis, including target and anomaly detection, estimation of the number of signal subspaces, and supervised and unsupervised classification.
2) In supervised target and unsupervised anomaly detection, performance can be improved by using the LRR sparse matrix. To further increase detection accuracy, the data is partitioned into several highly correlated groups; target detection is performed in each group, and the final result is generated by fusing the outputs of the individual detectors.
3) In the estimation of the number of signal subspaces, the LRSR low-rank matrix is used in conjunction with direct rank calculation and soft-thresholding. Compared to the state-of-the-art algorithms, the LRSR approach delivers the most accurate and consistent results across different datasets.
4) In supervised and unsupervised classification, the use of LRR and LRSR low-rank matrices can improve classification accuracy, with the latter giving the more significant improvement. The investigation of state-of-the-art classifiers demonstrates that, as a preprocessing step, LRR and LRSR produce low-rank matrices with fewer outliers or trivial spectral variations, thereby enhancing class separability.
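To make the low-rank-plus-sparse idea concrete, the sketch below implements the classic principal component pursuit (robust PCA) decomposition via ADMM in NumPy, recovering a low-rank matrix L and a sparse outlier matrix S from their sum. This is the generic model family that LRR and LRSR build on, not the dissertation's exact algorithms; for hyperspectral data, M would be the pixels-by-bands matrix, with anomalies landing in S.

```python
# A generic robust-PCA (principal component pursuit) sketch via ADMM:
# decompose M into low-rank L plus sparse S. Illustrative, not the
# dissertation's LRR/LRSR algorithms.
import numpy as np

def rpca(M, n_iter=200):
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))                   # standard PCP weight
    mu = 0.25 * m * n / (np.abs(M).sum() + 1e-12)    # common step-size choice
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    shrink = lambda X, t: np.sign(X) * np.maximum(np.abs(X) - t, 0.0)
    for _ in range(n_iter):
        # Singular value thresholding for the low-rank component
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(shrink(s, 1.0 / mu)) @ Vt
        # Soft thresholding for the sparse component (outliers/anomalies)
        S = shrink(M - L + Y / mu, lam / mu)
        Y += mu * (M - L - S)                        # dual ascent update
    return L, S

rng = np.random.default_rng(0)
low_rank = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 50))
sparse = np.zeros((100, 50))
sparse[rng.integers(0, 100, 30), rng.integers(0, 50, 30)] = 10.0
L, S = rpca(low_rank + sparse)
svals = np.linalg.svd(L, compute_uv=False)
print((svals > 1e-3 * svals[0]).sum(), int((np.abs(S) > 1.0).sum()))  # ~5, ~30
```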
|