In the era of the Internet of Things (IoT) and big data, collecting, processing, and analyzing enormous volumes of data poses unprecedented challenges, even when the data are stored in preprocessed form. Anomaly detection, viewed statistically as identifying outliers that have low probability under a model of the data distribution p(x), is becoming increasingly crucial. This Master's thesis presents two novel deep anomaly detection frameworks, one supervised and one unsupervised, that achieve state-of-the-art performance on a range of datasets.
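The statistical view of anomaly detection can be illustrated with a minimal sketch. It assumes a one-dimensional Gaussian as the model of p(x), which is far simpler than the deep models in the thesis. The function names, sample values, and threshold rule are all hypothetical and chosen only for illustration.

```python
import math
import statistics

def fit_gaussian(samples):
    """Estimate the mean and standard deviation of a 1-D Gaussian model of p(x)."""
    return statistics.fmean(samples), statistics.stdev(samples)

def log_density(x, mean, std):
    """Log of the Gaussian density at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def is_anomaly(x, mean, std, log_threshold):
    """Flag x as an outlier when its log-density falls below the threshold."""
    return log_density(x, mean, std) < log_threshold

# Hypothetical "normal" training samples.
normal_samples = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
mean, std = fit_gaussian(normal_samples)

# Assumed threshold rule: the lowest log-density seen among the training samples.
log_threshold = min(log_density(x, mean, std) for x in normal_samples)

print(is_anomaly(10.0, mean, std, log_threshold))
print(is_anomaly(15.0, mean, std, log_threshold))
```

In practice the threshold would be tuned on held-out data rather than taken from the training minimum.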
A capsule network (CapsNet) is an advanced artificial neural network that can encode the intrinsic spatial relationships between parts and a whole. This property allows it to work as both a classifier and a deep autoencoder. Taking advantage of this property, a new anomaly detection technique named AnoCapsNet is proposed, and three normality score functions are designed for evaluating the "outlierness" of unseen images: a prediction-probability-based (PP-based) function, a reconstruction-error-based (RE-based) function, and a function that combines the two (PP+RE-based). The results on four datasets demonstrate that the PP-based method performs consistently well, while the RE-based approach is relatively sensitive to the similarity between labeled and unlabeled images. The PP+RE-based approach effectively takes advantage of both methods and achieves state-of-the-art results.
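The abstract does not give the exact form of the three normality scores, so the following is only a sketch under stated assumptions. It takes the PP-based score to be the largest predicted class probability, the RE-based score to be the negative mean squared reconstruction error, and the PP+RE-based score to be a weighted sum of the two. The function names, the `weight` parameter, and the toy inputs are hypothetical and are not taken from the thesis.

```python
def pp_normality(class_probs):
    """Assumed PP-based score: the highest predicted class probability."""
    return max(class_probs)

def re_normality(image, reconstruction):
    """Assumed RE-based score: negative mean squared reconstruction error."""
    mse = sum((a - b) ** 2 for a, b in zip(image, reconstruction)) / len(image)
    return -mse

def pp_re_normality(class_probs, image, reconstruction, weight=1.0):
    """Assumed combined score: PP score plus a weighted RE score."""
    return pp_normality(class_probs) + weight * re_normality(image, reconstruction)

# Toy example: flattened 4-pixel "images" with made-up classifier outputs.
seen_like = pp_re_normality(
    class_probs=[0.05, 0.90, 0.05],
    image=[0.0, 1.0, 1.0, 0.0],
    reconstruction=[0.1, 0.9, 1.0, 0.0],
)
unseen_like = pp_re_normality(
    class_probs=[0.40, 0.35, 0.25],
    image=[1.0, 0.0, 0.0, 1.0],
    reconstruction=[0.5, 0.5, 0.5, 0.5],
)

# A higher score means "more normal".
print(seen_like > unseen_like)
```

A confident prediction with an accurate reconstruction scores higher than an uncertain prediction with a poor reconstruction, which is the behavior a normality score is meant to capture.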
In many situations, the domain of anomalous samples cannot be fully understood, and the domain of normal samples is not straightforward either. Deep generative models are therefore more suitable than supervised methods in such cases. As a variant of the variational autoencoder (VAE), beta-VAE is designed for the automated discovery of interpretable factorised latent representations from raw image data in a completely unsupervised manner. t-Distributed Stochastic Neighbor Embedding (t-SNE), an unsupervised non-linear technique used primarily for data exploration and for visualizing high-dimensional data, is good at creating a single map that reveals local structure and important global structure at many different scales. Taking advantage of both disentangled representation learning (implemented with beta-VAE) and low-dimensional neighbor embedding (implemented with t-SNE), another novel anomaly detection approach named AnoDM (Anomaly detection based on unsupervised Disentangled representation learning and Manifold learning) is presented. A new anomaly score function is defined by combining (1) the reconstruction error of beta-VAE and (2) the distances between latent representations in the t-SNE space. The framework is general, so any disentangled representation learning technique and any low-dimensional embedding technique can be applied. AnoDM is evaluated on both image and time-series data and achieves better results than models that use only one of the two measures and than other existing advanced deep learning methods.
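The abstract does not state how the two measures are combined, so the sketch below is one plausible reading. It assumes the reconstruction error and the low-dimensional embedding coordinates have already been computed by some model. It takes the distance term to be the mean distance to the k nearest embedded normal samples, and combines the two terms with a convex weight. The function names and the `alpha` and `k` parameters are hypothetical.

```python
import math

def knn_distance(point, reference_points, k=3):
    """Mean Euclidean distance from point to its k nearest reference points."""
    distances = sorted(math.dist(point, ref) for ref in reference_points)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)

def anomaly_score(recon_error, embedded_point, embedded_normals, alpha=0.5, k=3):
    """Assumed combination: weighted sum of reconstruction error and embedding distance."""
    distance = knn_distance(embedded_point, embedded_normals, k)
    return alpha * recon_error + (1.0 - alpha) * distance

# Made-up 2-D embedding coordinates of normal samples.
embedded_normals = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]

# Made-up reconstruction errors and embedding positions for two test samples.
inlier_score = anomaly_score(0.01, (0.05, 0.05), embedded_normals)
outlier_score = anomaly_score(0.20, (3.0, 3.0), embedded_normals)

# A higher score means "more anomalous".
print(inlier_score < outlier_score)
```

Because the two terms can live on different scales, a real implementation would likely normalize each before combining them. The weight `alpha` here simply stands in for that choice.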
Identifier | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/40401 |
Date | 20 April 2020 |
Creators | Li, Xiaoyan |
Contributors | Yeap, Tet; Kiringa, Iluju |
Publisher | Université d'Ottawa / University of Ottawa |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |