Online Anomaly Detection for Time Series. Towards Incorporating Feature Extraction, Model Uncertainty and Concept Drift Adaptation for Improving Anomaly Detection
Tambuwal, Ahmad I. January 2021
Time series anomaly detection receives increasing research interest given
the growing number of data-rich application domains. Recent additions
to anomaly detection methods in research literature include deep learning
algorithms. The nature and performance of these algorithms in sequence
analysis enable them to learn hierarchical discriminating features
and time-series temporal nature. However, their performance is affected
by the speed at which the time series arrives, the use of a fixed threshold,
and the assumption of Gaussian distribution on the prediction error
to identify anomalous values. An exact parametric distribution is often
not directly relevant in practice, and it is difficult to select
an appropriate threshold that will differentiate anomalies from noise.
Thus, implementations need a Prediction Interval (PI) that quantifies the
level of uncertainty associated with the Deep Neural Network (DNN) point
forecasts, which helps in making better-informed decisions and mitigates
false anomaly alerts. To achieve this, a new anomaly detection
method is proposed that computes the uncertainty in estimates using quantile
regression and uses the quantile interval to identify anomalies. Similarly,
to handle the speed at which the data arrives, an online anomaly detection
method is proposed where a model is trained incrementally to adapt
to concept drift, which improves prediction. This is implemented using a
window-based strategy, in which a time series is broken into sliding windows
of sub-sequences as input to the model. To adapt to concept drift,
the model is updated when changes occur in newly arriving instances.
This is achieved using an anomaly likelihood, computed with the
Q-function, that defines the abnormal degree of the current data point
relative to the previous data points. Specifically, when concept drift
occurs, the proposed method marks the current data point as anomalous.
However, when the abnormal behavior persists for a longer period,
the likelihood assigns the current data point a low abnormal degree
compared to the previous data points. The current data point is then
added to the previous data to retrain the model, which allows the model
to learn the new characteristics of the data and hence adapt to the concept
changes, thereby redefining the abnormal behavior. The proposed method
also incorporates feature extraction to capture structural patterns in the
time series. This is especially significant for multivariate time-series data,
for which there is a need to capture the complex temporal dependencies
that may exist between the variables. In summary, this thesis contributes
to the theory, design, and development of algorithms and models for the
detection of anomalies in both static and evolving time series data.
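
The interval-based detection described above can be illustrated with a minimal sketch. It assumes a model has already produced lower and upper quantile forecasts (for example at levels 0.05 and 0.95, which are illustrative choices and not values taken from the thesis). The function names are hypothetical, and the network that would be trained with the quantile (pinball) loss is not shown.

```python
def sliding_windows(series, window):
    """Break a series into overlapping sub-sequences and next-step targets."""
    if window < 1 or len(series) <= window:
        raise ValueError("series must be longer than the window")
    inputs = [list(series[i:i + window]) for i in range(len(series) - window)]
    targets = list(series[window:])
    return inputs, targets


def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for a quantile level q in (0, 1).

    Under-prediction is weighted by q and over-prediction by (1 - q),
    so minimizing it drives y_pred towards the q-th conditional quantile.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


def flag_outside_interval(values, lower, upper):
    """Mark each observation that falls outside its predicted quantile interval."""
    return [v < lo or v > hi for v, lo, hi in zip(values, lower, upper)]
```

In this sketch no distribution is assumed for the prediction error and no fixed threshold is chosen: a point is flagged only when it leaves the interval predicted for its own time step.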
Several experiments were conducted, and the results obtained indicate the
significance of this research on offline and online anomaly detection in
both static and evolving time-series data. In chapter 3, the newly proposed
method (Deep Quantile Regression Anomaly Detection, DQR-AD) is evaluated
and compared with six other prediction-based anomaly detection
methods that assume a normal distribution of prediction or reconstruction
error for the identification of anomalies. Results in the first part of
the experiment indicate that DQR-AD obtained relatively better precision
than all other methods, which demonstrates the capability of the method
to detect a higher number of anomalous points with a low false-positive
rate. The results also show that DQR-AD is approximately 2 to 3
times better than DeepAnT, which itself performs better than all the remaining
methods on all domains in the NAB dataset. In the second part of the
experiment, the SMAP dataset is used with 4-dimensional features to demonstrate
the method on multivariate time-series data. Experimental results
show that DQR-AD performs 10% better than AE on three datasets
(SMAP1, SMAP3, and SMAP5) and equally well on the remaining
two datasets. In chapter 5, two levels of experiments were conducted
on the basis of false-positive rate and concept drift adaptation. In the first level
of the experiment, the results show that online DQR-AD is 18% better
than both DQR-AD and VAE-LSTM on five NAB datasets. Similarly, results
in the second level of the experiment show that the online DQR-AD
method outperforms five counterpart methods by a margin of roughly
10% on six out of the seven NAB datasets. This result
demonstrates how concept drift adaptation strategies adopted in the proposed
online DQR-AD improve the performance of anomaly detection in
time series. / Petroleum Technology Development Fund (PTDF)
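
The anomaly-likelihood idea behind the concept drift adaptation can be sketched as follows. This is one common formulation, assumed here for illustration and not necessarily the exact formula used in the thesis: the mean of the most recent anomaly scores is compared with the mean and standard deviation of the whole score history, and the Gaussian tail function Q maps the standardized difference to a likelihood. The names `anomaly_likelihood` and `short_window` are hypothetical.

```python
import math


def q_function(x):
    """Tail probability of the standard normal distribution, Q(x) = P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def anomaly_likelihood(scores, short_window=5):
    """Likelihood in [0, 1] that the most recent scores are unusually high.

    scores: per-point anomaly scores, oldest first (assumed input).
    short_window: number of most recent scores averaged (assumed parameter).
    """
    if len(scores) <= short_window:
        raise ValueError("need more scores than short_window")
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = max(math.sqrt(variance), 1e-6)  # floor avoids division by zero
    recent_mean = sum(scores[-short_window:]) / short_window
    return 1.0 - q_function((recent_mean - mean) / std)
```

With a short burst of high scores after a long quiet history the likelihood is close to 1, while the same high scores sustained over a long period yield a lower value because they have become part of the history. That drop is the signal that, in an online setting, could trigger adding the new points to the training data and retraining.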