Global ETD Search

891	Towards provably safe and robust learning-enabled systems Fan, Jiameng 26 August 2022 Machine learning (ML) has demonstrated great success in numerous complicated tasks. Fueled by these advances, many real-world systems like autonomous vehicles and aircraft are adopting ML techniques by adding learning-enabled components. Unfortunately, ML models widely used today, like neural networks, lack the necessary mathematical framework to provide formal guarantees on safety, causing growing concerns over these learning-enabled systems in safety-critical settings. In this dissertation, we tackle this problem by combining formal methods and machine learning to bring provable safety and robustness to learning-enabled systems. We first study the robustness verification problem of neural networks on classification tasks. We focus on providing provable safety guarantees on the absence of failures under arbitrarily strong adversaries. We propose an efficient neural network verifier LayR to compute a guaranteed and overapproximated range for the output of a neural network given an input set which contains all possible adversarially perturbed inputs. LayR relaxes nonlinear units in neural networks using linear bounds and refines such relaxations with mixed integer linear programming (MILP) to iteratively improve the approximation precision, which achieves tighter output range estimations compared to prior neural network verifiers. However, the neural network verifier focuses more on analyzing a trained neural network but less on learning provably safe neural networks. To tackle this problem, we study verifiable training that incorporates verification into training procedures to train provably safe neural networks and scale to larger models and datasets. We propose a novel framework, AdvIBP, for combining adversarial training and provable robustness verification. 
We show that the proposed framework can learn provably robust neural networks at a sublinear convergence rate. In the second part of the dissertation, we study the verification of system-level properties in neural-network controlled systems (NNCS). We focus on proving bounded-time safety properties by computing reachable sets. We first introduce two efficient NNCS verifiers, ReachNN* and POLAR, that construct polynomial-based overapproximations of neural-network controllers. We transform NNCSs into tractable closed-loop systems with approximated polynomial controllers, so that reachable sets can be computed using existing reachability analysis tools for dynamical systems. The combination of polynomial overapproximations and reachability analysis tools opens promising directions for NNCS verification. We also include a survey and experimental study of existing NNCS verification methods, including combining state-of-the-art neural network verifiers with reachability analysis tools, to discuss what overapproximation is suitable for NNCS reachability analysis. While these verifiers enable proving safety properties of NNCS, the nonlinearity of neural-network controllers is the main bottleneck that limits their efficiency and scalability. We propose a novel knowledge distillation framework to control “the degree of nonlinearity” of NN controllers and thereby ease NNCS verification, which improves the provable safety of NNCSs, especially when they are safe but cannot be verified due to their complexity. For the verification community, this opens up the possibility of reducing verification complexity by influencing how a system is trained. Though NNCS verification can prove safety when system models are known, modern deep learning, e.g., deep reinforcement learning (DRL), often targets tasks with unknown system models, also known as the model-free setting. To tackle this issue, we first focus on safe exploration of DRL and propose a novel Lyapunov-inspired method. 
Our method uses Gaussian Process models to provide probabilistic guarantees on the policies, and guide the exploration of the unknown environment in a safe fashion. Then, we study learning robust visual control policies in DRL to enhance the robustness against visual changes that were not seen during training. We propose a method DRIBO, which can learn robust state representations for RL via a novel contrastive version of the Multi-View Information Bottleneck (MIB). This approach enables us to train high-performance visual policies that are robust to visual distractions, and can generalize well to unseen environments. Artificial intelligence Deep neural networks Deep reinforcement learning Formal verification Machine learning Provably safe training Reachability analysis
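The idea of computing a guaranteed, over-approximated output range for a set of perturbed inputs can be illustrated with plain interval bound propagation. The sketch below is not LayR (which the abstract describes as using linear relaxations refined with MILP) and not AdvIBP; it is a coarser, illustrative technique, and the function names, the list-based layer format, and the L-infinity perturbation radius `eps` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = Sequence[Sequence[float]]
Layer = Tuple[Matrix, Sequence[float]]  # (weights, bias); hypothetical layer format


def affine_bounds(lower: Vector, upper: Vector,
                  weights: Matrix, bias: Sequence[float]) -> Tuple[Vector, Vector]:
    """Propagate an axis-aligned box through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                # A non-negative weight keeps the ordering of the interval ends.
                lo += w * l
                hi += w * u
            else:
                # A negative weight swaps which end gives the minimum.
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def output_range(x: Sequence[float], eps: float,
                 layers: Sequence[Layer]) -> Tuple[Vector, Vector]:
    """Bounds that contain the network output for every input within eps of x."""
    lower = [v - eps for v in x]
    upper = [v + eps for v in x]
    for i, (weights, bias) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, weights, bias)
        if i < len(layers) - 1:  # no activation after the last layer
            lower, upper = relu_bounds(lower, upper)
    return lower, upper
```

The returned box is sound but generally loose: it always contains the true output set, and tighter relaxations such as those described in the abstract aim to shrink the gap.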
892	Adversarial attacks and defense mechanisms to improve robustness of deep temporal point processes Samira Khorshidi (13141233) 08 September 2022 Temporal point processes (TPP) are mathematical approaches for modeling asynchronous event sequences by considering the temporal dependency of each event on past events and its instantaneous rate. Temporal point processes can model various problems, from earthquake aftershocks, trade orders, gang violence, and reported crime patterns, to network analysis, infectious disease transmissions, and virus spread forecasting. In each of these cases, the entity's behavior with the corresponding information is noted over time as an asynchronous event sequence, and the analysis is done using temporal point processes, which provide a means to define the generative mechanism of the sequence of events and ultimately predict events and investigate causality. Among point processes, the Hawkes process, a stochastic point process, is able to model a wide range of contagious and self-exciting patterns. One of the Hawkes process's well-known applications is predicting the evolution of viral processes on networks, which is an important problem in biology, the social sciences, and the study of the Internet. In existing works, mean-field analysis based upon degree distribution is used to predict viral spreading across networks of different types. However, it has been shown that degree distribution alone fails to predict the behavior of viruses on some real-world networks. Recent attempts have been made to use assortativity to address this shortcoming. This thesis illustrates how the evolution of such a viral process is sensitive to the underlying network's structure. In Chapter 3, we show that adding assortativity does not fully explain the variance in the spread of viruses for a number of real-world networks. 
We propose using the graphlet frequency distribution combined with assortativity to explain variations in the evolution of viral processes across networks with identical degree distribution. Using a data-driven approach, by coupling predictive modeling with viral process simulation on real-world networks, we show that simple regression models based on graphlet frequency distribution can explain over 95% of the variance in virality on networks with the same degree distribution but different network topologies. Our results highlight the importance of graphlets and identify a small collection of graphlets that may have the most significant influence over the viral processes on a network. Due to the flexibility and expressiveness of deep learning techniques, several neural network-based approaches have recently shown promise for modeling point process intensities. However, there is a lack of research on possible adversarial attacks and on the robustness of such models to adversarial attacks and natural shocks to systems. Furthermore, while neural point processes may outperform simpler parametric models on in-sample tests, how these models perform when encountering adversarial examples or sharp non-stationary trends remains unknown. In Chapter 4, we propose several white-box and black-box adversarial attacks against deep temporal point processes. Additionally, we investigate the transferability of white-box adversarial attacks against point processes modeled by deep neural networks, which are considered a more elevated risk. Extensive experiments confirm that neural point processes are vulnerable to adversarial attacks. Such a vulnerability is illustrated both in terms of predictive metrics and the effect of attacks on the underlying point process's parameters. 
Specifically, adversarial attacks successfully shift the temporal Hawkes process regime from sub-critical to super-critical and manipulate the modeled parameters, which is considered a risk to parametric modeling approaches. Additionally, we evaluate the vulnerability and performance of these models in the presence of non-stationary abrupt changes, using crime and Covid-19 pandemic datasets as examples. Despite the success of deep learning techniques in modeling temporal point processes, the security vulnerability of deep-learning models, including deep temporal point processes, to adversarial attacks makes it essential to ensure the robustness of the deployed algorithms. In Chapter 5, we study the robustness of deep temporal point processes against several proposed adversarial attacks from the adversarial defense viewpoint. Specifically, we investigate the effectiveness of adversarial training using universal adversarial samples in improving the robustness of the deep point processes. Additionally, we propose a general point process domain-adopted (GPDA) regularization, which is strictly applicable to temporal point processes, to reduce the effect of adversarial attacks and acquire an empirically robust model. In this approach, unlike other computationally expensive approaches, there is no need for additional back-propagation in the training step, and no further network is required. Ultimately, we propose an adversarial detection framework that has been trained in the manner of a Generative Adversarial Network (GAN) and solely on clean training data. Finally, in Chapter 6, we discuss implications of the research and future research directions. Adversarial Attack and Defense Deep Learning Applications Deep Learning Theory Point processes
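The sub-critical and super-critical regimes mentioned above can be made concrete for the simplest textbook case. The sketch below assumes a univariate Hawkes process with an exponential kernel, so the conditional intensity is mu + sum of alpha * exp(-beta * (t - t_i)) over past events and the branching ratio is alpha / beta. The abstract does not state which kernel or parameterization the thesis uses, so the kernel choice, the parameter names `mu`, `alpha`, `beta`, and the function names are assumptions for illustration only.

```python
import math
from typing import Sequence


def hawkes_intensity(t: float, history: Sequence[float],
                     mu: float, alpha: float, beta: float) -> float:
    """Conditional intensity at time t given earlier event times (exponential kernel)."""
    excitation = sum(alpha * math.exp(-beta * (t - s)) for s in history if s < t)
    return mu + excitation


def branching_ratio(alpha: float, beta: float) -> float:
    """Expected number of direct offspring per event: the integral of the kernel."""
    return alpha / beta


def regime(alpha: float, beta: float) -> str:
    """Classify the process by its branching ratio."""
    n = branching_ratio(alpha, beta)
    if n < 1.0:
        return "sub-critical"    # cascades die out; the process is stationary
    if n > 1.0:
        return "super-critical"  # cascades can grow without bound
    return "critical"
```

Under this assumption, an attack that pushes the fitted `alpha / beta` from below 1 to above 1 changes the qualitative behavior the model predicts, which is one way to read the parameter-manipulation risk described in the abstract.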
893	Deep reinforcement learning for automated building climate control Snällfot, Erik, Hörnberg, Martin January 2024 The building sector is the single largest contributor to greenhouse gas emissions, making it a natural focal point for reducing energy consumption. More efficient use of energy is also becoming increasingly important for property managers as global energy prices are skyrocketing. This report is conducted on behalf of Sustainable Intelligence, a Swedish company that specializes in building automation solutions. It investigates whether deep reinforcement learning (DRL) algorithms can be implemented in a building control environment, whether they can be more effective than traditional solutions, and whether this can be achieved in reasonable time. The algorithms tested were Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO). They were implemented in a simulated BOPTEST environment in Brussels, Belgium, along with a traditional heating curve and a PI controller as benchmarks. DDPG never converged, but PPO managed to reduce energy consumption compared to the best benchmark, with only slightly worse thermal discomfort. The results indicate that DRL algorithms can be implemented in a building environment and reduce greenhouse gas emissions within a reasonable training time. This might be especially interesting in a complex building, where DRL can adapt and scale better than traditional solutions. Further research, along with implementations on physical buildings, is needed to determine whether DRL is the superior option. Machine Learning Reinforcement Learning Deep Learning Deep Reinforcement Learning Building Control Control System Engineering and Technology Teknik och teknologier Building Technologies Husbyggnad
894	Automated detection of e-scooter helmet use with deep learning Siebert, Felix W., Riis, Christoffer, Janstrup, Kira H., Kristensen, Jakob, Gül, Oguzhan, Lin, Hanhe, Hüttel, Frederik B. 19 December 2022 E-scooter riders have an increased crash risk compared to cyclists [1]. Hospital data show increasing numbers of injured e-scooter riders, with head injuries as one of the most common injury types [2]. To decrease this high prevalence of head injuries, the use of e-scooter helmets could present a potential countermeasure [3]. Despite this, studies show generally low helmet use rates in countries without mandatory helmet use laws [4][5][6]. In countries with mandatory helmet use laws for e-scooter riders, helmet use rates are higher, but generally remain lower than bicycle helmet use rates [7]. As the helmet use rate is a central factor for the safety of e-scooter riders in case of a crash and a key performance indicator in the European Commission's Road Safety Policy Framework 2021-2030 [8], efficient e-scooter helmet use data collection methods are needed. However, currently, human observers are used to register e-scooter helmet use either in direct roadside observations or in indirect video-based observation, which is time-consuming and costly. In this study, a deep learning-based method for the automated detection of e-scooter helmet use in video data was developed and tested, with the aim to provide an efficient data collection tool for road safety researchers and practitioners.
895	Geospatial Trip Data Generation Using Deep Neural Networks / Generering av Geospatiala Resedata med Hjälp av Djupa Neurala Nätverk Deepak Udapudi, Aditya January 2022 The development of deep learning methods depends largely on the availability of large amounts of high-quality data. One workaround for data scarcity is to generate synthetic data using deep learning methods. Trajectory data in particular brings added challenges, such as strong dependencies between the spatial and temporal components, sensitivity to geographical context, and privacy laws that protect individuals from being traced through their mobility patterns. This project is an attempt to overcome these challenges by exploring the capabilities of Generative Adversarial Networks (GANs) to generate synthetic trajectories which have characteristics close to the original trajectories. A naive model is designed as a baseline for comparison with a Long Short-Term Memory (LSTM)-based GAN. GANs are generally associated with image data, which is why Convolutional Neural Network (CNN)-based GANs are very popular in recent studies. However, in this project an LSTM-based GAN was chosen in order to explore its capabilities and its strength in handling long-term dependencies in sequential data. The methods are evaluated qualitatively, by visually inspecting the trajectories on a real-world map, and quantitatively, by calculating the statistical distance between the underlying data distributions of the original and synthetic trajectories. Results indicate that the baseline method performed better than the GAN model. The baseline model generated trajectories that had feasible spatial and temporal components, whereas the GAN model was able to learn the spatial component of the data well but not the temporal component. 
Conditional map information could be added as part of training the networks and this can be a research question for future work. / The development of deep learning methods depends to a large extent on the availability of large amounts of high-quality data. One solution to the problem of data scarcity is to generate synthetic data using deep learning. Especially when handling trajectory data, additional challenges arise, such as strong dependencies between the spatial and temporal components, geographically sensitive contexts, and privacy laws that protect individuals from being traced through their mobility patterns. This project is an attempt to overcome these challenges by exploring the capacity of generative adversarial networks (GANs) to generate synthetic trajectories with characteristics close to those of the original trajectories. A naive model is designed as a baseline for comparison with an LSTM-based GAN. GANs are generally associated with image data, which is why CNN-based GANs are very popular in recent studies. In this project, however, an LSTM-based GAN was chosen in order to explore its ability and strength in handling long-term dependencies and sequential data well. The methods are evaluated using qualitative measures, by visually inspecting the trajectories on a real-world map, and quantitative measures, by calculating the statistical distance between the underlying data distributions of the original and synthetic trajectories. The results indicate that the implemented baseline method performed better than the GAN model. The baseline model generated trajectories with feasible spatial and temporal components, whereas the GAN model learned the spatial component of the data well but not the temporal component. Conditional map information could be added as part of training the networks, and this can be a research question for future work. 
Deep Learning Geospatial Generative Adversarial Network (GAN) Deep Learning Geospatial Generativa Motståndsnätverk (GAN) Computer and Information Sciences Data- och informationsvetenskap
896	Optimizing Accuracy-Efficiency Tradeoffs in Emerging Neural Workloads Amrit Nagarajan (17593524) 11 December 2023 Deep Neural Networks (DNNs) are constantly evolving, enabling the power of deep learning to be applied to an ever-growing range of applications, such as Natural Language Processing (NLP), recommendation systems, graph processing, etc. However, these emerging neural workloads present large computational demands for both training and inference. In this dissertation, we propose optimizations that take advantage of the unique characteristics of different emerging workloads to simultaneously improve accuracy and computational efficiency. First, we consider Language Models (LMs) used in NLP. We observe that the design process of LMs (pre-train a foundation model, and subsequently fine-tune it for different downstream tasks) leads to models that are highly over-parameterized for the downstream tasks. We propose AxFormer, a systematic framework that applies accuracy-driven approximations to create accurate and efficient LMs for a given downstream task. AxFormer eliminates task-irrelevant knowledge, and helps the model focus only on the relevant parts of the input. Second, we find that during fine-tuning of LMs, the presence of variable-length input sequences necessitates the use of padding tokens when batching sequences, leading to ineffectual computations. It is also well known that LMs over-fit to the small task-specific training datasets used during fine-tuning, despite the use of known regularization techniques. 
Based on these insights, we present TokenDrop + BucketSampler, a framework that synergistically combines a new regularizer that drops a random subset of insignificant words in each sequence in every epoch, and a length-aware batching method to simultaneously reduce padding and address the overfitting issue. Next, we address the computational challenges of Transformers used for processing inputs of several important modalities, such as text, images, audio and videos. We present Input Compression with Positional Consistency (ICPC), a new data augmentation method that applies varying levels of compression to each training sample in every epoch, thereby simultaneously reducing over-fitting and improving training efficiency. ICPC also enables efficient variable-effort inference, where easy samples can be inferred at high compression levels, and vice-versa. Finally, we focus on optimizing Graph Neural Networks (GNNs), which are commonly used for learning on non-Euclidean data. Few-shot learning with GNNs is an important challenge, since real-world graphical data is often sparsely labeled. Self-training, wherein the GNN is trained in stages by augmenting the training data with a subset of the unlabeled data and their pseudolabels, has emerged as a promising approach. However, self-training significantly increases the computational demands of training. We propose FASTRAIN-GNN, a framework for efficient and accurate self-training of GNNs with few labeled nodes. FASTRAIN-GNN optimizes the GNN architecture, training data, training parameters, and the graph topology during self-training. At inference time, we find that ensemble GNNs are significantly more accurate and robust than single-model GNNs, but suffer from high latency and storage requirements. To address this challenge, we propose GNN Ensembles through Error Node Isolation (GEENI). 
The key concept in GEENI is to identify nodes that are likely to be incorrectly classified (error nodes) and suppress their outgoing messages, leading to simultaneous accuracy and efficiency improvements. Natural language processing Computer vision Pattern recognition Deep Learning Accurately Deep learning efficiency Efficient training Efficient inference methods
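The padding overhead that motivates length-aware batching can be illustrated with a small sketch. This is not the BucketSampler implementation, whose details the abstract does not give; it only shows, under the assumption that every batch is padded to its longest sequence, how grouping sequences of similar length reduces the number of padding tokens. All function names are hypothetical.

```python
from typing import List, Sequence

Batch = List[Sequence[int]]


def naive_batches(seqs: Sequence[Sequence[int]], batch_size: int) -> List[Batch]:
    """Split sequences into batches in the order given."""
    return [list(seqs[i:i + batch_size]) for i in range(0, len(seqs), batch_size)]


def length_aware_batches(seqs: Sequence[Sequence[int]], batch_size: int) -> List[Batch]:
    """Sort by length first, so each batch holds sequences of similar length."""
    return naive_batches(sorted(seqs, key=len), batch_size)


def padding_tokens(batches: Sequence[Batch]) -> int:
    """Count padding tokens when each batch is padded to its longest sequence."""
    total = 0
    for batch in batches:
        longest = max(len(s) for s in batch)
        total += sum(longest - len(s) for s in batch)
    return total
```

For four sequences of lengths 2, 9, 3 and 8 with a batch size of 2, the unsorted split pads 12 tokens while the length-sorted split pads 2. A practical sampler would typically also shuffle the resulting batches each epoch so that training order is not tied to sequence length.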
897	Genetic and Methylation Analysis of CTNNB1 in Benign and Malignant Melanocytic Lesions Zaremba, Anne, Jansen, Philipp, Murali, Rajmohan, Mayakonda, Anand, Riedel, Anna, Krahl, Dieter, Burkhardt, Hans, John, Stefan, Géraud, Cyrill, Philip, Manuel, Kretz, Julia, Möller, Inga, Stadtler, Nadine, Sucker, Antje, Paschen, Annette, Ugurel, Selma, Zimmer, Lisa, Livingstone, Elisabeth, Horn, Susanne, Plass, Christoph, Schadendorf, Dirk, Hadaschik, Eva, Lutsik, Pavlo, Griewank, Klaus 05 December 2023 Melanocytic neoplasms have been genetically characterized in detail during the last decade. Recurrent CTNNB1 exon 3 mutations have been recognized in the distinct group of melanocytic tumors showing deep penetrating nevus-like morphology. In addition, they have been identified in 1–2% of advanced melanoma. Performing a detailed genetic analysis of difficult-to-classify nevi and melanomas with CTNNB1 mutations, we found that benign tumors (nevi) show characteristic morphological, genetic and epigenetic traits, which distinguish them from other nevi and melanoma. Malignant CTNNB1-mutant tumors (melanomas) demonstrated a different genetic profile, instead grouping clearly with other non-CTNNB1 melanomas in methylation assays. To further evaluate the role of CTNNB1 mutations in melanoma, we assessed a large cohort of clinically sequenced melanomas, identifying 38 tumors with CTNNB1 exon 3 mutations, including recurrent S45 (n = 13, 34%), G34 (n = 5, 13%), and S27 (n = 5, 13%) mutations. Locations and histological subtype of CTNNB1-mutated melanoma varied; none were reported as showing deep penetrating nevus-like morphology. The most frequent concurrent activating mutations were BRAF V600 (n = 21, 55%) and NRAS Q61 (n = 13, 34%). In our cohort, four of seven (58%) and one of nine (11%) patients treated with targeted therapy (BRAF and MEK Inhibitors) or immune-checkpoint therapy, respectively, showed disease control (partial response or stable disease). 
In summary, CTNNB1 mutations are associated with a unique melanocytic tumor type in benign tumors (nevi), which can be applied in a diagnostic setting. In advanced disease, no clear characteristics distinguishing CTNNB1-mutant from other melanomas were observed; however, studies of larger, optimally prospective, cohorts are warranted. info:eu-repo/classification/ddc/571.978 ddc:571.978
898	Development of a Complete Minuscule Microscope: Embedding Data Pipeline and Machine Learning Segmentation / Utveckling av ett Fullständigt Miniatyr-Mikroskop: Integrering av Dataflöde och Maskininlärningssegmentering Zec, Kenan January 2023 Cell culture is a fundamental procedure in many laboratories and precedes much research performed under the microscope. Despite the significance of this procedural stage, monitoring cells throughout growth has not been possible due to the absence of suitable equipment and methodological approaches. This thesis presents a low-cost, power-efficient and versatile microscope with small enough dimensions to operate inside an incubator. Besides image acquisition, the microscope comprises other functions such as a data pipeline, implemented to save the images on the user’s computer via a server whilst also offering storage of the images on an integrated micro SD card. Furthermore, a machine learning algorithm with a human-in-the-loop approach has been trained to segment the acquired images for cell proliferation and cell apoptosis tracking, and yielded promising results with an accuracy of 94%. For comparison, conventional segmentation techniques using operations such as the watershed function were deployed. The microscope described is versatile in operation, as it lets the user utilise one or more functions, depending on the purpose of the imaging. / Cell culture is a fundamental process in many laboratories and precedes research performed under the microscope. Despite the importance of incubation, monitoring cells at this stage has not been possible because of the lack of relevant equipment and methodological approaches. This master's-level thesis presents a low-cost, energy-efficient and versatile microscope of centimeter-scale dimensions, adapted for use in an incubator. 
In addition to image acquisition mechanisms, the microscope offers various functions, such as an integrated data pipeline that makes it possible to save images on the user's computer via a server while also offering storage of images on an integrated memory card. In addition, a human-in-the-loop machine learning algorithm for cell segmentation has been implemented in order to monitor cell division and cell death. This algorithm showed good results, with an accuracy of 94%. For comparison, a traditional watershed-based cell segmentation technique was also developed. The microscope can be called versatile because it allows the user to adapt the data pipeline and choose which functions to use, depending on the purpose of the imaging. Incubation-microscope Machine Learning Segmentation ESP32-Cam Deep Learning Inkubationsmikroskop Maskininlärnings-segmentering ESP32-Cam Deep Learning Physical Sciences Fysik
899	Toward the "Deep Learning" of Brain White Matter Structures Astolfi, Pietro 08 April 2022 In the brain, neuronal cells located in different functional regions communicate through a dense structural network of axons known as the white matter (WM) tissue. Bundles of axons that share similar pathways characterize the WM anatomy, which can be investigated in-vivo thanks to the recent advances of magnetic resonance (MR) techniques. Diffusion MR imaging combined with tractography pipelines allows for a virtual reconstruction of the whole WM anatomy of in-vivo brains, namely the tractogram. It consists of millions of WM fibers as 3D polylines, each approximating thousands of axons. From the analysis of a tractogram, neuroanatomists can characterize well-known white matter structures and detect anatomically non-plausible fibers, which are artifacts of the tractography and often constitute a large portion of it. The accurate characterization of tractograms is pivotal for several clinical and neuroscientific applications. However, such characterization is a complex and time-consuming process that is difficult to automate, as it requires properly encoding well-known anatomical priors. In this thesis, we propose to investigate the encoding of anatomical priors with a supervised deep learning framework. The ultimate goal is to reduce the presence of artifactual fibers to enable a more accurate automatic process of WM characterization. We frame the problem by distinguishing between volumetric and non-volumetric representations of white matter structures. In the first case, we learn the segmentation of the WM regions that represent relevant anatomical waypoints not yet classified by WM atlases. We investigate using Convolutional Neural Networks (CNNs) to exploit the volumetric representation of such priors. In the second case, the goal is to learn from the 3D polyline representation of fibers where the typical CNN models are not suitable. 
We introduce the novelty of using Geometric Deep Learning (GDL) models designed to process data having an irregular representation. The working assumption is that the geometrical properties of fibers are informative for the detection of tractogram artifacts. As a first contribution, we present StemSeg, which extends the use of CNNs to detect the WM portion representing the waypoints of all the fibers for a specific bundle. This anatomical landmark, called the stem, can be critical for extracting that bundle. We provide the results of an empirical analysis focused on the Inferior Fronto-Occipital Fasciculus (IFOF). The effective segmentation of the stem improves the final segmentation of the IFOF, outperforming the reference state of the art by a significant margin. As a second and major contribution, we present Verifyber, a supervised tractogram filtering approach based on GDL that distinguishes between anatomically plausible and non-plausible fibers. The proposed model is designed to learn anatomical features directly from the fiber represented as a sequence of 3D points. The extended empirical analysis on healthy and clinical subjects reveals multiple benefits of Verifyber: high filtering accuracy, low inference time, flexibility to different plausibility definitions, and good generalization. Overall, this thesis constitutes a step toward characterizing white matter using deep learning. It provides effective ways of encoding anatomical priors and an original deep learning model designed for fibers.
900	Online Anomaly Detection for Time Series. Towards Incorporating Feature Extraction, Model Uncertainty and Concept Drift Adaptation for Improving Anomaly Detection Tambuwal, Ahmad I. January 2021 Time series anomaly detection receives increasing research interest given the growing number of data-rich application domains. Recent additions to anomaly detection methods in the research literature include deep learning algorithms. The nature and performance of these algorithms in sequence analysis enable them to learn hierarchical discriminating features and the temporal structure of time series. However, their performance is affected by the speed at which the time series arrives, the use of a fixed threshold, and the assumption of a Gaussian distribution on the prediction error to identify anomalous values. An exact parametric distribution is often not directly relevant in many applications, and it is often difficult to select an appropriate threshold that will differentiate anomalies from noise. Thus, implementations need the Prediction Interval (PI) that quantifies the level of uncertainty associated with the Deep Neural Network (DNN) point forecasts, which helps in making a better-informed decision and mitigates false anomaly alerts. To achieve this, a new anomaly detection method is proposed that computes the uncertainty in estimates using quantile regression and uses the quantile interval to identify anomalies. Similarly, to handle the speed at which the data arrives, an online anomaly detection method is proposed where a model is trained incrementally to adapt to concept drift, which improves prediction. This is implemented using a window-based strategy, in which a time series is broken into sliding windows of sub-sequences as input to the model. To adapt to concept drift, the model is updated when changes occur in the newly arriving instances. 
This is achieved by using anomaly likelihood which is computed using the Q-function to define the abnormal degree of the current data point based on the previous data points. Specifically, when concept drift occurs, the proposed method will mark the current data point as anomalous. However, when the abnormal behavior continues for a longer period of time, the abnormal degree of the current data point will be low compared to the previous data points using the likelihood. As such, the current data point is added to the previous data to retrain the model which will allow the model to learn the new characteristics of the data and hence adapt to the concept changes thereby redefining the abnormal behavior. The proposed method also incorporates feature extraction to capture structural patterns in the time series. This is especially significant for multivariate time-series data, for which there is a need to capture the complex temporal dependencies that may exist between the variables. In summary, this thesis contributes to the theory, design, and development of algorithms and models for the detection of anomalies in both static and evolving time series data. Several experiments were conducted, and the results obtained indicate the significance of this research on offline and online anomaly detection in both static and evolving time-series data. In chapter 3, the newly proposed method (Deep Quantile Regression Anomaly Detection Method) is evaluated and compared with six other prediction-based anomaly detection methods that assume a normal distribution of prediction or reconstruction error for the identification of anomalies. Results in the first part of the experiment indicate that DQR-AD obtained relatively better precision than all other methods which demonstrates the capability of the method in detecting a higher number of anomalous points with low false positive rates. 
Also, the results show that DQR-AD is approximately 2–3 times better than DeepAnT, which performs better than all the remaining methods on all domains in the NAB dataset. In the second part of the experiment, the SMAP dataset is used with 4-dimensional features to demonstrate the method on multivariate time-series data. Experimental results show that DQR-AD has 10% better performance than AE on three datasets (SMAP1, SMAP3, and SMAP5) and equal performance on the remaining two datasets. In chapter 5, two levels of experiments were conducted on the basis of false-positive rate and concept drift adaptation. In the first level of the experiment, the results show that online DQR-AD is 18% better than both DQR-AD and VAE-LSTM on five NAB datasets. Similarly, results in the second level of the experiment show that the online DQR-AD method has better performance than five counterpart methods, by a relative margin of 10%, on six out of the seven NAB datasets. This result demonstrates how the concept drift adaptation strategies adopted in the proposed online DQR-AD improve the performance of anomaly detection in time series. / Petroleum Technology Development Fund (PTDF) Time series Online anomaly detection Concept drift Prediction interval Deep neural networks Uncertainty in deep learning Quantile regression
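The two mechanisms the abstract relies on, quantile regression to obtain a prediction interval and a sliding-window view of the series, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the DQR-AD implementation: the pinball (quantile) loss shown is the standard objective for quantile regression, while the function names, the choice of lower and upper quantiles, and the window format (a window paired with the next value as its target) are hypothetical.

```python
from typing import List, Sequence, Tuple


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Quantile (pinball) loss for a single prediction at quantile level q in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return max(q * diff, (q - 1.0) * diff)


def is_anomaly(value: float, lower_quantile: float, upper_quantile: float) -> bool:
    """Flag a value that falls outside the predicted quantile interval."""
    return value < lower_quantile or value > upper_quantile


def sliding_windows(series: Sequence[float],
                    size: int) -> List[Tuple[List[float], float]]:
    """Pair each window of `size` consecutive values with the value that follows it."""
    return [(list(series[i:i + size]), series[i + size])
            for i in range(len(series) - size)]
```

A model trained with `pinball_loss` at, say, a low and a high quantile level yields an interval for the next value of each window; `is_anomaly` then replaces a fixed threshold or a Gaussian assumption on the prediction error.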
