151

Anomaly or not Anomaly, that is the Question of Uncertainty : Investigating the relation between model uncertainty and anomalies using a recurrent autoencoder approach to market time series

Vidmark, Anton January 2022 (has links)
Knowing when one does not know is crucial in decision making. By estimating uncertainties, humans can recognize novelty both by intuition and by reason, but most AI systems lack this self-reflective ability. In anomaly detection, a common approach is to train a model to learn the distinction between some notion of normal and some notion of anomalous. In contrast, we let the models build their own notion of normal by learning directly from the data in a self-supervised manner, and by introducing estimations of model uncertainty the models can themselves recognize when novel situations are encountered. In our work, the aim is to investigate the relationship between model uncertainty and anomalies in time series data. We develop a method based on a recurrent autoencoder approach, and we design an anomaly score function that aggregates model error with model uncertainty to indicate anomalies. We use Monte Carlo Dropout as a Bayesian approximation to derive model uncertainty. As a proof of concept, we evaluate our method qualitatively on real-world complex time series using stock market data. Results show that our method can identify extreme events in the stock market. We conclude that the relation between model uncertainty and anomalies can be utilized for anomaly detection in time series data.
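The abstract does not give the exact architecture, so the following is only a minimal sketch of the general idea it describes: a recurrent (GRU) autoencoder whose dropout stays active at inference time (Monte Carlo Dropout), with an anomaly score that aggregates reconstruction error and the variance across stochastic forward passes. Layer sizes, the aggregation weight alpha, and the function names are illustrative assumptions.

```python
# Hypothetical sketch (not the thesis's exact model): a GRU autoencoder whose
# dropout stays active at inference (Monte Carlo Dropout), so repeated
# stochastic passes yield both a reconstruction error and an uncertainty
# estimate, which are aggregated into a per-sequence anomaly score.
import torch
import torch.nn as nn

class RecurrentAutoencoder(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, dropout: float = 0.2):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.drop = nn.Dropout(dropout)          # kept active for MC Dropout
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):
        _, h = self.encoder(x)                   # h: (1, batch, hidden)
        z = self.drop(h[-1])                     # latent code with dropout
        # repeat latent code across the sequence length and decode
        dec_in = z.unsqueeze(1).repeat(1, x.size(1), 1)
        dec_out, _ = self.decoder(dec_in)
        return self.out(dec_out)

def anomaly_score(model, x, n_passes: int = 30, alpha: float = 1.0):
    """Aggregate mean reconstruction error with MC-Dropout variance."""
    model.train()                                # keep dropout stochastic
    with torch.no_grad():
        recons = torch.stack([model(x) for _ in range(n_passes)])
    err = (recons.mean(0) - x).pow(2).mean(dim=(1, 2))   # model error
    unc = recons.var(0).mean(dim=(1, 2))                 # model uncertainty
    return err + alpha * unc                             # per-sequence score
```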
152

Pattern-of-life extraction and anomaly detection using GMTI data

Liu, Tsa Chun January 2019 (has links)
Ground Moving Target Indicator (GMTI) uses the concept of airborne surveillance of moving ground objects to observe them and take actions accordingly. The concept was established in the late 20th century and was put to the test during the Gulf War to observe enemy movement on the other side of a mountain. During the war, due to limitations of technology, information such as enemy movement was usually obtained through human readings. With the improvement of surveillance technology, tracking individual targets became possible, which allows the extraction of useful features for more advanced use. Such features, known as tracks, are the results of GMTI tracking. Although the quality of the tracker plays a crucial role in overall system performance, the development of the tracker is not discussed in this paper; the developed system will use simulated ideal GMTI tracks as the input dataset. This paper presents an end-to-end system that includes Anomaly GMTI (AGMTI) track simulation, Pattern of Life (PoL) extraction, and an Anomaly Detection System (ADS). All the subsystems (AGMTI, PoL, and ADS) are independent of each other, so they can either be replaced or disabled to resemble different real-world scenarios. The results from AGMTI will provide inputs for the rest of the subsystems, and the results from PoL extraction will be used to improve the performance of the ADS. The proposed ADS is a semi-supervised learning detection system that takes prior information to support and improve detection performance, but will still operate without prior information. The AGMTI tracks will be generated with the open-source software Simulation of Urban MObility (SUMO); the AGMTI track simulator subsystem will make use of SUMO's API to generate normal and anomalous GMTI tracks. The PoL extraction will be accomplished using various clustering algorithms and statistical functions. The ADS will use a combination of anomaly detection algorithms for different anomaly events, including statistical approaches using Gaussian Mixture Model Expectation Maximization (GMM-EM) and Hidden Markov Models (HMM), a graphical approach using Weiler-Atherton Polygon Clipping (WAPC), and various clustering algorithms such as K-means clustering, spectral clustering, and DBSCAN. Finally, as extensions to the proposed system, this paper also presents Contextual Pattern of Life (CPoL) and Grouped Anomaly Detection. CPoL is an extension to the PoL to enhance the quality and robustness of the extraction, while Grouped Anomaly Detection is an extension to both the AGMTI track simulator and the ADS to diversify the possible scenarios. The results from the ADS will be evaluated, and details of implementation will be provided so the system can be replicated. / Thesis / Master of Applied Science (MASc)
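As one small illustration of the PoL-extraction step only (not the paper's full pipeline), the sketch below clusters historical track positions with DBSCAN to form pattern-of-life regions and flags new points that fall outside every learned region. The eps and min_samples values and the function names are illustrative assumptions.

```python
# Hypothetical sketch of one subsystem only: extracting a spatial pattern of
# life from (x, y) track points with DBSCAN, then flagging points that fall in
# no learned cluster as candidate anomalies. Parameter values are assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

def extract_pol(track_points: np.ndarray, eps: float = 50.0, min_samples: int = 10):
    """Cluster historical GMTI track points into pattern-of-life regions."""
    model = DBSCAN(eps=eps, min_samples=min_samples).fit(track_points)
    labels = model.labels_                       # -1 marks DBSCAN noise
    # keep the core samples of each cluster as the learned "normal" regions
    cores = track_points[model.core_sample_indices_]
    core_labels = labels[model.core_sample_indices_]
    return cores, core_labels

def flag_anomalous_points(new_points: np.ndarray, cores: np.ndarray, eps: float = 50.0):
    """A new point is anomalous if it lies farther than eps from every core sample."""
    flags = []
    for p in new_points:
        d = np.linalg.norm(cores - p, axis=1)
        flags.append(d.min() > eps)
    return np.array(flags)
```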
153

Early Anomaly Detection in Electrical Bushings Manufacturing at Hitachi Energy

Quintero Suárez, Felipe January 2022 (has links)
The manufacturing of electrical bushings for high voltages is complicated and highly demanding technology-wise. The process has more than 10 steps, and a single mistake in the chain can cause a complete failure of the final product. A faulty bushing represents high costs to the company, both economically and in terms of public image. Today, fault detection is corrective-oriented, which means there is low traceability of where a problem happens, and it is only detected once the final product is tested. This thesis aims to test a machine learning tool from Imagimob® to determine whether it is possible to detect faults during the manufacturing process using the existing captured data. To perform the test, a sample from 2019 was taken in which the production of the bushings reached a 60% scrap rate. A deep-learning neural network with a 2D convolutional layer was implemented. The outcome of the system showed an efficiency of 80%. However, due to the complexity of the bushing manufacturing process, the small number of data samples, and the different factors that can result in a faulty bushing, a probability range is set depending on the number of anomalies detected. With such validation, the tool can label 18% of the bushings as surely faulty and 27% as most likely faulty. The limitation of the tool is that the information must be analyzed after each step is done, and not continuously; hence, further research should be carried out on implementing a real-time tool.
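The thesis does not detail the network beyond mentioning a 2D convolutional layer, so the following is a hypothetical, minimal sketch of that kind of classifier, treating a window of process measurements as a (timesteps x sensors) image. The input shape, layer sizes, and the binary OK/faulty framing are assumptions, not the Imagimob implementation.

```python
# A minimal, hypothetical sketch (not the Imagimob tool): a small network with
# a 2D convolutional layer that classifies a window of process measurements,
# arranged as a (timesteps x sensors) "image", as OK or faulty.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(timesteps: int = 128, n_sensors: int = 8) -> tf.keras.Model:
    model = models.Sequential([
        layers.Input(shape=(timesteps, n_sensors, 1)),
        layers.Conv2D(16, kernel_size=(5, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 1)),
        layers.Flatten(),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # predicted P(faulty)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```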
154

Deep Quantile Regression for Unsupervised Anomaly Detection in Time-Series

Tambuwal, Ahmad I., Neagu, Daniel 18 November 2021 (has links)
Time-series anomaly detection receives increasing research interest given the growing number of data-rich application domains. Recent additions to anomaly detection methods in the research literature include deep neural networks (DNNs: e.g., RNN, CNN, and Autoencoder). The nature and performance of these algorithms in sequence analysis enable them to learn hierarchical discriminative features and the temporal structure of time series. However, their performance is affected by the usual assumption of a Gaussian distribution on the prediction error, which is either ranked or thresholded to label data instances as anomalous or not. An exact parametric distribution is often not directly relevant in many applications, though. This can produce faulty decisions from false anomaly predictions due to high variation in data interpretation. The expectation is to produce outputs characterized by a level of confidence. Thus, implementations need a Prediction Interval (PI) that quantifies the level of uncertainty associated with the DNN point forecasts, which helps in making better-informed decisions and mitigates false anomaly alerts. Efforts have been made to reduce false anomaly alerts through the use of quantile regression for the identification of anomalies, but they are limited to the use of the quantile interval to identify uncertainties in the data. In this paper, an improved time-series anomaly detection method called deep quantile regression anomaly detection (DQR-AD) is proposed. The proposed method goes further by using the quantile interval (QI) as an anomaly score and comparing it with a threshold to identify anomalous points in time-series data. Tests of the proposed method on publicly available anomaly benchmark datasets demonstrate its effective performance over other methods that assume a Gaussian distribution on the prediction or reconstruction cost for the detection of anomalies. This shows that our method is potentially less sensitive to data distribution than existing approaches. / Petroleum Technology Development Fund (PTDF) PhD Scholarship, Nigeria (Award Number: PTDF/ED/PHD/IAT/884/16)
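The following is a hedged sketch of the two ingredients the abstract names, rather than the exact DQR-AD model: a forecaster trained with the pinball (quantile) loss to predict a lower and an upper quantile, and a quantile-interval-based anomaly score compared against a threshold. The quantile levels, the window-based architecture, and the function names are illustrative assumptions.

```python
# A hedged sketch of quantile-based detection (not the exact DQR-AD method):
# a network trained with the pinball loss to predict a lower and an upper
# quantile of the next value, and an anomaly score built from the resulting
# quantile interval.
import torch
import torch.nn as nn

def pinball_loss(pred: torch.Tensor, target: torch.Tensor, q: float) -> torch.Tensor:
    """Quantile (pinball) loss for quantile level q in (0, 1)."""
    diff = target - pred
    return torch.mean(torch.maximum(q * diff, (q - 1) * diff))

class QuantileForecaster(nn.Module):
    """Predicts two quantiles of the next point from a sliding window.
    Training would use, e.g.:
      loss = pinball_loss(pred[:, 0], y, 0.05) + pinball_loss(pred[:, 1], y, 0.95)
    """
    def __init__(self, window: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(window, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2))       # [q_low, q_high]

    def forward(self, x):
        return self.net(x)

def anomaly_scores(model, windows, targets):
    """Quantile-interval width as an anomaly score; wide intervals mark
    uncertain (potentially anomalous) regions and can be thresholded."""
    with torch.no_grad():
        preds = model(windows)
    lo, hi = preds[:, 0], preds[:, 1]
    qi = hi - lo                                  # quantile interval width
    outside = (targets < lo) | (targets > hi)     # point falls outside the PI
    return qi, outside
```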
155

Anomaly Detection for Control Centers

Gyamfi, Cliff Oduro 06 1900 (has links)
The control center is a critical location in the power system infrastructure. Decisions regarding the power system's operation and control are often made from the control center, and these control actions are made possible through SCADA communication. This capability, however, makes the power system vulnerable to cyber attacks. Most of the decisions taken by the control center depend on the measurement data received from substations; these measurements estimate the state of the power grid. Measurement-based cyber attacks are well documented as a major threat to control center operations, and stealthy false data injection attacks are known to evade bad data detection. Due to the limitations of bad data detection at the control center, many approaches have been explored, especially in the cyber layer, to detect measurement-based attacks. Though helpful, these approaches do not look at the physical layer. This study proposes an anomaly detection system for the control center that operates on the laws of physics. The system also identifies the specific falsified measurement and proposes its estimated measurement value. / United States Department of Energy (DOE) National Renewable Energy Laboratory (NREL) / Master of Science / Electricity is an essential need for human life. The power grid is one of the most important human inventions and fueled other technological innovations in the industrial revolution. Changing demands in usage have added to its operational complexity, and several modifications have been made to the power grid since its invention to make it robust and operationally safe. Integration of ICT has significantly improved the monitoring and operability of the power grid, but improvements through ICT have also exposed the power grid to cyber vulnerabilities. Since the power system is a critical infrastructure, there is a growing need to keep it secure and operable for the long run. The control center of the power system serves mainly as the decision-making hub of the grid. It operates through a communication link with the various dispersed devices and substations on the grid, and this interconnection makes remote control and monitoring decisions possible from the control center. Data from the substations routed through the control center are also used in electricity markets and economic dispatch. The control center is, however, susceptible to cyber attacks, particularly measurement-based attacks. When attackers launch measurement attacks, their goal is to force control actions from the control center that can make the system unstable. They exploit vulnerabilities in the cyber layer to launch these attacks, injecting falsified data packets through the communication link to supplant correct ones upon arrival at the control center. This study presents an anomaly detection system that can detect falsified measurements at the control center. It also indicates the specific falsified measurements and provides an estimated value for further analysis.
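The abstract does not describe the detector itself, so the sketch below only illustrates the general physical-layer idea of checking received measurements against a grid model: a classical weighted-least-squares DC state estimation with a largest-normalized-residual test that flags the most suspicious measurement and reports the value the model would have expected. This is a textbook baseline, not the thesis's method (which targets stealthy attacks that such simple residual tests can miss); H, z, and sigma are assumed inputs.

```python
# A generic illustration of physics-based checking (not the thesis's specific
# detector): DC state estimation with a residual test that flags the
# measurement with the largest normalized residual and reports the value the
# physical model would have expected for it.
import numpy as np

def flag_falsified_measurement(H: np.ndarray, z: np.ndarray, sigma: np.ndarray):
    """Weighted least-squares DC state estimation with a residual check.

    H     : measurement matrix (m x n), derived from grid topology/physics
    z     : received measurement vector (m,)
    sigma : measurement standard deviations (m,)
    """
    W = np.diag(1.0 / sigma**2)
    G = H.T @ W @ H                              # gain matrix
    x_hat = np.linalg.solve(G, H.T @ W @ z)      # estimated state
    z_hat = H @ x_hat                            # what the physics predicts
    r = z - z_hat                                # measurement residuals
    # residual covariance and normalized residuals
    S = np.linalg.inv(W) - H @ np.linalg.solve(G, H.T)
    r_norm = np.abs(r) / np.sqrt(np.clip(np.diag(S), 1e-12, None))
    suspect = int(np.argmax(r_norm))             # most suspicious measurement
    return suspect, r_norm[suspect], z_hat[suspect]   # index, score, expected value
```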
156

Network Anomaly Detection with Incomplete Audit Data

Patcha, Animesh 04 October 2006 (has links)
With the ever-increasing deployment and usage of gigabit networks, traditional network-anomaly-detection-based intrusion detection systems have not scaled accordingly. Most, if not all, deployed systems assume the availability of complete and clean data for the purpose of intrusion detection. We contend that this assumption is not valid. Factors like noise in the audit data, mobility of the nodes, and the large amount of data generated by the network make it difficult to build a normal traffic profile of the network for the purpose of anomaly detection. From this perspective, the leitmotif of the research effort described in this dissertation is the design of a novel intrusion detection system that has the capability to detect intrusions with high accuracy even when complete audit data is not available. In this dissertation, we take a holistic approach to anomaly detection to address the threats posed by network-based denial-of-service attacks by proposing improvements in every step of the intrusion detection process. At the data collection phase, we have implemented an adaptive sampling scheme that intelligently samples incoming network data to reduce the volume of traffic sampled, while maintaining the intrinsic characteristics of the network traffic. A Bloom-filter-based fast flow aggregation scheme is employed at the data pre-processing stage to further reduce the response time of the anomaly detection scheme. Lastly, this dissertation also proposes an expectation-maximization-based anomaly detection scheme that uses the sampled audit data to detect intrusions in the incoming network traffic. / Ph. D.
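The sketch below is a minimal, generic illustration of the Bloom-filter idea used for fast flow aggregation, not the dissertation's implementation: a compact bit array answers "have we seen this flow before?" so new flows can be counted without storing every 5-tuple. The bit-array size, hash count, and key format are illustrative assumptions.

```python
# A minimal Bloom filter for membership tests over flow 5-tuples.
import hashlib

class BloomFilter:
    def __init__(self, n_bits: int = 1 << 20, n_hashes: int = 4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, key: str):
        # derive n_hashes positions from salted SHA-256 digests
        for i in range(self.n_hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.n_bits

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

# Usage: aggregate sampled packets into flows, registering only unseen flows.
seen = BloomFilter()
flow = "10.0.0.1:443->192.168.1.5:51512/TCP"     # example 5-tuple key
if flow not in seen:
    seen.add(flow)          # first time this flow is observed in the window
```

False positives are possible but false negatives are not, which is why a Bloom filter can safely shrink the per-flow state kept during aggregation.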
157

Evaluation of Scan Methods Used in the Monitoring of Public Health Surveillance Data

Fraker, Shannon E. 07 December 2007 (has links)
With the recent increase in the threat of biological terrorism as well as the continual risk of other diseases, research in public health surveillance and disease monitoring has grown tremendously. There is an abundance of data available in all sorts of forms. Hospitals, federal and local governments, and industries are all collecting data and developing new methods to be used in the detection of anomalies. Many of these methods are developed, applied to a real data set, and incorporated into software. This research, however, takes a different view of the evaluation of these methods. We feel that there needs to be solid statistical evaluation of proposed methods no matter the intended area of application. Proof-by-example does not seem reasonable as the sole evaluation criterion, especially for methods that have the potential to have a great impact on our lives. For this reason, this research focuses on determining the properties of some of the most common anomaly detection methods. A distinction is made between metrics used for retrospective historical monitoring and those used for prospective on-going monitoring, with the focus on the latter situation. Metrics such as the recurrence interval and time-to-signal measures are therefore the most applicable. These metrics, in conjunction with control charts such as exponentially weighted moving average (EWMA) charts and cumulative sum (CUSUM) charts, are examined. Two new time-to-signal measures, the average time between signal events and the average signal event length, are introduced to better compare the recurrence interval with the time-to-signal properties of surveillance schemes. The relationship commonly thought to exist between the recurrence interval and the average time to signal is shown not to exist once autocorrelation is present in the statistics used for monitoring. This means that closer consideration needs to be paid to the selection of which of these metrics to report. The properties of a commonly applied scan method are also studied carefully in the strictly temporal setting. The counts of incidences are assumed to occur independently over time and follow a Poisson distribution. Simulations are used to evaluate the method under changes in various parameters. In addition, two methods have been proposed in the literature for the calculation of the p-value: an adjustment based on the tests for previous time periods, and the use of the recurrence interval with no adjustment for previous tests. The difference between these two methods is also considered. The quickness of the scan method in detecting an increase in the incidence rate, the number of false alarm events that occur, and how long the method signals after the increased threat has passed are all of interest. These estimates from the scan method are compared to other attribute monitoring methods, mainly the Poisson CUSUM chart. It is shown that the Poisson CUSUM chart is typically faster in the detection of the increased incidence rate. / Ph. D.
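For readers unfamiliar with the comparison chart, the sketch below shows a standard one-sided Poisson CUSUM for detecting an increase in an incidence rate. It is the generic textbook form, not the dissertation's specific parameterization; the baseline rate, the shifted rate used to set the reference value, and the decision interval h are illustrative assumptions.

```python
# A one-sided Poisson CUSUM that signals when daily counts drift above an
# in-control rate lam0 toward an out-of-control rate lam1.
import math

def poisson_cusum(counts, lam0: float, lam1: float, h: float = 5.0):
    """Return the time index of the first signal, or None if no signal.

    counts : observed counts per period
    lam0   : in-control mean count
    lam1   : increased mean count to detect quickly
    h      : decision interval; larger h means fewer false alarms, slower signals
    """
    k = (lam1 - lam0) / math.log(lam1 / lam0)    # reference value
    s = 0.0
    for t, c in enumerate(counts):
        s = max(0.0, s + c - k)                  # upper one-sided CUSUM statistic
        if s > h:
            return t                             # signal an increased rate
    return None

# Example: baseline of 2 cases/day, tuned to detect a shift to 4 cases/day.
signal_time = poisson_cusum([1, 3, 2, 2, 5, 6, 7, 8], lam0=2.0, lam1=4.0)
```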
158

Anomaly crowd movement detection using machine learning techniques

Longberg, Victor January 2024 (has links)
This master’s thesis investigates the application of anomaly detection techniques to analyze crowd movements using cell location data, a topic of growing interest in public safety and policymaking. This research uses machine learning algorithms, specifically Isolation Forest and DBSCAN, to identify unusual movement patterns within a large, unlabeled dataset. The study addresses the challenges inherent in processing and analyzing vast amounts of spatial and temporal data through a comprehensive methodology that includes data preprocessing, feature engineering, and optimizing algorithm parameters. The findings highlight the feasibility of employing anomaly detection in real-world scenarios, demonstrating the algorithms’ ability to detect anomalies and offering insights into crowd dynamics.
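As a small, hypothetical illustration of the Isolation Forest step (not the thesis's actual features or cell location dataset), the sketch below scores simple per-device movement features; the feature choices, contamination level, and toy values are assumptions.

```python
# Isolation Forest over engineered movement features; -1 flags anomalies.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

# Hypothetical per-device features: [distance travelled (km), mean speed (m/s), cells visited]
features = np.array([
    [1.2, 0.8, 3],
    [1.1, 0.7, 4],
    [55.0, 30.0, 40],   # an unusually fast, wide-ranging movement pattern
    [0.9, 0.6, 2],
])

X = StandardScaler().fit_transform(features)
clf = IsolationForest(n_estimators=200, contamination=0.25, random_state=0)
labels = clf.fit_predict(X)          # -1 marks anomalous movement patterns
scores = clf.score_samples(X)        # lower score = more anomalous
```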
159

Finding Interesting Subgraphs with Guarantees

Cadena, Jose 29 January 2018 (has links)
Networks are a mathematical abstraction of the interactions between a set of entities, with extensive applications in social science, epidemiology, bioinformatics, and cybersecurity, among others. There are many fundamental problems when analyzing network data, such as anomaly detection, dense subgraph mining, motif finding, information diffusion, and epidemic spread. A common underlying task in all these problems is finding an "interesting subgraph"; that is, finding a part of the graph---usually small relative to the whole---that optimizes a score function and has some property of interest, such as connectivity or a minimum density. Finding subgraphs that satisfy common constraints of interest, such as the ones above, is computationally hard in general, and state-of-the-art algorithms for many problems in network analysis are heuristic in nature. These methods are fast and usually easy to implement. However, they come with no theoretical guarantees on the quality of the solution, which makes it difficult to assess how the discovered subgraphs compare to an optimal solution, which in turn affects the data mining task at hand. For instance, in anomaly detection, solutions with low anomaly score lead to sub-optimal detection power. On the other end of the spectrum, there have been significant advances on approximation algorithms for these challenging graph problems in the theoretical computer science community. However, these algorithms tend to be slow, difficult to implement, and they do not scale to the large datasets that are common nowadays. The goal of this dissertation is to develop scalable algorithms with theoretical guarantees for various network analysis problems, where the underlying task is to find subgraphs with constraints. We find interesting subgraphs with guarantees by adapting techniques from parameterized complexity, convex optimization, and submodular optimization. These techniques are well-known in the algorithm design literature, but they lead to slow and impractical algorithms. One unifying theme in the problems that we study is that our methods are scalable without sacrificing the theoretical guarantees of these algorithm design techniques. We accomplish this combination of scalability and rigorous bounds by exploiting properties of the problems we are trying to optimize, decomposing or compressing the input graph to a manageable size, and parallelization. We consider problems in network analysis for both static and dynamic network models, and we illustrate the power of our methods in applications such as public health, sensor data analysis, and event detection using social media data. / Ph. D.
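As one classical example of the kind of guarantee discussed above (not one of the dissertation's own algorithms), greedy peeling for the densest-subgraph problem is simple, scalable, and provably returns a subgraph whose density is at least half the optimum. The sketch below runs it on a toy graph; the graph and helper name are illustrative assumptions.

```python
# Greedy peeling for the densest-subgraph problem: repeatedly remove a
# minimum-degree node and keep the intermediate subgraph of highest density
# |E|/|V|. This is known to be a 1/2-approximation to the densest subgraph.
import networkx as nx

def densest_subgraph_peeling(G: nx.Graph):
    H = G.copy()
    best_nodes, best_density = list(H.nodes), 0.0
    while H.number_of_nodes() > 0:
        density = H.number_of_edges() / H.number_of_nodes()   # |E|/|V|
        if density > best_density:
            best_density, best_nodes = density, list(H.nodes)
        v = min(H.nodes, key=H.degree)           # peel a minimum-degree node
        H.remove_node(v)
    return best_nodes, best_density

# Usage on a toy graph: a path attached to a dense 4-clique on {5, 6, 7, 8}.
G = nx.Graph([(1, 2), (2, 3), (3, 4), (4, 5),
              (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)])
nodes, density = densest_subgraph_peeling(G)     # recovers the clique {5, 6, 7, 8}
```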
160

On the Effectiveness of Dimensionality Reduction for Unsupervised Structural Health Monitoring Anomaly Detection

Soleimani-Babakamali, Mohammad Hesam 19 April 2022 (has links)
Dimensionality reduction (DR) techniques enhance data interpretability and reduce space complexity, though at the cost of information loss. Such methods have been prevalent in the Structural Health Monitoring (SHM) anomaly detection literature. While DR is favorable in supervised anomaly detection, where possible novelties are known a priori, its efficacy is less clear in unsupervised detection. In this work, we perform a detailed assessment of the DR performance trade-offs to determine whether the information loss imposed by DR can impact SHM performance for previously unseen novelties. As a basis for our analysis, we rely on an SHM anomaly detection method operating on the input signals' fast Fourier transform (FFT). The FFT is regarded as a raw, frequency-domain feature that allows studying various DR techniques. We design extensive experiments comparing various DR techniques, including neural autoencoder models, to capture the impact on two SHM benchmark datasets. Results indicate the loss of information to be more detrimental, reducing the novelty detection accuracy by up to 60% with autoencoder-based DR. Regularization can alleviate some of these challenges, though its effect is unpredictable. Dimensions carrying substantial vibrational information mostly survive DR; thus, the regularization impact suggests that these dimensions are not reliable damage-sensitive features with respect to unseen faults. Consequently, we argue that designing new SHM anomaly detection methods that can work with high-dimensional raw features is a necessary research direction, and we present open challenges and future directions. / M.S. / Structural health monitoring (SHM) aids the timely maintenance of infrastructure, saving human lives and natural resources. Infrastructure will undergo unseen damage in the future, so data-driven SHM techniques for handling unlabeled data (i.e., unsupervised learning) are suitable for real-world usage. Lacking labels and defined data classes, data instances are categorized through similarities, i.e., distances. Still, distance metrics in high-dimensional spaces can become meaningless. As a result, applying methods to reduce data dimensions is common practice, yet at the cost of information loss. Naturally, a trade-off exists between the loss of information and the increased interpretability of low-dimensional spaces induced by dimensionality reduction procedures. This study proposes an unsupervised SHM technique that works with both low- and high-dimensional data to assess that trade-off. Results show the negative impacts of dimensionality reduction to be more severe than its benefits. Developing unsupervised SHM methods with raw data is thus encouraged for real-world applications.
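The sketch below is only a hedged illustration of the kind of pipeline being compared, not the thesis's models: FFT magnitudes as the raw high-dimensional feature, an optional dimensionality-reduction step (PCA is used here as a simple stand-in for the autoencoder-based DR studied in the thesis), and a nearest-neighbour distance to healthy training data as the novelty score. The synthetic data and parameter choices are assumptions.

```python
# FFT features, optional DR, and a nearest-neighbour novelty score, so that
# scores with and without DR can be compared on the same signals.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def fft_features(signals: np.ndarray) -> np.ndarray:
    """Magnitude spectrum of each vibration signal (rows = signals)."""
    return np.abs(np.fft.rfft(signals, axis=1))

def novelty_scores(train_signals, test_signals, n_components=None):
    X_train, X_test = fft_features(train_signals), fft_features(test_signals)
    if n_components is not None:                 # apply DR, losing information
        pca = PCA(n_components=n_components).fit(X_train)
        X_train, X_test = pca.transform(X_train), pca.transform(X_test)
    nn = NearestNeighbors(n_neighbors=1).fit(X_train)
    dist, _ = nn.kneighbors(X_test)
    return dist.ravel()                          # larger = more novel

# Comparing scores with and without DR exposes the trade-off discussed above.
rng = np.random.default_rng(0)
healthy = rng.normal(size=(100, 256))            # synthetic "healthy" signals
test = rng.normal(size=(10, 256))
raw_scores = novelty_scores(healthy, test)                   # high-dimensional FFT
reduced_scores = novelty_scores(healthy, test, n_components=8)
```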
