191
Machine learning-based performance analytics for high-performance computing systems
Aksar, Burak, 17 January 2024
High-performance Computing (HPC) systems play pivotal roles in societal and scientific advancements, executing up to quintillions of calculations every second. As we shift towards exascale computing and beyond, modern HPC systems emphasize resource sharing, where various applications share processors, memory, networks, and other components. While this sharing enhances power efficiency, it complicates performance prediction and introduces significant variations in application running times, affecting overall system efficiency and operational costs.
HPC systems utilize monitoring frameworks that gather numerical telemetry data on resource usage to track operational status. Given the massive complexity and volume of this data, manual analysis is often daunting and inefficient. Machine learning (ML) techniques offer automated performance anomaly diagnosis, but the transition from successful research outcomes to production-scale deployment encounters two critical obstacles. First, the scarcity of labeled training data (i.e., identifying healthy and anomalous runs) in telemetry datasets makes it hard to train these ML systems effectively. Second, runtime analysis, required for providing timely detection and diagnosis of performance anomalies, demands seamless integration of ML-based methods with the monitoring frameworks.
This thesis claims that ML-based performance analytics frameworks that leverage a limited amount of labeled data and ensure runtime analysis can achieve sufficient anomaly diagnosis performance for production HPC systems. To support this claim, we undertake ML-based performance analytics on two fronts. First, we design and develop novel frameworks for anomaly diagnosis that leverage semi-supervised or unsupervised learning techniques to reduce the need for extensive labeled data. Second, we design a simple yet adaptable architecture to enable deployment and demonstrate that these frameworks are feasible for runtime analysis.
This thesis makes the following specific contributions: First, we design a semi-supervised anomaly diagnosis framework, Proctor, which operates with hundreds of labeled samples (in contrast to tens of thousands) and a vast number of unlabeled samples. We show that Proctor outperforms the fully supervised baseline by up to 11% in F1-score for diagnosing anomalies when there are approximately 30 labeled samples. We then reframe the problem and introduce ALBADRoss, which uses active learning to determine which samples should be labeled by experts to maximize model performance. On a production HPC dataset, ALBADRoss achieves a 0.95 F1-score (the same score that a fully supervised framework achieved) and a near-zero false alarm rate using 24x fewer labeled samples. Finally, with Prodigy, we address the anomaly detection problem with a focus on deployment. Prodigy is designed for detecting performance anomalies on compute nodes using unsupervised learning. Our framework achieves a 0.95 F1-score in detecting anomalies on a production HPC system telemetry dataset. We also design a simple and adaptable software architecture and deploy it on a 1488-node production HPC system, detecting real-world performance anomalies with 88% accuracy.
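To illustrate the general idea behind unsupervised anomaly detection on node telemetry (this is not the thesis's Proctor, ALBADRoss, or Prodigy implementation), a minimal Python sketch might look like the following. The feature layout, synthetic data, and the choice of an Isolation Forest are assumptions made for the demo.

```python
# Illustrative sketch (not the thesis's implementation): unsupervised anomaly
# detection over per-node telemetry windows with an Isolation Forest.
# Feature meanings and window construction are hypothetical.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Toy telemetry: rows = node-level observation windows, columns = summary
# statistics of metrics such as CPU utilization, memory bandwidth, network I/O.
healthy = rng.normal(loc=0.5, scale=0.1, size=(500, 8))
anomalous = rng.normal(loc=0.9, scale=0.2, size=(20, 8))   # e.g. memory-leak-like symptoms

X_train = healthy                          # unsupervised: train on (mostly) normal windows
X_test = np.vstack([healthy[:50], anomalous])

scaler = StandardScaler().fit(X_train)
model = IsolationForest(n_estimators=200, contamination="auto", random_state=0)
model.fit(scaler.transform(X_train))

scores = model.decision_function(scaler.transform(X_test))  # lower = more anomalous
flags = model.predict(scaler.transform(X_test))             # -1 = anomaly, 1 = normal
print(f"flagged {np.sum(flags == -1)} of {len(X_test)} windows as anomalous")
```

In a production setting the feature matrix would come from the monitoring framework's telemetry rather than synthetic arrays, and the detector would be applied periodically to each compute node's most recent windows.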
192
Some new anomaly detection methods with applications to financial data
Zhao, Zhicong, 06 August 2021
Novel clustering methods are presented and applied to financial data. First, a scan-statistics method for detecting price point clusters in financial transaction data is considered. The method is applied to Electronic Benefit Transfer (EBT) transaction data of the Supplemental Nutrition Assistance Program (SNAP). For a given vendor, transaction amounts are fitted via maximum likelihood estimation and then converted to the unit interval via a natural copula transformation. Next, a new Markov-type relation for order statistics on the unit interval is developed. The relation is used to characterize the distribution of the minimum exceedance of all copula-transformed transaction amounts above an observed order statistic. Conditional on observed order statistics, independent and asymptotically identical indicator functions are constructed, and the success probability is specified as a function of the gaps between consecutive order statistics. The success probabilities are shown to be a function of the hazard rate of the transformed transaction distribution. If gaps are smaller than expected, then the corresponding indicator functions are more likely to be one. A scan statistic is then applied to the sequence of indicator functions to detect locations where too many gaps are smaller than expected. These sets of gaps are then flagged as anomalous price point clusters. It is noted that prominent price point clusters appearing in the data may be a historical vestige of previous versions of the SNAP program involving outdated paper "food stamps". The second part of the project develops a novel clustering method whereby the time series of daily total EBT transaction amounts are clustered by periodicity. The scheme works by normalizing the time series of daily total transaction amounts for two distinct vendors and taking daily differences of those two series. The difference series is then examined for periodicity via a novel F statistic. We find that one may cluster the monthly periodicities of vendors by type of store using the F statistic as a proxy for a distance metric. This may indicate that spending preferences of SNAP benefit recipients vary by day of the month; however, it also opens further questions about potential forcing mechanisms and the apparent changing appetites for spending.
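As a rough illustration of the gap-based scan idea described above (a simplification, not the thesis's exact construction), the following Python sketch fits a parametric model by maximum likelihood, maps amounts to the unit interval via the fitted CDF, builds indicators for gaps that are much smaller than expected, and scans for windows containing too many small gaps. The lognormal model, gap threshold, window size, and binomial cutoff are all assumptions chosen for the demo.

```python
# Illustrative sketch of a gap/scan-statistic detector for price point clusters.
# Model choice, thresholds, and injected cluster are hypothetical demo values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
amounts = rng.lognormal(mean=3.0, sigma=0.8, size=2000)
amounts[:300] = 19.99 + rng.normal(0, 0.02, 300)      # inject a price-point cluster

# Fit a parametric model by maximum likelihood, then apply its CDF
# (probability integral transform) to map amounts onto (0, 1).
shape, loc, scale = stats.lognorm.fit(amounts, floc=0)
u = np.sort(stats.lognorm.cdf(amounts, shape, loc=loc, scale=scale))

gaps = np.diff(u)
expected_gap = 1.0 / (len(u) + 1)                     # mean uniform spacing
indicators = (gaps < 0.2 * expected_gap).astype(int)  # gap "much smaller than expected"

# Scan statistic: count of small gaps in a sliding window; flag windows whose
# count is extreme relative to a crude binomial null.
window = 50
counts = np.convolve(indicators, np.ones(window, dtype=int), mode="valid")
p_small = indicators.mean()                           # crude plug-in null probability
threshold = stats.binom.ppf(0.999, window, p_small)
flagged = np.where(counts > threshold)[0]
if flagged.size:
    lo_u, hi_u = u[flagged.min()], u[flagged.max() + window]
    print(f"suspicious cluster near transformed amounts [{lo_u:.3f}, {hi_u:.3f}]")
```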
193
Identifying the Impact of Noise on Anomaly Detection through Functional Near-Infrared Spectroscopy (fNIRS) and Eye-tracking
Gabbard, Ryan Dwight, 11 August 2017
No description available.
194
Performance of One-class Support Vector Machine (SVM) in Detection of Anomalies in the Bridge Data
Dalvi, Aditi, January 2017
No description available.
195
Anomaly Detection and Microstructure Characterization in Fiber Reinforced Ceramic Matrix Composites
Bricker, Stephen, January 2015
No description available.
196
Approaches to Abnormality Detection with Constraints
Otey, Matthew Eric, 12 September 2006
No description available.
197
Topology-aware Correlated Network Anomaly Detection and Diagnosis
Dhanapalan, Manojprasadh, 19 July 2012
No description available.
198
Software Performance Anomaly Detection Through Analysis Of Test Data By Multivariate Techniques
Salahshour Torshizi, Sara, January 2022
This thesis aims to uncover anomalies in data describing the performance behavior of a "robot controller" as measured by software metrics. The purpose of analyzing the data is mainly to identify the changes that have resulted in different performance behaviors, which we refer to as performance anomalies. To address this, two separate pre-processing approaches were developed: one that adds principal components to the data after the cleaning steps and another that does not use principal components. Next, Isolation Forest is employed, which uses an ensemble of isolation trees to isolate data points and produce scores from which anomalies can be identified. In addition, a clustering procedure is used in which the points with the largest distances to their cluster centroids are treated as anomalies. These two data preparation methods, combined with the two anomaly detection algorithms, identified software builds that are very likely to be anomalies. According to an industrial evaluation based on engineers' domain knowledge, around 70% of the software builds flagged as anomalous were confirmed as genuine anomalies, indicating system variable deviations or software bugs.
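For readers unfamiliar with the two detection strategies mentioned above, the following sketch illustrates them on toy data; it is not the thesis's pipeline. The feature matrix, cluster count, and 5% cutoffs are arbitrary choices for the demo.

```python
# Illustrative sketch: two simple ways to score software-build performance
# vectors as anomalous, mirroring the two approaches described above.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
builds = rng.normal(size=(300, 6))          # rows = builds, cols = performance metrics
builds[::60] += 4.0                         # a few builds with shifted performance

X = StandardScaler().fit_transform(builds)

# 1) Isolation Forest: ensemble of isolation trees, lower scores = easier to isolate.
iso_scores = IsolationForest(n_estimators=100, random_state=0).fit(X).decision_function(X)

# 2) Clustering: distance of each build to its nearest cluster centroid.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
dist_to_centroid = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

# Flag the most extreme builds under each score (top 5% as an arbitrary cutoff).
iso_flags = iso_scores < np.quantile(iso_scores, 0.05)
km_flags = dist_to_centroid > np.quantile(dist_to_centroid, 0.95)
print("flagged by both methods:", np.where(iso_flags & km_flags)[0])
```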
199
Anomaly or not Anomaly, that is the Question of Uncertainty: Investigating the relation between model uncertainty and anomalies using a recurrent autoencoder approach to market time series
Vidmark, Anton, January 2022
Knowing when one does not know is crucial in decision making. By estimating uncertainties, humans can recognize novelty both by intuition and by reason, but most AI systems lack this self-reflective ability. In anomaly detection, a common approach is to train a model to learn the distinction between some notion of normal and some notion of anomalous. In contrast, we let the models build their own notion of normal by learning directly from the data in a self-supervised manner, and by introducing estimates of model uncertainty the models can recognize by themselves when novel situations are encountered. The aim of our work is to investigate the relationship between model uncertainty and anomalies in time series data. We develop a method based on a recurrent autoencoder and design an anomaly score function that aggregates model error with model uncertainty to indicate anomalies, using Monte Carlo Dropout as a Bayesian approximation to derive the model uncertainty. As a proof of concept, we evaluate our method qualitatively on real-world complex time series using stock market data. Results show that our method can identify extreme events in the stock market. We conclude that the relation between model uncertainty and anomalies can be utilized for anomaly detection in time series data.
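A minimal sketch of the kind of model described above, a small recurrent autoencoder whose anomaly score combines reconstruction error with Monte Carlo Dropout uncertainty, might look as follows. It is an illustration under assumptions (window length, network sizes, and the error/uncertainty weighting are invented), not the thesis's implementation.

```python
# Illustrative sketch: recurrent autoencoder with an anomaly score that adds
# MC Dropout uncertainty to the reconstruction error. All sizes are demo values.
import torch
import torch.nn as nn

class RecurrentAE(nn.Module):
    def __init__(self, n_features=1, hidden=32, dropout=0.2):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.drop = nn.Dropout(dropout)            # kept active at test time for MC Dropout
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                          # x: (batch, seq_len, n_features)
        _, h = self.encoder(x)                     # h: (1, batch, hidden)
        z = self.drop(h[-1])                       # latent code with dropout noise
        rep = z.unsqueeze(1).repeat(1, x.size(1), 1)
        dec, _ = self.decoder(rep)
        return self.out(dec)

def mc_anomaly_score(model, x, n_samples=30, alpha=1.0):
    """Mean reconstruction error plus alpha * predictive std (MC Dropout)."""
    model.train()                                  # keep dropout stochastic at inference
    with torch.no_grad():
        recons = torch.stack([model(x) for _ in range(n_samples)])
    err = ((recons.mean(0) - x) ** 2).mean(dim=(1, 2))   # per-sequence error
    unc = recons.std(0).mean(dim=(1, 2))                 # per-sequence uncertainty
    return err + alpha * unc

# Tiny demo on synthetic "market" windows of daily returns.
torch.manual_seed(0)
windows = torch.randn(64, 20, 1) * 0.01            # 64 windows, 20 days, 1 feature
model = RecurrentAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(50):                                # brief training loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(windows), windows)
    loss.backward()
    opt.step()

shock = windows.clone()
shock[0, 10:, 0] += 0.2                            # inject an extreme event into window 0
print(mc_anomaly_score(model, shock)[:3])          # the shocked window should score high
```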
200
Anomaly detection in user behavior of websites using Hierarchical Temporal Memories: Using Machine Learning to detect unusual behavior from users of a web service to quickly detect possible security hazards
Berger, Victor, January 2017
This Master's Thesis focuses on the recent Cortical Learning Algorithm (CLA), designed for temporal anomaly detection. It is here applied to the problem of anomaly detection in user behavior of web services, which is getting more and more important in a network security context. CLA is here compared to more traditional state-of-the-art algorithms for anomaly detection: Hidden Markov Models (HMMs) and t-stide (an N-gram-based anomaly detector), which are among the few algorithms compatible with the online processing constraint of this problem. It is observed that on the synthetic dataset used for this comparison, CLA performs significantly better than the other two algorithms in terms of precision of the detection. The two other algorithms don't seem to be able to handle this task at all. It appears that this anomaly detection problem (outlier detection in short sequences over a large alphabet) is considerably different from what has been extensively studied up to now.
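Since t-stide is less widely known than HMMs, a minimal sketch of a t-stide style detector may help: n-grams that were rare or unseen in training are counted as mismatches, and a sequence's anomaly score is its mismatch fraction. The window size, rarity threshold, and toy session data below are assumptions, not taken from the thesis.

```python
# Illustrative sketch of a t-stide style n-gram anomaly detector.
from collections import Counter

def train_tstide(sequences, n=3):
    """Count every length-n window seen in the training sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    return counts

def tstide_score(counts, seq, n=3, rare_threshold=1):
    """Fraction of length-n windows in seq that were rare or unseen in training."""
    grams = [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]
    if not grams:
        return 0.0
    mismatches = sum(1 for g in grams if counts[g] <= rare_threshold)
    return mismatches / len(grams)        # 0 = familiar behavior, 1 = entirely novel

# Toy "user action" sequences over a large alphabet of page/request identifiers.
normal_sessions = [["home", "search", "item", "cart", "checkout"] * 3 for _ in range(50)]
counts = train_tstide(normal_sessions)
print(tstide_score(counts, ["home", "search", "item", "cart", "checkout"] * 2))    # low
print(tstide_score(counts, ["home", "admin", "export_db", "admin", "export_db"]))  # high
```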