  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

Advanced Algorithms for Classification and Anomaly Detection on Log File Data: Comparative study of different Machine Learning Approaches

Wessman, Filip January 2021 (has links)
Background: A problematic area in today's large-scale distributed systems is the exponentially growing amount of log data. Finding anomalies by observing and monitoring this data with manual human inspection becomes progressively more challenging, complex and time-consuming, yet doing so is vital for keeping these systems available around the clock. Aim: The main objective of this study is to determine which Machine Learning (ML) algorithms are most suitable and whether they can live up to the needs and requirements regarding optimization and efficiency in the log data monitoring area, including which specific steps of the overall problem can be improved by using these algorithms for anomaly detection and classification on different real, provided data logs. Approach: An initial pre-study is conducted; logs are collected and then preprocessed with the log parsing tool Drain and regular expressions. The approach consisted of a combination of K-Means + XGBoost and, respectively, Principal Component Analysis (PCA) + K-Means + XGBoost. These were trained, tested and individually evaluated with different metrics against two datasets, one being a server data log and the other an HTTP access log. Results: Both approaches performed very well on both datasets, able to classify, detect and make predictions on log data events with high accuracy and precision and low calculation time. It was further shown that without the dimensionality reduction step (PCA), the results of the prediction model are slightly better, by a few percent. As for the prediction time, there was marginally small to no difference between prediction with and without PCA. Conclusions: Overall, the differences between the results with and without PCA are very small, but in essence it is better not to use PCA and instead apply the original data to the ML models.
The models' performance generally depends heavily on the data being applied, on the initial preprocessing steps, and on its size and structure, which affect the calculation time the most.
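As a hedged illustration of the kind of pipeline this abstract describes, the sketch below chains PCA, K-Means and a boosted-tree classifier on a synthetic feature matrix. Scikit-learn's GradientBoostingClassifier stands in for XGBoost, and appending the K-Means cluster index as an extra feature is one common way of combining the two models; the data and all parameters are illustrative assumptions, not the thesis's actual setup.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical parsed-log feature matrix: 500 events x 10 numeric
# features, with a binary label (0 = normal, 1 = anomalous).
X = rng.normal(size=(500, 10))
X[:, :2] *= 3.0               # give the informative features more variance
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Optional dimensionality-reduction step (the study found results
# slightly better without it).
X_red = PCA(n_components=5, random_state=0).fit_transform(X)

# Append the K-Means cluster index as an extra feature for the
# boosted-tree classifier.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_red)
X_feat = np.column_stack([X_red, km.labels_])

X_tr, X_te, y_tr, y_te = train_test_split(X_feat, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

On this toy data the informative directions survive the PCA step, so the classifier separates the classes well; on real log features, as the thesis notes, skipping PCA may work slightly better.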

Detection of Fat-Water Inversions in MRI Data With Deep Learning Methods

Hellgren, Lisa, Asketun, Fanny January 2021 (has links)
Magnetic resonance imaging (MRI) is a widely used medical imaging technique for examinations of the body. However, artifacts are a common problem that must be handled for reliable diagnoses and to avoid drawing inaccurate conclusions about the contextual insights. Magnetic resonance (MR) images acquired with a Dixon sequence provide two channels with separate fat and water content. Fat-water inversions, also called swaps, are one common artifact with this method, where voxels from the two channels are swapped, producing incorrect data. This thesis investigates the possibility of using deep learning methods for automatic detection of swaps in MR volumes. The data used in this thesis are MR volumes from UK Biobank, processed by AMRA Medical. Segmentation masks of complicated swaps are created by operators who manually annotate the swap, but only if the regions affect subsequent measurements. The segmentation masks are therefore not fully reliable, and additional synthesized swaps were created. Two different deep learning approaches were investigated, a reconstruction-based method and a segmentation-based method. The reconstruction-based networks were trained to reconstruct a volume as similar as possible to the input volume without any swaps. When testing the network on a volume with a swap, the location of the swap can be estimated from the reconstructed volume with postprocessing methods. Autoencoders are an example of a reconstruction-based network. The segmentation-based models were trained to segment a swap directly from the input volume, thus using volumes with swaps both during training and testing. The segmentation-based networks were inspired by a U-Net. The performance of the models from both approaches was evaluated on data with real and synthetic swaps using the metrics Dice coefficient, precision, and recall. The results show that the reconstruction-based models are not suitable for swap detection.
Difficulties in finding the right architecture for the models resulted in bad reconstructions, giving unreliable predictions. Further investigation into different post-processing methods, architectures, and hyperparameters might improve swap detection. The segmentation-based models are robust, with reliable detections independent of the size of the swaps, despite being trained on data with synthesized swaps. The results from these models look very promising, and they could probably serve as an automated method for swap detection after some further fine-tuning of the parameters.
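The three evaluation metrics named above are straightforward to compute on binary masks; a minimal sketch, using toy 1-D masks as stand-ins for voxel-wise swap segmentations:

```python
import numpy as np

def dice(pred, target):
    """Dice coefficient for binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum())

def precision(pred, target):
    # Fraction of predicted-swap voxels that are truly swapped.
    tp = np.logical_and(pred, target).sum()
    return tp / pred.sum()

def recall(pred, target):
    # Fraction of truly swapped voxels that were predicted.
    tp = np.logical_and(pred, target).sum()
    return tp / target.sum()

# Toy 1-D "masks" standing in for 3-D segmentation volumes.
pred   = np.array([1, 1, 0, 0, 1, 0], dtype=bool)
target = np.array([1, 0, 0, 0, 1, 1], dtype=bool)
```

Here two of the three predicted voxels overlap the three target voxels, so all three metrics come out to 2/3.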

Detekce síťových anomálií na základě NetFlow dat / Detection of Network Anomalies Based on NetFlow Data

Czudek, Marek January 2013 (has links)
This thesis describes the use of NetFlow data in systems for detection of disruptions or anomalies in computer network traffic. Various methods for network data collection are described, focusing especially on the NetFlow protocol. Further, various methods for anomaly detection in network traffic are discussed and evaluated, and their advantages as well as disadvantages are listed. Based on this analysis, one method is chosen, and a test data set is analyzed using it. An algorithm for real-time network traffic anomaly detection is designed based on the analysis outcomes. This method was chosen mainly because it enables detection of anomalies even in unlabelled network traffic. The last part of the thesis describes the implementation of the algorithm, as well as experiments performed with the resulting application on real NetFlow data.
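The abstract does not name the chosen method, but a minimal statistical baseline in the same spirit — flagging time windows whose flow volume deviates strongly from the norm, with no labels required — can be sketched as follows (the series and threshold are illustrative assumptions, not the thesis's actual algorithm):

```python
import numpy as np

def detect_anomalous_windows(flow_counts, threshold=3.0):
    """Flag time windows whose flow count deviates more than
    `threshold` standard deviations from the mean (z-score rule).
    Works on unlabelled traffic, which suits NetFlow-style data."""
    counts = np.asarray(flow_counts, dtype=float)
    z = (counts - counts.mean()) / counts.std()
    return np.flatnonzero(np.abs(z) > threshold)

# Hypothetical flows-per-minute series with one volumetric spike.
series = [100, 98, 102, 97, 103, 99, 101, 100, 500, 102, 98, 100]
print(detect_anomalous_windows(series))  # → [8], the spike
```

Real detectors typically compute such statistics per flow key (source, destination, port) and over sliding windows, but the unlabelled-data property is the same.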

Investigation of Machine Learning Methods for Anomaly Detection and Characterisation of Cable Shoe Pressing Processes

Härenby Deak, Elliot January 2021 (has links)
The ability to reliably connect electrical cables is important in many applications. A poor connection can become a fire hazard, so it is important that cables are always appropriately connected. This thesis investigates methods for monitoring a machine that presses cable connectors onto cables. Using sensor data from the machine, would it be possible to create an algorithm that can automatically identify the cable and connector, and thus decide how a connector should be pressed for successful attachment? Furthermore, would it be possible to create an anomaly detection algorithm that detects whether a connector has been incorrectly pressed by the end user? If these two questions can be addressed, the solutions would minimise the likelihood of errors and enable detection of the errors that nevertheless arise. In this thesis, it is shown that the k-Nearest Neighbour (kNN) algorithm and the Long Short-Term Memory (LSTM) network are both successful in classification of connectors and cables, both performing with 100% accuracy on the test set. The LSTM is the more promising alternative in terms of convergence and speed, being 28 times faster as well as requiring less memory. Distance-based methods and an autoencoder are investigated for the anomaly detection task. Data corresponding to a wide variety of possible incorrect uses of the tool were collected. The best anomaly detector detects 92% of incorrect cases of varying degrees of difficulty, a number which was higher than expected. On the tasks investigated, the performance of the neural networks is equal to or higher than that of the alternative methods.
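A minimal sketch of the kNN classification step, with invented pressing-force summary features for three connector types (the class layout is chosen so the toy clusters are well separated; the thesis's real features and data are not reproduced here):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
# Hypothetical pressing-force summary features for three connector
# types; each class forms its own well-separated cluster.
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 4))
               for c in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 30)

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
sample = rng.normal(loc=3.0, scale=0.3, size=(1, 4))  # resembles class 1
predicted = knn.predict(sample)[0]
```

kNN needs no training beyond storing the data, which is why inference time and memory (the points on which the LSTM won in the thesis) are its main costs.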

Deep Learning Fault Protection Applied to Spacecraft Attitude Determination and Control

Justin Mansell (9175307) 30 July 2020 (has links)
The increasing number and complexity of spacecraft are driving a growing need for automated fault detection, isolation, and recovery. Anomalies and failures are common occurrences during space flight operations, yet most spacecraft currently possess limited ability to detect them, diagnose their underlying cause, and enact an appropriate response. This leaves ground operators to interpret extensive telemetry and resolve faults manually, something that is impractical for large constellations of satellites and difficult to do in a timely fashion for missions in deep space. A traditional hurdle for achieving autonomy has been that effective fault detection, isolation, and recovery requires appreciating the wider context of telemetry information. Advances in machine learning are finally allowing computers to succeed at such tasks. This dissertation presents an architecture based on machine learning for detecting, diagnosing, and responding to faults in a spacecraft attitude determination and control system. Unlike previous approaches, the availability of faulty examples is not assumed. In the first level of the system, one-class support vector machines are trained from nominal data to flag anomalies in telemetry. Meanwhile, a spacecraft simulator is used to model the activation of anomaly flags under different fault conditions and train a long short-term memory neural network to convert time-dependent anomaly information into a diagnosis. Decision theory is then used to convert diagnoses into a recovery action. The overall technique is successfully validated on data from the LightSail 2 mission.
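The first level of the architecture — one-class support vector machines trained from nominal data only, so no faulty examples are required — can be sketched as follows (the telemetry values and parameters are illustrative assumptions):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
# Train only on nominal telemetry (no faulty examples assumed
# available), e.g. 400 samples of 6 attitude-related channels.
nominal = rng.normal(loc=0.0, scale=1.0, size=(400, 6))
ocsvm = OneClassSVM(nu=0.05, gamma="scale").fit(nominal)

# At run time, +1 means "looks nominal"; -1 raises an anomaly flag
# that downstream diagnosis logic can consume.
faulty = np.full((1, 6), 8.0)  # far outside the training distribution
flag = ocsvm.predict(faulty)[0]
```

The `nu` parameter bounds the fraction of training points treated as outliers, which is how the detector's sensitivity is tuned without any labeled faults.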

Toward Resilience in High Performance Computing: A Prototype to Analyze and Predict System Behavior

Ghiasvand, Siavash 16 October 2020 (has links)
Following the growth of high performance computing (HPC) systems in size and complexity, and the advent of faster and more complex Exascale systems, failures have become the norm rather than the exception. Hence, protection mechanisms need to be improved. Even the current de facto mechanisms, such as checkpoint/restart or redundancy, may fail to support the continuous operation of future HPC systems in the presence of failures. Failure prediction is a new protection approach that is beneficial for HPC systems with a short mean time between failures. The failure prediction mechanism extends the existing protection mechanisms via dynamic adjustment of the protection level. This work provides a prototype to analyze and predict system behavior using statistical analysis, to pave the path toward resilience in HPC systems. The proposed anomaly detection method is noise-tolerant by design and produces accurate results with as little as 30 minutes of historical data. Machine learning models complement the main approach and further improve the accuracy of failure predictions, up to 85%.
The fully automatic unsupervised behavior analysis approach proposed in this work is a novel solution to protect future extreme-scale systems against failures.

Table of contents:
1 Introduction
  1.1 Background and Statement of the Problem
  1.2 Purpose and Significance of the Study
  1.3 Jam-e Jam: A System Behavior Analyzer
2 Review of the Literature
  2.1 Syslog Analysis
  2.2 Users and Systems Privacy
  2.3 Failure Detection and Prediction
    2.3.1 Failure Correlation
    2.3.2 Anomaly Detection
    2.3.3 Prediction Methods
    2.3.4 Prediction Accuracy and Lead Time
3 Data Collection and Preparation
  3.1 Taurus HPC Cluster
  3.2 Monitoring Data
    3.2.1 Data Collection
    3.2.2 Taurus System Log Dataset
  3.3 Data Preparation
    3.3.1 Users and Systems Privacy
    3.3.2 Storage and Size Reduction
    3.3.3 Automation and Improvements
    3.3.4 Data Discretization and Noise Mitigation
    3.3.5 Cleansed Taurus System Log Dataset
  3.4 Marking Potential Failures
4 Failure Prediction
  4.1 Null Hypothesis
  4.2 Failure Correlation
    4.2.1 Node Vicinities
    4.2.2 Impact of Vicinities
  4.3 Anomaly Detection
    4.3.1 Statistical Analysis (frequency)
    4.3.2 Pattern Detection (order)
    4.3.3 Machine Learning
  4.4 Adaptive Resilience
5 Results
  5.1 Taurus System Logs
  5.2 System-wide Failure Patterns
  5.3 Failure Correlations
  5.4 Taurus Failures Statistics
  5.5 Jam-e Jam Prototype
  5.6 Summary and Discussion
6 Conclusion and Future Works
Bibliography
List of Figures
List of Tables
Appendix A Neural Network Models
Appendix B External Tools
Appendix C Structure of Failure Metadata Database
Appendix D Reproducibility
Appendix E Publicly Available HPC Monitoring Datasets
Appendix F Glossary
Appendix G Acronyms
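The frequency-based statistical analysis of system logs can be illustrated with a small sketch: compare the event-type frequency distribution of a current window against a baseline window using total variation distance. The event names and windows below are invented for illustration; this is a simplified stand-in, not the dissertation's exact method.

```python
from collections import Counter

def frequency_anomaly_score(baseline_window, current_window):
    """Score in [0, 1] for how far the current window's event-type
    frequency distribution drifts from a baseline window (total
    variation distance); a large score suggests anomalous behavior."""
    base = Counter(baseline_window)
    cur = Counter(current_window)
    n_base, n_cur = sum(base.values()), sum(cur.values())
    events = set(base) | set(cur)
    return 0.5 * sum(abs(base[e] / n_base - cur[e] / n_cur)
                     for e in events)

# Hypothetical syslog event types per time window.
baseline = ["login", "login", "cron", "io", "login", "cron"]
steady   = ["login", "cron", "login", "io"]
bursty   = ["io_error"] * 4  # event type never seen in the baseline

low  = frequency_anomaly_score(baseline, steady)
high = frequency_anomaly_score(baseline, bursty)
```

A window that mirrors the baseline mix scores near 0; a burst of a previously unseen event type scores the maximum of 1, which could then feed a failure-prediction threshold.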

EVALUATION OF UNSUPERVISED MACHINE LEARNING MODELS FOR ANOMALY DETECTION IN TIME SERIES SENSOR DATA

Bracci, Lorenzo, Namazi, Amirhossein January 2021 (has links)
With the advancement of the Internet of Things and the digitization of society, sensors recording time series data can be found in an ever increasing number of places, including proximity sensors on cars, temperature sensors in manufacturing plants and motion sensors inside smart homes. Society's ever increasing reliance on these devices leads to a need for detecting unusual behaviour, which could be caused by a malfunctioning sensor or by an uncommon event. Such unusual behaviour is often referred to as an anomaly. To detect anomalous behaviour, advanced techniques combining mathematics and computer science, often grouped under the umbrella of machine learning, are frequently used. To help machines learn valuable patterns, human supervision is often needed, which in this case would correspond to using recordings that a person has already classified as anomalies or normal points. Unfortunately, labelling data is time consuming, especially for the large datasets created from sensor recordings. Therefore, this thesis evaluates techniques that require no supervision to perform anomaly detection. Several different machine learning models are trained on different datasets in order to gain a better understanding of which techniques perform better when different requirements matter, such as the presence of a smaller dataset or stricter requirements on inference time. Of the models evaluated, OCSVM achieved the best overall performance, with an accuracy of 85%, and K-means was the fastest model, taking 0.04 milliseconds to run inference on one sample. Furthermore, LSTM-based models showed the most potential for improvement with larger datasets.
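A sketch of the K-means side of such an evaluation — using distance to the nearest centroid as a cheap unsupervised anomaly score on sliding windows of a sensor signal. The signal, window length and cluster count are illustrative assumptions, not the thesis's datasets or settings.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# Hypothetical sensor signal: a roughly periodic reading with noise,
# cut into sliding windows of length 8.
t = np.arange(2000)
signal = np.sin(0.2 * t) + 0.05 * rng.normal(size=t.size)
windows = np.lib.stride_tricks.sliding_window_view(signal, 8)

km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(windows)

def anomaly_score(window):
    """Distance to the nearest K-means centroid; very cheap at
    inference time, matching K-means being the fastest model here."""
    d = np.linalg.norm(km.cluster_centers_ - window, axis=1)
    return float(d.min())

normal_w = windows[100].copy()
spike_w = normal_w.copy()
spike_w[3] += 5.0  # injected sensor spike
```

Windows resembling the learned clusters score low; the injected spike lands far from every centroid and scores high, so a simple threshold on this score yields an unsupervised detector.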

Learning from 3D generated synthetic data for unsupervised anomaly detection

Fröjdholm, Hampus January 2021 (has links)
Modern machine learning methods utilising neural networks require a lot of training data. Data gathering and preparation have thus become a major bottleneck in the machine learning pipeline, and researchers often use large public datasets to conduct their research (such as the ImageNet [1] or MNIST [2] datasets). As these methods begin to be used in industry, the challenges become apparent: in factories, the objects being produced are often unique and may even involve trade secrets and patents that need to be protected. Additionally, manufacturing may not have started yet, making real data collection impossible. In both cases a public dataset is unlikely to be applicable. One possible solution, investigated in this thesis, is synthetic data generation. Synthetic data generation using physically based rendering was tested for unsupervised anomaly detection on a 3D printed block. A small image dataset of the block was gathered as control, and a data generation model was created using its CAD model, a resource most often available in industrial settings. The data generation model used randomisation to reduce the domain shift between the real and synthetic data. To test the data, autoencoder models were trained on the real and synthetic data, both separately and in combination. The material of the block, a white painted surface, proved challenging to reconstruct, and no significant difference between the synthetic and real data could be observed. The model trained on real data outperformed the models trained on synthetic and combined data. However, the synthetic data combined with the real data showed promise in reducing some of the bias intentionally introduced in the real dataset. Future research could focus on creating synthetic data for a problem where a good anomaly detection model already exists, with the goal of transferring parts of the synthetic data generation model (such as the materials) to a new problem. This would be of interest in industries that produce many different but similar objects, and could reduce the time needed when starting a new machine learning project.
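A minimal sketch of the autoencoder reconstruction-error approach used for the anomaly detection here. Scikit-learn's MLPRegressor trained to reproduce its own input through a narrow bottleneck stands in for a real image autoencoder, and the "patches" are synthetic 1-D stand-ins; everything below is an illustrative assumption.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
# Synthetic "normal" patches: scaled copies of one smooth pattern,
# i.e. a simple one-dimensional manifold the autoencoder can learn.
pattern = np.sin(np.linspace(0.0, np.pi, 16))
coef = rng.uniform(0.5, 1.5, size=(300, 1))
normal = coef * pattern + 0.01 * rng.normal(size=(300, 16))

# An MLP trained to reproduce its own input through a narrow
# bottleneck acts as a small fully connected autoencoder.
ae = MLPRegressor(hidden_layer_sizes=(8, 4, 8), max_iter=3000,
                  random_state=0).fit(normal, normal)

def reconstruction_error(x):
    """Mean squared error between a patch and its reconstruction;
    high error suggests the patch lies off the learned manifold."""
    return float(np.mean((ae.predict(x.reshape(1, -1)) - x) ** 2))

anomalous = -normal[0]  # flipped patch, far off the learned manifold
```

Anomaly detection then thresholds the reconstruction error: patches like the training data reconstruct well, while off-manifold patches do not.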

Machine Learning to Detect Anomalies in the Welding Process to Support Additive Manufacturing

Dasari, Vinod Kumar January 2021 (has links)
Additive Manufacturing (AM) is a fast-growing technology in manufacturing industries. Applications of AM are spread across a wide range of fields. The aerospace industry is one of the industries that use AM because of its design freedom and its ability to produce lightweight components. Since the aerospace industry is conservative, quality control and quality assurance are essential. The quality of the welding is one of the factors that determine the quality of AM components; hence, detecting faults in the welding is crucial. In this thesis, an automated system for detecting faults in the welding process is presented. For this, three methods are proposed to find anomalies in the process, using process videos that capture weld melt-pool behaviour. The three methods are 1) an Autoencoder method, 2) a Variational Autoencoder method, and 3) an Image Classification method. Methods 1 and 2 are implemented using Convolutional Long Short-Term Memory (LSTM) networks to capture anomalies that occur over a span of time. For this, instead of a single image, a sequence of images is used as input to track abnormal behaviour by identifying dependencies among the images. The training of these methods is unsupervised. Method 3 is implemented using Convolutional Neural Networks; it takes a single image as input and predicts whether the process image is stable or unstable. Its learning is supervised. The results show that among the three models, the Variational Autoencoder performed best in our case for detecting anomalies. In addition, it is observed that in methods 1 and 2 the sequence length and the number of frames retrieved per second from the process videos have an effect on model performance. Furthermore, considering the time dependencies proved very beneficial in our case, as the difference between the anomalous and the non-anomalous process is very small.
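The sequence-of-images input used by methods 1 and 2 amounts to grouping consecutive frames into overlapping windows before feeding them to a recurrent model; a small sketch (frame size, sequence length and stride are illustrative assumptions):

```python
import numpy as np

def frame_sequences(frames, seq_len, stride=1):
    """Group consecutive frames into overlapping sequences so a
    recurrent (e.g. Conv-LSTM) model can exploit time dependencies,
    instead of judging each melt-pool image in isolation."""
    n = (len(frames) - seq_len) // stride + 1
    return np.stack([frames[i * stride : i * stride + seq_len]
                     for i in range(n)])

# Hypothetical grayscale process video: 20 frames of 32x32 pixels.
video = np.zeros((20, 32, 32), dtype=np.float32)
seqs = frame_sequences(video, seq_len=5, stride=2)
print(seqs.shape)  # → (8, 5, 32, 32)
```

The sequence length and the effective frames per second (controlled here by the stride) are exactly the knobs the abstract reports as affecting model performance.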

Application of Deep-learning Method to Surface Anomaly Detection / Tillämpning av djupinlärningsmetoder för detektering av ytanomalier

Le, Jiahui January 2021 (has links)
In traditional industrial manufacturing, due to limitations of science and technology, manual inspection is still used to detect product surface defects. This method is slow and inefficient because of human limitations and outdated technology. The aim of this thesis is to research whether it is possible to automate this process using modern computer hardware, classifying defects in images with different deep learning methods. Based on results from controlled experiments, the report concludes that it is possible to achieve a Dice coefficient of more than 81%.
