31 |
Unsupervised Anomaly Detection in Numerical Datasets
Joshi, Vineet, 05 June 2015
No description available.
|
32 |
Bayesian Model Averaging and Variable Selection in Multivariate Ecological Models
Lipkovich, Ilya A., 22 April 2002
Bayesian Model Averaging (BMA) is a new area in modern applied statistics that provides data analysts with an efficient tool for discovering promising models and obtaining estimates of their posterior probabilities via Markov chain Monte Carlo (MCMC). These probabilities can be further used as weights for model averaged predictions and estimates of the parameters of interest. As a result, variance components due to model selection are estimated and accounted for, contrary to the practice of conventional data analysis (such as, for example, stepwise model selection). In addition, variable activation probabilities can be obtained for each variable of interest. This dissertation is aimed at connecting BMA and various ramifications of the multivariate technique called Reduced-Rank Regression (RRR). In particular, we are concerned with Canonical Correspondence Analysis (CCA) in ecological applications where the data are represented by a site by species abundance matrix with site-specific covariates. Our goal is to incorporate the multivariate techniques, such as Redundancy Analysis and Canonical Correspondence Analysis, into the general machinery of BMA, taking into account such complicating phenomena as outliers and clustering of observations within a single data-analysis strategy. Traditional implementations of model averaging are concerned with selection of variables. We extend the methodology of BMA to selection of subgroups of observations and implement several approaches to cluster and outlier analysis in the context of the multivariate regression model. The proposed algorithm of cluster analysis can accommodate restrictions on the resulting partition of observations when some of them form sub-clusters that have to be preserved when larger clusters are formed. / Ph. D.
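The way posterior model probabilities act as weights can be illustrated with a minimal Python sketch. It is not the dissertation's MCMC procedure: it assumes that a log marginal likelihood is already available for each candidate model, and the function names (`bma_weights`, `bma_predict`, `inclusion_probabilities`) and the numbers are hypothetical, chosen only for illustration.

```python
import math

def bma_weights(log_evidences, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Assumption: the log evidences are given; a uniform model prior is used
    when none is supplied. Normalisation subtracts the maximum for stability.
    """
    if log_priors is None:
        log_priors = [0.0] * len(log_evidences)
    scores = [e + p for e, p in zip(log_evidences, log_priors)]
    top = max(scores)
    unnormalised = [math.exp(s - top) for s in scores]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

def bma_predict(predictions, weights):
    """Model-averaged prediction: weighted sum of per-model predictions."""
    return sum(w * p for w, p in zip(weights, predictions))

def inclusion_probabilities(models, weights, variables):
    """For each variable, sum the weights of the models that contain it."""
    return {v: sum(w for m, w in zip(models, weights) if v in m)
            for v in variables}

# Hypothetical example: three candidate models over two covariates.
models = [{"x1"}, {"x1", "x2"}, {"x2"}]
weights = bma_weights([-10.0, -11.0, -13.0])
averaged = bma_predict([2.0, 2.5, 4.0], weights)
inclusion = inclusion_probabilities(models, weights, ["x1", "x2"])
```

Summing model weights per variable is one common reading of the "variable activation probabilities" mentioned in the abstract; the dissertation may define them differently.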
|
33 |
A Curriculum Guide for Integrating Literary Theory into Twelfth Grade Florida English Language Arts
Philpot, Helen, 01 January 2007
Providing high school students a course of study for becoming competent and thorough lifelong independent readers of complex texts was the goal for this thesis. This is accomplished by integrating literary theory that looks beyond just the typical level of analysis often emphasized in many Florida classrooms. If put into use and successful, this curriculum guide will aid Florida teachers in endowing their students with a new level of ability to analyze literature. Research of prior work done in the field of integrating critical theory into high school classrooms was analyzed and synthesized in order to create a larger course of critical theory study to be completed during the senior year of high school in the state of Florida. The curriculum guide acts as a starting point, providing teachers with all the tools necessary to bring literary theory into the high school classroom while maintaining their individual teaching style. The curriculum guide is broken into four distinct units which follow the most common course of Florida twelfth grade study, the English canon, with each chapter addressing two literary theories. The literary theories utilized are: New Criticism, New Historicism, Feminism, Marxism, Reader Response, Psychoanalysis, Structuralism, and Deconstruction.
|
34 |
Measuring Interestingness in Outliers with Explanation Facility using Belief Networks
Masood, Adnan, 01 January 2014
This research explores the potential of improving the explainability of outliers using Bayesian Belief Networks as background knowledge. Outliers are deviations from the usual trends of data. Mining outliers may help discover potential anomalies and fraudulent activities. Meaningful outliers can be retrieved and analyzed by using domain knowledge. Domain knowledge (or background knowledge) is represented using probabilistic graphical models such as Bayesian belief networks. Bayesian networks are a graph-based representation used to model and encode mutual relationships between entities. Due to their probabilistic graphical nature, Belief Networks are an ideal way to capture the sensitivity, causal inference, uncertainty and background knowledge in real-world data sets. Bayesian Networks effectively present the causal relationships between different entities (nodes) using conditional probability. This probabilistic relationship shows the degree of belief between entities. A quantitative measure which computes changes in this degree of belief acts as a sensitivity measure.
The first contribution of this research is enhancing the performance for measurement of sensitivity based on earlier research work, the Interestingness Filtering Engine Miner algorithm. The algorithm developed (IBOX - Interestingness based Bayesian outlier eXplainer) provides progressive improvement in the performance and sensitivity scoring of earlier works. Earlier approaches compute sensitivity by measuring divergence among conditional probabilities of training and test data, while using only a couple of probabilistic interestingness measures, such as Mutual Information and Support, to calculate belief sensitivity. With ingrained support from the literature as well as quantitative evidence, IBOX provides a framework to use multiple interestingness measures, resulting in better performance and improved sensitivity analysis. The results provide improved performance, and therefore explainability of rare class entities. This research quantitatively validated probabilistic interestingness measures as an effective sensitivity analysis technique in rare class mining. This results in a novel, original, and progressive research contribution to the areas of probabilistic graphical models and outlier analysis.
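The idea of scoring sensitivity as a divergence between conditional probabilities can be sketched as follows. The abstract does not name the divergence used, so Kullback-Leibler divergence is an assumption here, and the function name and the two example distributions are hypothetical rather than taken from IBOX.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions.

    Assumption: p and q are probability vectors over the same states of one
    network node, given the same parent configuration. eps guards against
    zero entries in q.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical conditional distributions of one node, estimated from
# training data and from a test record's neighbourhood respectively.
train_cpd = [0.9, 0.1]
test_cpd = [0.5, 0.5]
sensitivity = kl_divergence(test_cpd, train_cpd)
```

A larger value indicates a larger shift in the degree of belief, which is the kind of change a sensitivity measure is meant to capture.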
|
35 |
Automated Outlier Detection for Credit Risk KPI Time Series in E-commerce: A Case Study on the Business Value and Obstacles of Automated Outlier Detection / Automatiserad Outlier Detection för Kreditrisk KPI Tidsserier i E-handel
Lindberg, Jennifer, January 2022
E-commerce has grown significantly over the last decade, and made a considerable leap during Covid-19. The final step in e-commerce is payments, and as a result, credit risk management in real time has become increasingly important. An imperative function in credit risk management is underwriting, in which it is decided which purchases to accept and which not to. However, events can occur that cause increases or decreases in, for instance, acceptance rates, and these must be detected in order to, for instance, maintain good stakeholder relationships. Thus, KPIs are monitored with the aim of detecting outliers as soon as possible. The purpose of this study is to explore the business value and obstacles of automating outlier detection for credit risk KPI time series in e-commerce. In addition, aspects to consider during implementation are investigated. The research is a case study and is grounded in thematic analysis of qualitative data collected at an e-commerce company. The results of the study show that automation can contribute significant business value due to, for instance, a decrease in monetary and alternative costs of manual monitoring, as well as a potential for better quality in the monitoring, and thus also enhanced stakeholder relationships. However, the results also imply that there are several obstacles to actually implementing full automation, such as a lack of trust in the automation, along with opinions that automation will impair knowledge and communication, and that an implementation would be both technically and logically challenging.
|
36 |
Exogenous Fault Detection in Aerial Swarms of UAVs / Exogen Feldetektering i Svärmar med UAV:er
Westberg, Maja, January 2023
In this thesis, the main focus is to formulate and test a suitable model for exogenous fault detection in swarms containing unmanned aerial vehicles (UAVs), which are aerial autonomous systems. FOI, the Swedish Defense Research Agency, provided the thesis project and research question. Inspired by previous work, the implementation uses behavioral feature vectors (BFVs) to simulate the movements of the UAVs and to identify anomalies in their behaviors. The chosen algorithm for fault detection is the density-based cluster analysis method known as the Local Outlier Factor (LOF). This method is built on the k-Nearest Neighbor (kNN) algorithm and employs densities to detect outliers. In this thesis, it is implemented to detect faulty agents within the swarm based on their behavior. A confusion matrix and some associated equations are used to evaluate the accuracy of the method. Six features are selected for examination in the LOF algorithm. The first two features assess the number of neighbors in a circle around the agent, while the others consider traversed distance, height, velocity, and rotation. Three different fault types are implemented and induced in one of the agents within the swarm. The first two faults are motor failures, and the last one is a sensor failure. The algorithm is successfully implemented, and the evaluation of the faults is conducted using three different metrics. Several sets of experiments are performed to assess the optimal value for the LOF threshold and to understand the model's performance. The thesis work results in an optimal LOF value which yields an acceptable F1 score, signifying that the accuracy of the implementation is at a satisfactory level.
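A minimal Python sketch of the technique named in this abstract: LOF scores computed from k-nearest-neighbor densities, a threshold to flag outliers, and an F1 score derived from confusion-matrix counts. It is not the thesis implementation. The two-dimensional points stand in for behavioral feature vectors, the threshold of 1.5 and k = 3 are hypothetical, and the sketch assumes no duplicate points (a duplicate would give a zero reachability sum).

```python
import math

def nearest_neighbours(points, i, k):
    """(distance, index) pairs for the k nearest neighbours of points[i]."""
    pairs = sorted((math.dist(points[i], p), j)
                   for j, p in enumerate(points) if j != i)
    return pairs[:k]

def lof_scores(points, k=3):
    """Local Outlier Factor for every point; values near 1 mean 'inlier'."""
    n = len(points)
    neigh = [nearest_neighbours(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour
    lrd = []  # local reachability density of each point
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    return [sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

def f1_score(predicted, actual):
    """F1 from confusion-matrix counts (True marks a faulty agent)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical feature vectors: five agents close together, one far away.
agents = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (8, 8)]
truth = [False, False, False, False, False, True]
scores = lof_scores(agents, k=3)
flagged = [s > 1.5 for s in scores]  # 1.5 is an assumed threshold
f1 = f1_score(flagged, truth)
```

Varying the threshold and recomputing the F1 score mirrors, in miniature, the kind of threshold experiments the abstract describes.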
|
37 |
Estudo, avaliação e comparação de técnicas de detecção não supervisionada de outliers / Study, evaluation and comparison of unsupervised outlier detection techniques
Campos, Guilherme Oliveira, 05 March 2015
The outlier detection area (or anomaly detection) plays an essential role in discovering patterns in data that can be considered exceptional from some perspective. Detecting such patterns is important in general because, in many data mining applications, such patterns represent extraordinary behaviors that deserve special attention. An important distinction is made between supervised and unsupervised detection techniques. This project focuses on the unsupervised detection techniques. There are dozens of algorithms in this category in the literature and new algorithms are proposed from time to time, but each of them uses its own notion of what should be considered an outlier, which is a subjective concept in the unsupervised context. This considerably complicates the choice of a particular algorithm in a given practical application. While it is common knowledge that no machine learning algorithm can be superior to all others in all application scenarios, it is a relevant question whether the performance of certain algorithms in general tends to dominate that of certain others, at least in particular classes of problems. This project proposes to contribute to the study, selection, and pre-processing of databases that are appropriate to join a benchmark collection for evaluating unsupervised outlier detection algorithms. It is also proposed to comparatively evaluate the performance of outlier detection methods. During part of my master's thesis work, I had the intellectual collaboration of Erich Schubert, Ira Assent, Barbora Micenková, Michael Houle and, especially, Joerg Sander and Arthur Zimek. Their contribution was essential for the analysis of the results and the compact way of presenting them.
|
39 |
Real-time industrial systems anomaly detection with on-edge Tiny Machine Learning
Tiberg, Anton, January 2022
Embedded systems such as microcontrollers have become more powerful and cheaper during the past couple of years. This has led to increasing development of on-edge applications, one of which is anomaly detection using machine learning. This thesis investigates the ability to implement, deploy and run the unsupervised anomaly detection algorithm called Isolation Forest, and its modified version, Mondrian Isolation Forest, on a microcontroller. Both algorithms were successfully implemented and deployed. The regular Isolation Forest algorithm was able to function as an anomaly detection algorithm on both stored data sets and streaming data. However, the Mondrian Isolation Forest was too resource-hungry to function as a proper anomaly detection application.
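For readers unfamiliar with Isolation Forest, the core idea (anomalies are isolated by fewer random splits than normal points) can be sketched in plain Python. This is a desktop illustration of the standard algorithm only, not the thesis's microcontroller code and not the Mondrian variant. The function names, tree count, sample size, and example data are assumptions, and the sketch expects at least two training rows.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def average_path_length(n):
    """Expected path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:  # cannot split on a constant feature
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Depth at which x lands, plus a correction for unsplit leaf rows."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, dim, split, left, right = tree
    child = left if x[dim] < split else right
    return path_length(child, x, depth + 1)

def fit_forest(data, n_trees=100, sample_size=32, seed=0):
    """Build n_trees trees, each on a random subsample of the data."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(size))
    forest = [build_tree(rng.sample(data, size), 0, max_depth, rng)
              for _ in range(n_trees)]
    return forest, size

def anomaly_score(forest, size, x):
    """Score in (0, 1]; higher means easier to isolate, so more anomalous."""
    mean_path = sum(path_length(t, x) for t in forest) / len(forest)
    return 2.0 ** (-mean_path / average_path_length(size))

# Hypothetical data: a dense grid of readings plus one distant reading.
data = [(i / 10, j / 10) for i in range(8) for j in range(8)] + [(10.0, 10.0)]
forest, size = fit_forest(data)
outlier_score = anomaly_score(forest, size, (10.0, 10.0))
inlier_score = anomaly_score(forest, size, (0.4, 0.4))
```

Because scoring is a sum over tree traversals, memory use grows with the number of trees and their depth, which is the kind of resource budget a microcontroller deployment has to respect.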
|
40 |
A Geometric Approach to Visualization of Variability in Univariate and Multivariate Functional Data
Xie, Weiyi, 07 December 2017
No description available.
|