11

Multivariate Functional Data Analysis and Visualization

Qu, Zhuo 11 1900 (has links)
As a branch of statistics, functional data analysis (FDA) studies observations regarded as curves, surfaces, or other objects evolving over a continuum. Although FDA methods and theories have flourished, two issues stand out. First, functional data are commonly assumed to be sampled on a shared time grid; second, methods developed for univariate functional data are difficult to extend to multivariate functional data. After exploring model-based fitting for regularly observed multivariate functional data, we develop new visualization tools, clustering methods, and multivariate functional depths for irregularly observed (sparse) multivariate functional data. The four main chapters of the dissertation are organized as follows. First, median polish for functional multivariate analysis of variance (FMANOVA) is proposed in Chapter 2, implemented with multivariate functional depths. Numerical studies and environmental datasets illustrate the robustness of median polish. Second, the sparse functional boxplot and the intensity sparse functional boxplot, practical exploratory tools that make visualization possible for both complete and sparse functional data, are introduced in Chapter 3. These tools depict sparseness through the proportion of sparseness and the relative intensity of fitted sparse points inside the central region, respectively. Third, a distance-based robust two-layer partition (RTLP) clustering of sparse multivariate functional data is introduced in Chapter 4. RTLP clustering builds on our proposed elastic time distance (ETD), designed specifically for sparse multivariate functional data. Lastly, the multivariate functional integrated depth and the multivariate functional extremal depth, based on multivariate depths, are proposed in Chapter 5.
Global and local formulations of each depth are explored; their theoretical properties are proved, and finite-sample depth estimation for irregularly observed multivariate functional data is investigated. In addition, the simplified sparse functional boxplot and the simplified intensity sparse functional boxplot, which allow visualization without data reconstruction, are introduced. Together, these four extensions make the methods more general and of practical interest in exploratory multivariate functional data analysis.
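The functional depths that power these boxplots generalize simpler univariate notions of centrality. As a minimal illustration (not the dissertation's own implementation), the modified band depth (MBD) of López-Pintado and Romo scores each curve by how often, on average over time, it lies inside the band formed by pairs of curves in the sample:

```python
import numpy as np

def modified_band_depth(curves):
    """Modified band depth (MBD) for curves sampled on a common grid.

    curves: (n, T) array, n curves observed at T shared time points.
    Returns an (n,) array; a higher depth marks a more central curve.
    """
    n, T = curves.shape
    depth = np.zeros(n)
    n_pairs = n * (n - 1) / 2
    for i in range(n):
        inside = 0.0
        for j in range(n):
            for k in range(j + 1, n):
                # Pointwise band spanned by curves j and k
                lo = np.minimum(curves[j], curves[k])
                hi = np.maximum(curves[j], curves[k])
                # Fraction of time points where curve i stays in the band
                inside += np.mean((curves[i] >= lo) & (curves[i] <= hi))
        depth[i] = inside / n_pairs
    return depth
```

The deepest curve (largest MBD) serves as the functional median, and the 50% most central curves define the central region drawn in a functional boxplot; sparse variants additionally track which points of each curve were actually observed.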
12

A Curriculum Guide for Integrating Literary Theory into Twelfth Grade Florida English Language Arts

Philpot, Helen 01 January 2007 (has links)
The goal of this thesis is to provide high school students with a course of study for becoming competent, thorough, lifelong independent readers of complex texts. This is accomplished by integrating literary theory that looks beyond the typical level of analysis emphasized in many Florida classrooms. If adopted and successful, this curriculum guide will help Florida teachers give their students a new level of ability to analyze literature. Prior work on integrating critical theory into high school classrooms was analyzed and synthesized to create a larger course of critical theory study to be completed during the senior year of high school in the state of Florida. The curriculum guide acts as a starting point, providing teachers with the tools necessary to bring literary theory into the high school classroom while maintaining their individual teaching styles. It is divided into four distinct units that follow the most common course of Florida twelfth-grade study, the English canon, with each unit addressing two literary theories. The theories covered are: New Criticism, New Historicism, Feminism, Marxism, Reader Response, Psychoanalysis, Structuralism, and Deconstruction.
13

Unsupervised Anomaly Detection in Numerical Datasets

Joshi, Vineet 05 June 2015 (has links)
No description available.
14

Automated Outlier Detection for Credit Risk KPI Time Series in E-commerce : A Case Study on the Business Value and Obstacles of Automated Outlier Detection / Automatiserad Outlier Detection för Kreditrisk KPI Tidsserier i E-handel

Lindberg, Jennifer January 2022 (has links)
E-commerce has grown significantly over the last decade and made a considerable leap during Covid-19. The final step in e-commerce is payment, and as a result, real-time credit risk management has become increasingly important. An imperative function in credit risk management is underwriting, which decides which purchases to accept and which to decline. However, events can occur that cause acceptance rates, for instance, to rise or fall, and these must be detected in order to maintain good stakeholder relationships. Thus, KPIs are monitored with the aim of detecting outliers as soon as possible. The purpose of this study is to explore the business value of, and obstacles to, automating outlier detection for credit risk KPI time series in e-commerce. In addition, aspects to consider during implementation are investigated. The research is a case study grounded in thematic analysis of qualitative data collected at an e-commerce company. The results show that automation can contribute significant business value, for instance through lower monetary and alternative costs of manual monitoring, the potential for better monitoring quality, and thus enhanced stakeholder relationships. However, the results also imply several obstacles to implementing full automation, such as a lack of trust in the automation, opinions that automation will impair knowledge and communication, and the complexity of the implementation itself.
15

Estudo, avaliação e comparação de técnicas de detecção não supervisionada de outliers / Study, evaluation and comparison of unsupervised outlier detection techniques

Campos, Guilherme Oliveira 05 March 2015 (has links)
Outlier detection (or anomaly detection) plays a fundamental role in discovering patterns in data that can be considered exceptional from some perspective. Detecting such patterns matters because, in many data mining applications, they represent extraordinary behaviors that deserve special attention. An important distinction is drawn between supervised and unsupervised detection techniques; this project focuses on the unsupervised ones. Dozens of algorithms in this category exist in the literature, and new ones are proposed from time to time, but each uses its own notion of what should or should not be considered an outlier, which is a subjective concept in the unsupervised context. This considerably complicates the choice of a particular algorithm for a given practical application. While it is common knowledge that no machine learning algorithm can be superior to all others in all application scenarios, it is a relevant question whether the performance of certain algorithms tends to dominate that of certain others, at least in particular classes of problems. This project contributes to the study, selection, and pre-processing of datasets suitable for joining a benchmark collection for evaluating unsupervised outlier detection algorithms, and comparatively evaluates the performance of outlier detection methods. During part of my master's work, I had the intellectual collaboration of Erich Schubert, Ira Assent, Barbora Micenková, Michael Houle and, above all, Joerg Sander and Arthur Zimek. Their contribution was essential to the analysis of the results and the compact way of presenting them.
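Benchmark studies of this kind typically have each unsupervised detector assign an outlier score to every point and then evaluate the scores against held-out ground-truth labels, most often with ROC AUC. A minimal NumPy sketch of that pipeline (the k-NN-distance detector and the synthetic data here are illustrative, not taken from the thesis):

```python
import numpy as np

def knn_outlier_scores(X, k=5):
    """Outlier score = distance to the k-th nearest neighbor."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # ignore self-distance
    return np.sort(d, axis=1)[:, k - 1]

def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney rank-sum formulation."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)),    # inliers
               rng.uniform(5, 8, (10, 2))])   # planted outliers
y = np.array([0] * 200 + [1] * 10)            # 1 = ground-truth outlier
auc = roc_auc(y, knn_outlier_scores(X))
```

Because the labels are used only for evaluation, never for fitting, the same harness can compare any set of unsupervised detectors on a common collection of labeled benchmark datasets.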
17

Real-time industrial systems anomaly detection with on-edge Tiny Machine Learning

Tiberg, Anton January 2022 (has links)
Embedded systems such as microcontrollers have become more powerful and cheaper over the past few years. This has led to growing development of on-edge applications, one of which is anomaly detection using machine learning. This thesis investigates the ability to implement, deploy, and run the unsupervised anomaly detection algorithm Isolation Forest, and its modified version, Mondrian Isolation Forest, on a microcontroller. Both algorithms were successfully implemented and deployed. The regular Isolation Forest functioned as an anomaly detector on both batch data sets and streaming data. The Mondrian Isolation Forest, however, was too resource-hungry to function as a practical anomaly detection application.
18

A Geometric Approach to Visualization of Variability in Univariate and Multivariate Functional Data

Xie, Weiyi 07 December 2017 (has links)
No description available.
19

Robust mixture modeling

Yu, Chun January 1900 (has links)
Doctor of Philosophy / Department of Statistics / Weixin Yao and Kun Chen / Ordinary least-squares (OLS) estimators for a linear model are very sensitive to unusual values in the design space and to outliers among the y values. Even a single atypical value may have a large effect on the parameter estimates. In this proposal, we first review some available and popular robust techniques, including recently developed ones, and compare them in terms of breakdown point and efficiency. We also use a simulation study and a real data application to compare the performance of existing robust methods under different scenarios. Finite mixture models are widely applied to a variety of random phenomena, but inference for mixture models is challenging when outliers exist in the data, since the traditional maximum likelihood estimator (MLE) is sensitive to outliers. We propose Robust Mixture via Mean shift penalization (RMM) for mixture models and Robust Mixture Regression via Mean shift penalization (RMRM) for mixture regression, to achieve simultaneous outlier detection and parameter estimation. A mean shift parameter is added to the mixture model and penalized by a nonconvex penalty function. With this model setting, we develop an iterative-thresholding-embedded EM algorithm to maximize the penalized objective function. Compared with other existing robust methods, the proposed methods show outstanding performance in both identifying outliers and estimating the parameters.
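The mean-shift mechanism can be illustrated with a toy location-estimation version: each observation gets its own shift parameter gamma_i, and a thresholding step drives gamma_i to zero unless the residual is large, so nonzero shifts flag outliers. This is a simplified sketch of the mechanism only, not the proposed RMM/RMRM estimators, which embed the thresholding inside an EM algorithm for full mixture models with nonconvex penalties:

```python
import numpy as np

def mean_shift_location(y, lam, n_iter=50):
    """Toy mean-shift-penalized location estimate.

    Model: y_i = mu + gamma_i + noise, where gamma_i is nonzero only
    for outliers. Alternates a hard-thresholding step for gamma with
    a mean update for mu. Returns the robust location estimate and a
    boolean outlier flag per observation.
    """
    mu = np.median(y)                     # robust starting point
    gamma = np.zeros_like(y)
    for _ in range(n_iter):
        r = y - mu
        # Hard threshold: absorb only large residuals into gamma
        gamma = np.where(np.abs(r) > lam, r, 0.0)
        # Update mu from the shift-corrected observations
        mu = np.mean(y - gamma)
    return mu, gamma != 0
```

Outliers are absorbed by their gamma_i instead of dragging the estimate of mu, which is the same intuition behind adding a penalized mean shift to each component of a mixture model.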
20

Tackling the Antibiotic Resistant Bacteria Crisis Using Longitudinal Antibiograms

Tlachac, Monica 31 May 2018 (has links)
Antibiotic resistant bacteria, a growing health crisis, arise from antibiotic overuse and misuse. Resistant infections endanger patients' lives and are financially burdensome. Aggregate antimicrobial susceptibility reports, called antibiograms, are critical for tracking antibiotic susceptibility and for evaluating how likely different antibiotics are to treat an infection before patient-specific susceptibility data become available. This research leverages the Massachusetts Statewide Antibiogram database, a rich dataset composed of antibiograms for 754 antibiotic-bacteria pairs collected by the Massachusetts Department of Public Health from 2002 to 2016. However, these antibiograms are at least a year old, meaning antibiotics are prescribed based on outdated data, which unnecessarily furthers resistance. Our objective is to apply data science techniques to these antibiograms to support more responsible antibiotic prescription practices. First, we use model selectors with regression-based techniques to forecast current antimicrobial resistance. Next, we develop an assistant that immediately identifies clinically and statistically significant year-over-year changes in antimicrobial resistance once the most recent year of antibiograms is collected. Lastly, we use k-means clustering on resistance trends to detect antibiotic-bacteria pairs whose trends make forecasting ineffective. Together, these three strategies can guide more responsible antibiotic prescription practices and thus reduce unnecessary increases in antibiotic resistance.
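The clustering step can be sketched with scikit-learn's KMeans applied to annual resistance trajectories, one row per antibiotic-bacteria pair (the trajectories below are hypothetical; the thesis clusters the actual Massachusetts antibiogram trends):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
years = np.arange(2002, 2017)          # 15 annual values per pair
t = years - years[0]

# Hypothetical percent-resistant trajectories in three regimes:
# steadily rising, roughly flat, and steadily declining.
rising = 20 + 2.0 * t + rng.normal(0, 1, (10, 15))
flat = 35 + rng.normal(0, 1, (10, 15))
declining = 60 - 2.0 * t + rng.normal(0, 1, (10, 15))
trends = np.vstack([rising, flat, declining])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(trends)
labels = km.labels_                    # cluster id per pair
```

Each row is treated as a point in a 15-dimensional space, so pairs with similar year-over-year trajectories land in the same cluster; erratic pairs that sit far from every centroid are natural candidates for trends where forecasting will not be effective.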
