21 |
Geodätische Fehlerrechnung mit der skalenkontaminierten Normalverteilung / Geodetic Error Calculus by the Scale Contaminated Normal Distribution (Lehmann, Rüdiger, 22 January 2015)
Geodetic measurement errors are frequently well described by probability distributions that are more sharply peaked than the Gaussian normal distribution. This is especially true when gross errors cannot be entirely excluded. Besides some distributions used so far in geodesy (the generalized normal distribution, Huber's distribution), we discuss the scale contaminated normal distribution, which offers some advantages in practical calculations.
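For illustration, a minimal sketch of the scale contaminated normal model the abstract refers to: a two-component mixture in which a small fraction of errors comes from a normal component with an inflated scale. The mixing weight eps and scale inflation k below are illustrative choices, not values from the thesis.

```python
import numpy as np
from scipy import stats

def scn_pdf(x, sigma=1.0, eps=0.05, k=5.0):
    """Density of a scale-contaminated normal: a mixture of zero-mean
    normals with scales sigma and k*sigma and weights (1-eps), eps."""
    return ((1 - eps) * stats.norm.pdf(x, scale=sigma)
            + eps * stats.norm.pdf(x, scale=k * sigma))

def scn_sample(n, sigma=1.0, eps=0.05, k=5.0, seed=None):
    """Draw n errors; with probability eps an error comes from the
    wide component, mimicking an occasional gross measurement error."""
    rng = np.random.default_rng(seed)
    gross = rng.random(n) < eps
    return rng.normal(0.0, np.where(gross, k * sigma, sigma))
```

Such a mixture is more sharply peaked (and heavier tailed) than a single normal with the same overall variance, which is the property the abstract exploits.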
|
22 |
Multiview 3D Reconstruction of a Scene Containing Independently Moving Objects (Tola, Engin, 01 August 2005)
In this thesis, the structure-from-motion problem for calibrated scenes containing independently moving objects (IMOs) is studied. For this purpose, the overall reconstruction process is partitioned into various stages. The first stage deals with the fundamental problem of estimating structure and motion using only two views. This process starts with finding salient features using a sub-pixel version of the Harris corner detector. The features are matched with the help of a similarity- and neighborhood-based matcher. In order to reject the outliers and estimate the fundamental matrix of the two images, robust estimation is performed via RANSAC and the normalized 8-point algorithm. Two-view reconstruction is finalized by decomposing the fundamental matrix and estimating the 3D point locations by triangulation. The second stage of the reconstruction generalizes the two-view algorithm to the N-view case. This goal is accomplished by first reconstructing an initial framework from the first stage and then relating additional views by finding correspondences between each new view and the already reconstructed views. In this way, 3D-2D projection pairs are determined, and the projection matrix of each new view is estimated by a robust procedure. The final stage deals with scenes containing IMOs. In order to reject the correspondences due to moving objects, a parallax-based rigidity constraint is used. In utilizing this constraint, an automatic background pixel selection algorithm is developed, and an IMO rejection algorithm is also proposed. The results of the proposed algorithm are compared against those of a robust outlier rejection algorithm and found to be quite promising in terms of execution time versus reconstruction quality.
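As a rough sketch of the two-view robust estimation step described above, OpenCV's RANSAC-based fundamental matrix routine (which builds on the normalized 8-point algorithm) can be used; the point arrays and file names here are assumptions for illustration, not artifacts of the thesis.

```python
import numpy as np
import cv2  # OpenCV

# pts1, pts2: Nx2 float arrays of putative correspondences, assumed to
# come from a corner detector plus a similarity/neighborhood matcher.
pts1 = np.load("pts1.npy")  # hypothetical saved matches
pts2 = np.load("pts2.npy")

# Robust estimation of F; the returned mask flags RANSAC inliers.
F, mask = cv2.findFundamentalMat(
    pts1, pts2, cv2.FM_RANSAC,
    ransacReprojThreshold=1.0, confidence=0.99)

inliers1 = pts1[mask.ravel() == 1]
inliers2 = pts2[mask.ravel() == 1]
```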
|
23 |
Robust multivariate mixture regression models (Li, Xiongya, January 1900)
Doctor of Philosophy / Department of Statistics / Weixing Song / In this dissertation, we propose a new robust estimation procedure for two multivariate mixture regression models and apply this novel method to functional mapping of dynamic traits. In the first part, a robust estimation procedure for mixtures of classical multivariate linear regression models is discussed, assuming that the error terms follow a multivariate Laplace distribution. An EM algorithm is developed based on the fact that the multivariate Laplace distribution is a scale mixture of the multivariate standard normal distribution.
The performance of the proposed algorithm is thoroughly evaluated by simulation and comparison studies. In the second part, a similar idea is extended to mixtures of linear mixed regression models by assuming that the random effects and the regression errors jointly follow a multivariate Laplace distribution. Compared with the existing robust t procedure in the literature, simulation studies indicate that the finite-sample performance of the proposed estimation procedure outperforms, or is at least comparable to, the robust t procedure. Moreover, there is no need to determine the degrees of freedom, so the new robust estimation procedure is computationally more efficient than the robust t procedure. The ascent property of both EM algorithms is also proved. In the third part, the proposed robust method is applied to identify quantitative trait loci (QTL) within a functional mapping framework for dynamic traits of agricultural or biomedical interest.
A robust multivariate Laplace mapping framework is proposed to replace the normality assumption. Simulation studies show the proposed method is comparable to the robust multivariate t procedure developed in the literature and outperforms the normal procedure.
As an illustration, the proposed method is also applied to a real data set.
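A minimal sketch of the scale-mixture representation the EM algorithm above builds on: a symmetric multivariate Laplace vector can be generated as a normal vector rescaled by the square root of an independent exponential variable (the function name and defaults are illustrative).

```python
import numpy as np

def rmv_laplace(n, Sigma, seed=None):
    """Sample zero-mean symmetric multivariate Laplace vectors via the
    scale-mixture representation X = sqrt(W) * Z, with W ~ Exp(1)
    independent of Z ~ N(0, Sigma)."""
    rng = np.random.default_rng(seed)
    p = Sigma.shape[0]
    Z = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    W = rng.exponential(1.0, size=n)
    return np.sqrt(W)[:, None] * Z
```

Conditioning on W reduces the model to a weighted normal model, which is what makes the E- and M-steps tractable.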
|
24 |
Modelos elípticos multiníveis / Multilevel elliptical models (Roberto Ferreira Manghi, 08 December 2011)
Multilevel models represent a class of models used to fit data with a hierarchical structure. The present work proposes a generalization of multilevel normal models, named multilevel elliptical models. This proposal suggests the use of probability distributions belonging to the elliptical class, which comprises all symmetric continuous distributions and includes the normal distribution as a particular case. Elliptical distributions may have lighter or heavier tails than the normal distribution. When outlying observations are present, the use of heavy-tailed distributions is suggested in order to obtain a model that better fits the discrepant observations. In this dissertation some aspects of multilevel elliptical models are developed, such as parameter estimation by maximum likelihood, hypothesis tests for the fixed effects and the variance-covariance parameters, and residual analysis to check features related to the fit and the established assumptions.
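As a hedged illustration of a heavy-tailed member of the elliptical class (not code from the dissertation), a multivariate t vector can be generated from a normal vector and an independent chi-squared variable; smaller degrees of freedom nu give heavier tails and hence more tolerance to outlying observations.

```python
import numpy as np

def rmv_t(n, mu, Sigma, nu, seed=None):
    """Sample from a multivariate t distribution, an elliptical
    distribution with heavier tails than the normal:
    X = mu + Z / sqrt(W / nu), Z ~ N(0, Sigma), W ~ chi-squared(nu)."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    Z = rng.multivariate_normal(np.zeros(len(mu)), Sigma, size=n)
    W = rng.chisquare(nu, size=n)
    return mu + Z / np.sqrt(W / nu)[:, None]
```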
|
25 |
Robust Self-Calibration and Fundamental Matrix Estimation in 3D Computer Vision (Rastgar, Houman, January 2013)
The recent advances in the field of computer vision have brought many laboratory algorithms into the realm of industry. However, one problem that still remains open in the field of 3D vision is the problem of noise. The challenging problem of 3D structure recovery from images is highly sensitive to input data contaminated by errors that do not conform to ideal assumptions. Tackling the problem of extreme data, or outliers, has led to many robust methods in the field that are able to handle moderate levels of outliers and still provide accurate outputs. However, this problem remains open, especially at higher noise levels, and so it has been the goal of this thesis to address the issue of robustness with respect to two central problems in 3D computer vision. The two problems are highly related, and they are presented together within a structure-from-motion (SfM) context. The first is the problem of robustly estimating the fundamental matrix from images whose correspondences contain high outlier levels. Even though this area has been extensively studied, two algorithms are proposed that significantly speed up the computation of the fundamental matrix and achieve accurate results in scenarios containing more than 50% outliers. The presented algorithms rely on ideas from the field of robust statistics to develop guided sampling techniques driven by information inferred from residual analysis. The second problem addressed in this thesis is the robust estimation of camera intrinsic parameters from fundamental matrices, or self-calibration. Self-calibration algorithms are notoriously unreliable in general cases, and it is shown that the existing methods are highly sensitive to noise. In spite of this, robustness in self-calibration has received little attention in the literature. Experimental results show that it is essential for a real-world self-calibration algorithm to be robust. In order to introduce robustness to the existing methods, three robust algorithms are proposed that utilize existing constraints for self-calibration from the fundamental matrix but are less affected by noise than the existing algorithms based on these constraints. This is an important milestone, since self-calibration offers many possibilities by providing estimates of the camera parameters without requiring access to the image acquisition device. The proposed algorithms rely on perturbation theory, guided sampling methods, and a robust root finding method for systems of higher-order polynomials. By adding robustness to self-calibration, it is hoped that this idea is one step closer to being a practical method of camera calibration rather than merely a theoretical possibility.
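As an illustrative fragment (an assumption about the kind of residual analysis involved, not the thesis's exact procedure), guided sampling schemes for fundamental matrix estimation typically rank correspondences by a first-order geometric residual such as the Sampson distance:

```python
import numpy as np

def sampson_residuals(F, x1, x2):
    """Sampson (first-order geometric) residuals of correspondences
    under a fundamental matrix F; x1, x2 are 3xN arrays of points in
    homogeneous coordinates. Small residuals suggest likely inliers."""
    Fx1 = F @ x1
    Ftx2 = F.T @ x2
    num = np.sum(x2 * Fx1, axis=0) ** 2
    den = Fx1[0]**2 + Fx1[1]**2 + Ftx2[0]**2 + Ftx2[1]**2
    return num / den
```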
|
26 |
Estimation robuste pour des distributions à queue lourde / Robust estimation of heavy-tailed distributions (Joly, Emilien, 14 December 2015)
In this thesis, we are interested in estimating the mean of heavy-tailed random variables. We focus on robust estimation of the mean as an alternative to the classical empirical mean. The goal is to develop sub-Gaussian concentration inequalities for the estimation error. In other words, we seek the strong concentration results usually obtained for bounded random variables in a setting where the boundedness assumption is replaced by a finite-variance assumption. Two existing estimators of the mean of a real-valued random variable are invoked and their concentration results are recalled. Several new higher-dimensional adaptations are discussed. Using these estimators, we introduce a new version of empirical risk minimization for heavy-tailed random variables, and some applications of this technique are developed. These results are illustrated by simulations on artificial data samples. Lastly, we study the multivariate case in the U-statistics context, where the previous estimators once again offer a natural generalization of estimators in the literature.
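One standard estimator with the sub-Gaussian guarantees described above is the median-of-means; a minimal sketch (block count and names are illustrative):

```python
import numpy as np

def median_of_means(x, n_blocks=8, seed=None):
    """Median-of-means estimate of E[X]: shuffle the sample, split it
    into blocks, average within each block, and return the median of
    the block means. Under a finite-variance assumption alone, its
    deviations are sub-Gaussian in the number of blocks."""
    rng = np.random.default_rng(seed)
    x = rng.permutation(np.asarray(x, dtype=float))
    blocks = np.array_split(x, n_blocks)
    return float(np.median([b.mean() for b in blocks]))
```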
|
27 |
L1 regrese / L1 Regression (Čelikovská, Klára, January 2020)
This thesis is focused on L1 regression, a possible alternative to ordinary least squares regression. L1 regression replaces least squares estimation with least absolute deviations estimation, thus generalizing the sample median in the linear regression model. Unlike ordinary least squares regression, L1 regression enables the loosening of certain assumptions and leads to more robust estimates. Fundamental theoretical results, including the asymptotic distribution of the regression coefficient estimates, hypothesis testing, confidence intervals and confidence regions, are derived. This method is then compared to ordinary least squares regression in a simulation study, with a focus on heavy-tailed distributions and the possible presence of outlying observations.
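A minimal sketch of a least absolute deviations fit via iteratively reweighted least squares (one common way to compute the L1 estimator; linear programming is the exact alternative, and the smoothing constant delta is an illustrative choice):

```python
import numpy as np

def lad_irls(X, y, n_iter=50, delta=1e-8):
    """Approximate argmin_beta sum |y_i - x_i' beta| by repeatedly
    solving a weighted least squares problem with weights
    1 / max(|residual|, delta)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS starting value
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), delta)
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)  # weighted normal equations
    return beta
```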
|
28 |
Robust Approaches for Matrix-Valued Parameters (Jing, Naimin, January 2021)
Modern large data sets inevitably contain outliers that deviate from the model assumptions. However, many widely used estimators, such as maximum likelihood estimators and least squares estimators, perform poorly in the presence of outliers. At the same time, many statistical modeling approaches have matrices as parameters. We consider penalized estimators for matrix-valued parameters with a focus on their robustness properties in the presence of outliers. We propose a general framework for robust modeling with matrix-valued parameters by minimizing robust loss functions with penalization. This approach, however, poses challenges in both computation and theoretical analysis. To tackle the computational challenges arising from the large size of the data, the non-smoothness of robust loss functions, and the slow speed of matrix operations, we propose to apply the Frank-Wolfe algorithm, a first-order algorithm for optimization over a constrained region with a low computational burden per iteration. Theoretically, we establish finite-sample error bounds under high-dimensional settings. We show that the estimation errors are bounded by small terms and converge in probability to zero under mild conditions in a neighborhood of the true model. Our method accommodates a broad class of modeling problems using robust loss functions with penalization. Concretely, we study three cases: matrix completion, multivariate regression, and network estimation. For all cases, we illustrate the robustness of the proposed method both theoretically and numerically. / Statistics
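As a hedged sketch of the kind of computation involved (not the thesis's exact algorithm), Frank-Wolfe applied to robust matrix completion with a Huber loss over a nuclear-norm ball needs only the top singular pair of the gradient at each iteration:

```python
import numpy as np
from scipy.sparse.linalg import svds

def fw_robust_completion(Y, mask, tau, c=1.0, n_iter=200):
    """Minimize the Huber loss on observed entries (mask == 1) subject
    to ||Theta||_* <= tau, using Frank-Wolfe with the standard
    2/(t+2) step size."""
    Theta = np.zeros_like(Y, dtype=float)
    for t in range(n_iter):
        R = (Theta - Y) * mask
        G = np.clip(R, -c, c)                 # gradient of the Huber loss
        u, s, vt = svds(G, k=1)               # top singular pair of G
        S = -tau * np.outer(u[:, 0], vt[0])   # linear minimization oracle
        gamma = 2.0 / (t + 2.0)
        Theta = (1 - gamma) * Theta + gamma * S
    return Theta
```

The rank-one oracle is what keeps the per-iteration cost low relative to methods that require a full singular value decomposition.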
|
29 |
Relational Outlier Detection: Techniques and Applications (Lu, Yen-Cheng, 10 June 2021)
Nowadays, outlier detection has attracted growing interest. Unlike typical outlier detection problems, relational outlier detection focuses on detecting abnormal patterns in datasets that contain relational implications within each data point. Furthermore, unlike traditional outlier detection, which focuses only on numerical data, modern outlier detection models must be able to handle data of various types and structures. Detecting relational outliers should consider (1) dependencies among different data types, (2) data types that are not continuous or do not have ordinal characteristics, such as binary, categorical or multi-label data, and (3) special structures in the data. This thesis focuses on the development of relational outlier detection methods and real-world applications for datasets that contain non-numerical, mixed-type, and specially structured data in three tasks, namely (1) outlier detection in mixed-type data, (2) categorical outlier detection in music genre data, and (3) outlier detection in categorized time series data.
For the first task, existing solutions for mixed-type data focus largely on computational efficiency, and their strategies are mostly heuristic-driven, lacking a statistical foundation. The proposed contributions of our work include: (1) Constructing a novel unsupervised framework based on a robust generalized linear model (GLM), (2) Developing a model that is capable of capturing large variances of outliers and dependencies among mixed-type observations, and designing an approach for approximating the analytically intractable Bayesian inference, and (3) Conducting extensive experiments to validate its effectiveness and efficiency.
For the second task, we extended and applied the modeling strategy to a real-world problem. The existing solutions to this specific task are mostly supervised, and traditional outlier detection methods focus only on detecting outliers from the data distribution, ignoring the input-output relation between the genres and the extracted features. The proposed contributions of our work for this task include: (1) Proposing an unsupervised outlier detection framework for music genre data, (2) Extending the GLM-based model from the first task to handle categorical responses and developing an approach to approximate the analytically intractable Bayesian inference, and (3) Conducting experiments to demonstrate that the proposed method outperforms the benchmark methods.
For the third task, we focused on improving the outlier detection performance in the second task by proposing a novel framework and expanded the research scope to general categorized time-series data. Existing studies have suggested a large number of methods for automatic time series classification. However, there is a lack of research focusing on detecting outliers from manually categorized time series. The proposed contributions of our work for this task include: (1) Proposing a novel semi-supervised robust outlier detection framework for categorized time-series datasets, (2) Further extending the new framework to an active learning system that takes user insights into account, and (3) Conducting a comprehensive set of experiments to demonstrate the performance of the proposed method in real-world applications. / Doctor of Philosophy / In recent years, outlier detection has been one of the most important topics in the data mining and machine learning research domain. Unlike typical outlier detection problems, relational outlier detection focuses on detecting abnormal patterns in datasets that contain relational implications within each data point. Detecting relational outliers should consider (1) Dependencies among different data types, (2) Data types that are not continuous or do not have ordinal characteristics, such as binary, categorical or multi-label, and (3) Special structures in the data. This thesis focuses on the development of relational outlier detection methods and real-world applications in datasets that contain non-numerical, mixed-type, and special structure data in three tasks, namely (1) outlier detection in mixed-type data, (2) categorical outlier detection in music genre data, and (3) outlier detection in categorized time series data. The first task aims on constructing a novel unsupervised framework, developing a model that is capable of capturing the normal pattern and the effects, and designing an approach for model fitting. In the second task, we further extended and applied the modeling strategy to a real-world problem in the music technology domain. For the third task, we expanded the research scope from the previous task to general categorized time-series data, and focused on improving the outlier detection performance by proposing a novel semi-supervised framework.
|
30 |
Robust Estimation of Autoregressive Conditional Duration Models (El, Sebai S Rola, 10 1900)
In this thesis, we apply the Ordinary Least Squares (OLS) and Generalized Least Squares (GLS) methods to the estimation of Autoregressive Conditional Duration (ACD) models, as opposed to the typical approach of Quasi Maximum Likelihood Estimation (QMLE).

The advantages of OLS and GLS as the underlying methods of estimation lie in their theoretical ease and computational convenience. The latter property is crucial for high-frequency trading, where a transaction decision needs to be made within a minute. We show that both OLS and GLS estimates are asymptotically consistent and normally distributed. The normal approximation does not seem to be satisfactory in small samples, so we also apply the residual bootstrap to construct confidence intervals based on the OLS and GLS estimates. The properties of the proposed methods are illustrated with intensive numerical simulations as well as by a case study on the IBM transaction data. / Master of Science (MSc)
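For context, a minimal simulation of the ACD(1,1) recursion underlying the thesis (parameter values are illustrative; the thesis's OLS/GLS estimators themselves are not reproduced here):

```python
import numpy as np

def simulate_acd(n, omega=0.1, alpha=0.1, beta=0.8, seed=None):
    """Simulate ACD(1,1) durations x_i = psi_i * eps_i, where
    psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1} and eps_i are
    i.i.d. unit-mean exponential innovations."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    psi = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    x_prev = psi
    for i in range(n):
        psi = omega + alpha * x_prev + beta * psi
        x[i] = psi * rng.exponential(1.0)
        x_prev = x[i]
    return x
```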
|