Global ETD Search

1	Adaptive Robust Regression Approaches in data analysis and their Applications Zhang, Zongjun January 2015 No description available. Statistics Adaptive Robust M-Estimator tuning constant tail weight index
2	Semi-Supervised Half-Quadratic Nonnegative Matrix Factorization for Face Recognition Alghamdi, Masheal M. 05 1900 Face recognition is a challenging problem in computer vision. Slight differences between similar faces of different people, changes in facial expression, lighting and illumination conditions, and pose variations all complicate face recognition research. Many algorithms address the problem, among which the family of nonnegative matrix factorization (NMF) algorithms has been widely used as a compact data representation method. Different versions of NMF have been proposed. Wang et al. proposed the graph-based semi-supervised nonnegative learning (S2N2L) algorithm, which uses labeled data to construct intrinsic and penalty graphs that enforce separability of the labeled data, leading to greater discriminating power. Moreover, the geometrical structure of labeled and unlabeled data is preserved under the smoothness assumption by creating a similarity graph that conserves the neighboring information for all labeled and unlabeled data. However, S2N2L is sensitive to light changes, illumination, and partial occlusion. In this thesis, we propose a Semi-Supervised Half-Quadratic NMF (SSHQNMF) algorithm that combines the benefits of S2N2L and robust NMF by half-quadratic minimization (HQNMF). Our algorithm improves upon S2N2L by replacing the Frobenius norm with a robust M-estimator loss function. A multiplicative update solution for SSHQNMF is derived using half-quadratic (HQ) theory. 
Extensive experiments on the ORL, Yale-A, and a subset of the PIE data sets investigate nine M-estimator loss functions for both the SSHQNMF and HQNMF algorithms, and compare them with several state-of-the-art supervised and unsupervised algorithms, along with the original S2N2L algorithm, in the context of classification, clustering, and robustness against partial occlusion. The proposed algorithm outperformed the other algorithms. Furthermore, SSHQNMF with the Maximum Correntropy (MC) loss function obtained the best results for most test cases. Semi-Supervised NMF Face recognition M-estimator Half-Quadratic Graph embedding
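The half-quadratic idea mentioned in this abstract can be illustrated on a much smaller problem than NMF. The sketch below is not the SSHQNMF update; it is a scalar toy, assuming the Welsch (correntropy-type) loss 1 - exp(-e^2 / (2 sigma^2)), whose multiplicative half-quadratic weight is exp(-e^2 / (2 sigma^2)). The function names, the bandwidth `sigma`, and the sample data are illustrative assumptions.

```python
import math

def correntropy_weight(residual, sigma=1.0):
    """Half-quadratic weight for the Welsch / correntropy-type loss."""
    return math.exp(-(residual ** 2) / (2.0 * sigma ** 2))

def robust_location(values, sigma=1.0, iters=50):
    """Alternate between updating weights and solving weighted least squares."""
    estimate = sorted(values)[len(values) // 2]  # start near the median
    for _ in range(iters):
        weights = [correntropy_weight(v - estimate, sigma) for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate

# Five observations near 1.0 plus one gross outlier (hypothetical data).
data = [1.0, 1.1, 0.9, 1.05, 0.95, 25.0]
plain_mean = sum(data) / len(data)
robust = robust_location(data)
print(plain_mean, robust)
```

The outlier drags the ordinary mean far from the bulk of the data, while the reweighted estimate stays close to 1.0 because the outlier's weight is vanishingly small.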
3	New results in detection, estimation, and model selection Ni, Xuelei 08 December 2005 This thesis contains two parts: the detectability of convex sets and the study of regression models. In the first part of this dissertation, we investigate the problem of the detectability of an inhomogeneous convex region in a Gaussian random field. The first proposed detection method relies on checking a constructed statistic on each convex set within an n × n image, which is proven to be inapplicable. We then consider using h(v)-parallelograms as the surrogate, which leads to a multiscale strategy. We prove that 2/9 is the minimum proportion of the maximally embedded h(v)-parallelogram in a convex set. This constant indicates the effectiveness of the multiscale detection method. In the second part, we study robustness, optimality, and computation for regression models. First, for robustness, we analyze M-estimators in a regression model where the residuals have an unknown but stochastically bounded distribution. An asymptotic minimax M-estimator (RSBN) is derived. Simulations demonstrate its robustness and advantages. Second, for optimality, the analysis of least angle regression inspired us to consider the conditions under which a vector is the solution of two optimization problems. Of these two problems, one can be solved by certain stepwise algorithms, while the other is the objective function in many existing subset selection criteria (including Cp, AIC, BIC, MDL, RIC, etc.). The latter is proven to be NP-hard. We derive several conditions that tell us when a vector is the common optimizer. Finally, extending this idea of finding conditions to exhaustive subset selection in regression, we improve the widely used leaps-and-bounds algorithm (Furnival and Wilson). 
The proposed method further reduces the number of subsets that must be considered in the exhaustive subset search by considering not only the residuals, but also the model matrix and the current coefficients. Detectability Convex sets Variable selection Least angle regressions Leaps and bounds M-estimator Regression analysis Convex sets Mathematical analysis
4	Spatial Pattern of Yield Distributions: Implications for Crop Insurance Annan, Francis 11 August 2012 Despite the potential benefits of larger datasets for crop insurance ratings, pooling yields with similar distributions is not a common practice. The current USDA-RMA county insurance ratings do not consider information across state lines, a politically driven assumption that ignores a wealth of climate and agronomic evidence suggesting that growing regions are not constrained by state boundaries. We test the appropriateness of this assumption and provide empirical grounds for the benefits of pooling datasets. We find evidence in favor of pooling across state lines, with poolable counties sometimes being as far as 2,500 miles apart. An out-of-sample performance exercise suggests that our proposed pooling framework outperforms a no-pooling alternative, and supports the hypothesis that economic losses should be expected as a result of not adopting our pooling framework. Our findings have strong empirical and policy implications for the accurate modeling of yield distributions and hence for the rating of crop insurance products. hypothesis testing loss ratio indemnities seemingly unrelated regression mean squared error robust M-estimator linear spline function
5	Second-order Least Squares Estimation in Generalized Linear Mixed Models Li, He 06 April 2011 Maximum likelihood is a ubiquitous method for estimating generalized linear mixed models (GLMM). However, the method entails computational difficulties and relies on the normality assumption for random effects. We propose a second-order least squares (SLS) estimator based on the first two marginal moments of the response variables. The proposed estimator is computationally feasible and requires fewer distributional assumptions than the maximum likelihood estimator. To overcome the numerical difficulties of minimizing an objective function that involves multiple integrals, a simulation-based SLS estimator is proposed. We show that the SLS estimators are consistent and asymptotically normally distributed under fairly general conditions in the framework of GLMM. Missing data is almost inevitable in longitudinal studies. Problems arise if the missing data mechanism is related to the response process. This thesis extends the proposed estimators to deal with response data missing at random, either by adapting the inverse probability weight method or by applying the multiple imputation approach. In practice, some of the covariates are not directly observed but are measured with error. It is well known that simply substituting a proxy variable for the unobserved covariate in the model will generally lead to biased and inconsistent estimates. We propose the instrumental variable method for the consistent estimation of GLMM with covariate measurement error. The proposed approach does not need any parametric assumption on the distribution of the unknown covariates. This makes the method less restrictive than other methods that either rely on a parametric distribution of the covariates or estimate the distribution using extra information. 
In the presence of data outliers, it is a concern that the SLS estimators may be vulnerable due to the second-order moments. We investigated the robustness property of the SLS estimators using their influence functions. We showed that the proposed estimators have a bounded influence function and a redescending property so they are robust to outliers. The finite sample performance and property of the SLS estimators are studied and compared with other popular estimators in the literature through simulation studies and real world data examples. Bias reduction Discrete response Influence function Instrumental variable Least squares method Longitudinal data Measurement error M-estimator Mixed effects models Outliers Robustness Simulation-based estimator
7	Least squares estimation for binary decision trees Albrecht, Nadine 14 December 2020 In this thesis, a binary decision tree is used as an approximation of a nonparametric regression curve. The best-fitting decision tree is estimated from data via the least squares method. The thesis investigates how and under which conditions the estimator converges. These asymptotic results are then used to create asymptotic convergence regions. info:eu-repo/classification/ddc/620 ddc:620
8	Robust Registration of ToF and RGB-D Camera Point Clouds / Robust registrering av punktmoln från ToF och RGB-D kamera Chen, Shuo January 2021 This thesis compares three methods for point cloud registration: the M-estimator, BLAVE, and RANSAC. The comparison is performed empirically by applying all the estimators to simulated data with added noise and gross errors, to ToF data, and to RGB-D data. In this comparison, RANSAC is the fastest and most robust estimator. The 2D feature extraction methods Harris corner detector, SIFT, and SURF and the 3D extraction method ISS are also compared on real-world scene data. Among all the extraction methods and across the different data, SIFT extracted the most feature points with the most accurate features. Finally, the ICP algorithm is used to refine the registration result, starting from the estimated initial transform. Robustness Estimator Feature extraction RANSAC BLAVE M-estimator ToF RGB-D Civil Engineering
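For readers unfamiliar with RANSAC, the sampling-and-consensus loop can be sketched on a far simpler model than a rigid point cloud transform. The example below fits a 2D line y = a*x + b; it is an illustrative assumption and not the registration pipeline of this thesis. The function names, threshold, iteration count, and data are all hypothetical.

```python
import random

def fit_line(p, q):
    """Exact line through two points with distinct x coordinates."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

def ransac_line(points, threshold=0.1, iterations=200, seed=0):
    """Keep the two-point model that gathers the largest consensus set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        if p[0] == q[0]:
            continue  # vertical pair cannot be written as y = a*x + b
        slope, intercept = fit_line(p, q)
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers

# Eight points on y = 2x + 1 plus three gross errors (hypothetical data).
points = [(float(x), 2.0 * x + 1.0) for x in range(8)]
points += [(1.0, 20.0), (4.0, -15.0), (7.0, 40.0)]
model, inliers = ransac_line(points)
print(model, len(inliers))
```

In a registration setting the minimal sample would instead be a small set of point correspondences and the model a rigid transform, but the loop structure is the same: sample, fit, count inliers, keep the best.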
9	Tail behaviour analysis and robust regression meets modern methodologies Wang, Bingling 11 March 2024 This thesis presents models and methodologies developed for robust statistics and their applications in various domains. Chapter 2 presents a novel partitioning clustering algorithm based on expectiles. The algorithm forms clusters that adapt to the tail behavior of the cluster distributions, making them more robust. The chapter introduces fixed tau-clustering and adaptive tau-clustering schemes and their applications in the cryptocurrency market and in image segmentation. In Chapter 3, a factor-augmented dynamic model is proposed to analyse the tail behavior of high-dimensional time series. This model extracts latent factors driven by tail events and examines their interaction with macroeconomic variables using a VAR model. This methodology enables impulse-response analysis, out-of-sample predictions, and the study of network effects. The empirical study shows a significant impact of factors driven by financial tail events on macroeconomic variables during different economic periods. Chapter 4 is a pilot analysis of Non Fungible Tokens (NFTs), specifically CryptoPunks. The author investigates clustering among digital assets using various visualization techniques. The clusters identified through regression CNN and UMAP are associated with prices and traits of CryptoPunks. Chapter 5 introduces the construction of a price index called the Digital Art Index (DAI) for the NFT art market. The index is created using hedonic regression combined with robust estimators on the top 10 liquid NFT art collections. It proposes innovative procedures, namely Huberization and DCS-t filtering, to handle outlying price observations and create a robust index. Furthermore, it analyzes price determinants of the NFT market. 
Robust statistics Robust regression M-estimator Expectile Factor analysis VAR model High dimensional time series Huber estimator Clustering Hedonic regression DCS-t filtering 332 Financial economics QH 232 SK 840 QH 244 ddc:332
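Since the clustering scheme in Chapter 2 is built on expectiles, a small sketch of how a sample expectile can be computed may help. The tau-expectile minimizes an asymmetrically weighted squared loss, with weight tau on observations above the estimate and 1 - tau on those below, so it can be found by iteratively reweighted averaging. The code below is a generic illustration under that definition, not the thesis's clustering algorithm; the function name and data are assumptions.

```python
def expectile(values, tau=0.5, iters=200, tol=1e-12):
    """Sample expectile at level tau via iteratively reweighted averaging."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    estimate = sum(values) / len(values)  # the 0.5-expectile is the mean
    for _ in range(iters):
        # Observations above the current estimate get weight tau,
        # the rest get weight 1 - tau.
        weights = [tau if v > estimate else 1.0 - tau for v in values]
        updated = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(updated - estimate) < tol:
            return updated
        estimate = updated
    return estimate

# Hypothetical return series with a heavy right tail.
returns = [-4.0, -1.0, 0.0, 0.5, 1.0, 6.0]
center = expectile(returns, 0.5)
upper = expectile(returns, 0.9)
lower = expectile(returns, 0.1)
print(lower, center, upper)
```

With tau = 0.5 the expectile coincides with the ordinary mean; moving tau toward 0 or 1 shifts the estimate toward the lower or upper tail, which is what lets a tau-based cluster center adapt to tail behavior.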

Search results