11 |
Automated Detection of Surface Defects on Barked Hardwood Logs and Stems Using 3-D Laser Scanned Data
Thomas, Liya (15 November 2006)
This dissertation presents an automated detection algorithm that identifies severe external defects on the surfaces of barked hardwood logs and stems. The detected defects are at least 0.5 inch in height and at least 3 inches in diameter: severe, medium-to-large defects that rise above the log surface. Hundreds of real log defect samples were measured, photographed, and categorized to summarize the main defect features and to build a defect knowledge base. Three-dimensional laser-scanned range data capture the external log shapes and portray bark pattern, defective knobs, and depressions.
The log data are extremely noisy, contain missing values, and include severe outliers induced by loose bark that dangles from the log trunk. Because the circle model is nonlinear and presents both additive and non-additive errors, a new robust generalized M-estimator (GM-estimator) has been developed that differs from those proposed in the statistical literature for linear regression. Circle fitting is performed by standardizing the residuals via scale estimates calculated by means of projection statistics and incorporated into the Huber objective function to bound the influence of the outliers on the estimates. The projection statistics are based on 2-D radial-vector coordinates instead of the row vectors of the Jacobian matrix, as proposed in the statistical literature dealing with linear regression. This approach proves effective in that it makes the GM-estimator influence-bounded and thereby robust against outliers.
Severe defects are identified through the analysis of 3-D log data using decision rules obtained from analyzing the knowledge base. Contour curves are generated from radial distances, which are determined by robust 2-D circle fitting to the log-data cross sections. The algorithm detected 63 of 68 severe defects and falsely identified 10 non-defective regions as defects. Measured by area, the algorithm locates 97.6% of the defect area and falsely labels 1.5% of the total clear area as defective. / Ph. D.
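The dissertation's GM-estimator combines projection statistics with a Huber objective; the sketch below is only a simplified stand-in conveying the same idea of influence-bounded circle fitting to a noisy cross section: an iteratively reweighted algebraic (Kasa-style) fit in which Huber weights, scaled by a robust MAD estimate, downweight loose-bark outliers. The parameterization, the MAD scale, and the toy data are illustrative assumptions, not the method of the dissertation.

```python
import numpy as np

def huber_weights(resid, scale, k=1.345):
    """Huber weights: 1 for small standardized residuals, downweighted beyond k."""
    u = np.abs(resid) / (scale + 1e-12)
    w = np.ones_like(u)
    w[u > k] = k / u[u > k]
    return w

def robust_circle_fit(x, y, n_iter=20):
    """Iteratively reweighted algebraic circle fit with Huber weights.

    Solves x^2 + y^2 = 2*a*x + 2*b*y + c by weighted least squares and recovers
    the center (a, b) and radius sqrt(c + a^2 + b^2); the weights are recomputed
    from robustly scaled radial residuals so that outliers lose influence.
    """
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    rhs = x**2 + y**2
    w = np.ones_like(x)
    for _ in range(n_iter):
        sw = np.sqrt(w)
        sol, *_ = np.linalg.lstsq(sw[:, None] * A, sw * rhs, rcond=None)
        a, b, c = sol
        radius = np.sqrt(max(c + a**2 + b**2, 0.0))
        resid = np.hypot(x - a, y - b) - radius                       # radial residuals
        scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))  # MAD scale
        w = huber_weights(resid, scale)
    return (a, b), radius, resid

# toy cross section: circular log profile with bark noise and loose-bark outliers
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
r = 10.0 + 0.1 * rng.standard_normal(200)
r[:10] += 4.0                                                         # dangling-bark outliers
cx, cy = r * np.cos(theta), r * np.sin(theta)
center, radius, _ = robust_circle_fit(cx, cy)
print("center:", center, "radius:", radius)
```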
|
12 |
Cyclostationarity Feature-Based Detection and Classification
Malady, Amy Colleen (25 May 2011)
Cyclostationarity feature-based (C-FB) detection and classification is a large field of research with promising applications to intelligent receiver design. Cyclostationarity FB classification and detection algorithms have been applied to a breadth of wireless communication signals, analog and digital alike. This thesis investigates existing methods of extracting cyclostationarity features and then presents a novel robust solution that reduces SNR requirements, removes the pre-processing task of estimating occupied signal bandwidth, and achieves classification rates comparable to those of the traditional method using only one-tenth of the observation time. Additionally, this thesis documents the development of a novel low-order treatment of the cyclostationarity present in Continuous Phase Modulation (CPM) signals, which is more practical than using higher-order cyclostationarity.
Results, obtained through MATLAB simulation, demonstrate the improvements enjoyed by FB classifiers and detectors when robust methods of estimating cyclostationarity are used. Additionally, a MATLAB simulation of a CPM C-FB detector confirms that low-order C-FB detection of CPM signals is possible. Finally, suggestions for further research and contribution are made at the conclusion of the thesis. / Master of Science
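The thesis builds on features of this kind; as a point of reference, the sketch below estimates the standard cyclic autocorrelation function (not the thesis's robust estimator) for a toy rectangular-pulse BPSK signal, whose cycle frequencies fall at multiples of the symbol rate. The lag, noise level, and signal parameters are illustrative assumptions.

```python
import numpy as np

def cyclic_autocorrelation(x, alpha, lag):
    """Estimate the cyclic autocorrelation R_x^alpha(lag) of a signal x:
        R_x^alpha(lag) ~ (1/N) * sum_n x[n + lag] * conj(x[n]) * exp(-j*2*pi*alpha*n),
    with alpha in cycles per sample. A clearly nonzero magnitude at alpha != 0
    indicates cyclostationarity at that cycle frequency.
    """
    n = np.arange(len(x) - lag)
    prod = x[n + lag] * np.conj(x[n])
    return np.mean(prod * np.exp(-2j * np.pi * alpha * n))

# toy signal: rectangular-pulse BPSK at 4 samples per symbol in complex noise;
# cycle frequencies sit at multiples of the symbol rate (alpha = k/4)
rng = np.random.default_rng(1)
symbols = rng.choice([-1.0, 1.0], size=2048)
x = np.repeat(symbols, 4) + 0.5 * (rng.standard_normal(8192) + 1j * rng.standard_normal(8192))

for alpha in (0.0, 0.25, 0.30):
    mag = abs(cyclic_autocorrelation(x, alpha, lag=2))
    print(f"alpha = {alpha:0.2f}   |R| = {mag:0.3f}")
```

At lag 2 the estimate is large at alpha = 0 and at the symbol-rate cycle frequency alpha = 0.25, and near zero at a non-cycle frequency such as 0.30.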
|
13 |
A Differential Geometry-Based Algorithm for Solving the Minimum Hellinger Distance Estimator
D'Ambrosio, Philip (28 May 2008)
Robust estimation of statistical parameters is traditionally believed to involve a trade-off between robustness and efficiency. This thesis examines the Minimum Hellinger Distance Estimator (MHDE), which is known to have desirable robustness as well as efficiency properties. The thesis confirms that the MHDE is simultaneously robust against outliers and asymptotically efficient in the univariate location case. Robustness results are then extended to simple linear regression, where the MHDE is shown empirically to have a breakdown point of 50%. A geometric algorithm for computing the MHDE is developed and implemented; it exploits the Riemannian manifold structure of the statistical model to achieve an algorithmic speedup. The MHDE is then applied to an illustrative problem in power system state estimation, with the power system modeled as a structured linear regression problem via a linearized direct-current model. Robustness results in this context are investigated, and future research areas are identified from both a statistical perspective and an algorithm-design standpoint. / Master of Science
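The thesis solves the MHDE with a differential-geometric algorithm; the sketch below is only a brute-force illustration of the estimator itself in the univariate location case: minimize the Hellinger distance between a kernel density estimate and a N(theta, sigma^2) model over theta. The fixed sigma, the grid-based integral, and the grid search over theta are simplifying assumptions.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

def mhde_location(x, sigma=1.0):
    """Minimum Hellinger Distance Estimate of location for a N(theta, sigma^2) model.

    Minimizes the squared Hellinger distance
        H^2(theta) = 1 - integral sqrt(f_hat(t) * f_theta(t)) dt
    between a kernel density estimate f_hat and the model density f_theta,
    with the integral approximated on a fixed grid.
    """
    kde = gaussian_kde(x)
    grid = np.linspace(x.min() - 3 * sigma, x.max() + 3 * sigma, 2000)
    f_hat = kde(grid)
    dt = grid[1] - grid[0]

    def hellinger_sq(theta):
        f_model = norm.pdf(grid, loc=theta, scale=sigma)
        return 1.0 - np.sum(np.sqrt(f_hat * f_model)) * dt

    thetas = np.linspace(x.min(), x.max(), 400)
    h2 = [hellinger_sq(th) for th in thetas]
    return thetas[int(np.argmin(h2))]

# contaminated sample: 90% N(0,1) plus 10% gross outliers near 10
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 1.0, 180), rng.normal(10.0, 1.0, 20)])
print("sample mean:", x.mean())           # pulled toward the outliers
print("MHDE:       ", mhde_location(x))   # stays near 0
```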
|
14 |
Robust and Data-Driven Uncertainty Quantification Methods as Real-Time Decision Support in Data-Driven Models
Algikar, Pooja Basavaraj (05 February 2025)
The growing complexity of, and volume of data in, modern engineering and physical systems require robust frameworks for real-time decision-making. Data-driven models trained on observational data enable faster predictions but face key challenges, including data corruption, bias, limited interpretability, and uncertainty misrepresentation, which can compromise their reliability. Propagating uncertainties from sources such as model parameters and input features is crucial in data-driven models to ensure trustworthy predictions and informed decisions. Uncertainty quantification (UQ) methods are broadly categorized into surrogate-based models, which approximate simulators for speed and efficiency, and probabilistic approaches, such as Bayesian models and Gaussian processes, that inherently capture uncertainty in their predictions. For real-time UQ, leveraging recent data instead of historical records enables more accurate and efficient uncertainty characterization, making the approach inherently data-driven. In dynamical analysis, the Koopman operator represents nonlinear system dynamics as a linear system acting on lifted state functions, enabling data-driven estimation of its finite-dimensional approximation. By analyzing its spectral properties (eigenvalues, eigenfunctions, and modes), the Koopman operator reveals key insights into system dynamics and simplifies control design. However, inherent measurement uncertainty poses challenges for efficient estimation with dynamic mode decomposition and extended dynamic mode decomposition algorithms. This dissertation develops a statistical framework to propagate measurement uncertainties into the elements of the Koopman operator. It also develops robust estimation of model parameters in Gaussian process settings, where observational data are often corrupted. The proposed approaches adapt to evolving data and are process-agnostic, avoiding reliance on predefined source distributions. / Doctor of Philosophy / Modern engineering and scientific systems are increasingly complex and interconnected, operating in environments with significant uncertainties and dynamic changes. Traditional mathematical models and simulations often fall short in capturing the complexity of large-scale, ever-evolving real-world systems, struggling to adapt to dynamic changes and to fully utilize today's data-rich environments. This is especially critical in fields such as renewable-integrated power systems and robotics, where real-time decisions must account for uncertainties in the environment, measurements, and operations. The growing availability of observational data, enabled by advanced sensors and computational tools, has driven a shift toward data-driven approaches. Unlike traditional simulators, these models are faster and learn directly from data. However, their reliability depends on robust methods to quantify and manage uncertainties, as corrupted data, biases, and measurement noise challenge their accuracy. This dissertation focuses on characterizing uncertainties at the source using recent data, instead of relying on assumed distributions or historical data, as is common in the literature. Given that observational data are often corrupted by outliers, the dissertation also develops robust parameter estimation within the Gaussian process setting. A central focus is Koopman operator theory, a transformative framework that converts complex, nonlinear systems into simpler, linear representations.
This research integrates measurement uncertainty quantification into Koopman-based models, providing a metric to assess the reliability of the Koopman operator under measurement noise.
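As background for the Koopman-based part, the sketch below shows a standard (exact) dynamic mode decomposition estimate of a finite-dimensional Koopman approximation from snapshot pairs; it does not include the dissertation's propagation of measurement uncertainty. The toy linear system and noise level are illustrative assumptions.

```python
import numpy as np

def dmd(X, Y, rank=None):
    """Exact DMD: least-squares finite-dimensional Koopman approximation K with
    Y ~ K @ X, where the columns of X and Y are successive state snapshots.

    Returns the reduced operator, its eigenvalues (approximate Koopman spectrum),
    and the DMD modes.
    """
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    if rank is not None:
        U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    K_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)   # projected operator
    eigvals, W = np.linalg.eig(K_tilde)
    modes = Y @ Vh.conj().T @ np.diag(1.0 / s) @ W              # exact DMD modes
    return K_tilde, eigvals, modes

# toy linear system x_{k+1} = A x_k observed with small measurement noise
rng = np.random.default_rng(3)
A = np.array([[0.99, 0.10], [-0.10, 0.99]])          # eigenvalues 0.99 +/- 0.10j
x = np.zeros((2, 200))
x[:, 0] = [1.0, 0.5]
for k in range(199):
    x[:, k + 1] = A @ x[:, k]
x_noisy = x + 1e-3 * rng.standard_normal(x.shape)
_, eigvals, _ = dmd(x_noisy[:, :-1], x_noisy[:, 1:])
print("true eigenvalues:", np.linalg.eigvals(A))
print("DMD eigenvalues: ", eigvals)
```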
|
15 |
Robust mixtures of regression models
Bai, Xiuqin (January 1900)
Doctor of Philosophy / Department of Statistics / Kun Chen and Weixin Yao / This proposal contains two projects that are related to robust mixture models. In the first project, we propose a new robust mixture of regression models (Bai et al., 2012). The existing methods for fitting mixture regression models assume a normal distribution for the error and then estimate the regression parameters by the maximum likelihood estimate (MLE). In this project, we demonstrate that the MLE, like the least squares estimate, is sensitive to outliers and heavy-tailed error distributions. We propose a robust estimation procedure and an EM-type algorithm to estimate the mixture regression models. Using a Monte Carlo simulation study, we demonstrate that the proposed estimation method is robust and works much better than the MLE when there are outliers or the error distribution has heavy tails. In addition, the proposed robust method works comparably to the MLE when there are no outliers and the error is normal.

In the second project, we propose a new robust mixture of linear mixed-effects models. The traditional mixture of linear mixed-effects models, which assumes Gaussian distributions for the random effects and the errors, is sensitive to outliers. We propose a mixture of linear mixed-effects models based on t-distributions to robustify the estimation procedure. An EM algorithm is provided to find the MLE under the assumption of t-distributions for the error terms and random effects. Furthermore, we propose to adaptively choose the degrees of freedom of the t-distribution using the profile likelihood. In the simulation study, we demonstrate that our proposed model works comparably to the traditional estimation method when there are no outliers and the errors and random effects are normally distributed, but works much better when there are outliers or the distributions of the errors and random effects have heavy tails.
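As a rough illustration of the first project's theme (not the authors' exact procedure), the sketch below fits a two-component mixture of linear regressions with t-distributed errors by an EM-type algorithm; the latent scale-mixture weights downweight outlying observations. The fixed degrees of freedom, the naive random initialization, and the particular M-step update for sigma are assumptions made for brevity.

```python
import numpy as np
from scipy.stats import t as t_dist

def mixreg_t_em(X, y, n_comp=2, df=3.0, n_iter=200, seed=0):
    """EM-type algorithm for a mixture of linear regressions with t errors.

    Treating the t error as a scale mixture of normals gives each observation a
    latent precision weight u that shrinks toward zero for large residuals,
    which is what robustifies the fit against outliers.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal((n_comp, p))     # naive random initialization
    sigma = np.ones(n_comp)
    pi = np.full(n_comp, 1.0 / n_comp)

    for _ in range(n_iter):
        # E-step: component responsibilities and scale-mixture weights
        resid = y[:, None] - X @ beta.T                       # (n, n_comp)
        dens = t_dist.pdf(resid / sigma, df) / sigma
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        u = (df + 1.0) / (df + (resid / sigma) ** 2)          # E[precision | data]

        # M-step: weighted least squares within each component
        pi = resp.mean(axis=0)
        for k in range(n_comp):
            w = resp[:, k] * u[:, k]
            sw = np.sqrt(w)
            beta[k], *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
            sigma[k] = np.sqrt(np.sum(w * (y - X @ beta[k]) ** 2) / resp[:, k].sum())
    return beta, sigma, pi

# toy data: two crossing regression lines plus a handful of gross outliers
rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, 300)
X = np.column_stack([np.ones_like(x), x])
comp = rng.integers(0, 2, 300)
y = np.where(comp == 0, 1.0 + 2.0 * x, -1.0 - 2.0 * x) + 0.3 * rng.standard_normal(300)
y[:15] += 15.0                                                # contamination
beta, sigma, pi = mixreg_t_em(X, y)
print("estimated coefficients:\n", beta)
```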
|
16 |
Robust mixture linear EIV regression models by t-distribution
Liu, Yantong (January 1900)
Master of Science / Department of Statistics / Weixing Song / A robust estimation procedure for mixture errors-in-variables linear regression models is proposed in this report, assuming that the error terms follow a t-distribution. The estimation procedure is implemented by an EM algorithm, based on the fact that the t-distribution is a scale mixture of a normal distribution with a Gamma mixing distribution. Finite-sample performance of the proposed algorithm is evaluated through extensive simulation studies. A comparison is also made with the MLE procedure under the normality assumption.
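The EM construction rests on the scale-mixture fact cited above; a quick numerical check of that representation is sketched below (an illustration only, with an arbitrary choice of degrees of freedom): drawing a Gamma(nu/2, rate nu/2) precision and then a conditional normal reproduces a t_nu sample.

```python
import numpy as np
from scipy.stats import t as t_dist, kstest

# Scale-mixture representation of the t-distribution used by the EM algorithm:
# draw W ~ Gamma(shape=nu/2, rate=nu/2), then Z | W ~ N(0, 1/W)  =>  Z ~ t_nu.
rng = np.random.default_rng(5)
nu = 4.0
w = rng.gamma(shape=nu / 2, scale=2.0 / nu, size=100_000)   # rate nu/2 -> scale 2/nu
z = rng.normal(0.0, 1.0 / np.sqrt(w))
print(kstest(z, t_dist(df=nu).cdf))   # KS test should not reject the t_nu fit
```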
|
17 |
Modelos elípticos multiníveis / Multilevel elliptical models
Manghi, Roberto Ferreira (08 December 2011)
Multilevel models represent a class of models used to fit data that have a hierarchical structure. The present work proposes a generalization of multilevel normal models, named multilevel elliptical models. This proposal suggests the use of probability distributions belonging to the elliptical class, thus encompassing all symmetric continuous distributions and including the normal distribution as a particular case. Elliptical distributions may have lighter or heavier tails than the normal distribution. In the presence of outlying observations, the use of heavy-tailed distributions is suggested in order to obtain a model that better accommodates the discrepant observations. In this dissertation, several aspects of multilevel elliptical models are developed, such as parameter estimation by maximum likelihood, hypothesis tests for the fixed effects and the variance-covariance parameters, and residual analysis to check features related to the fit and the established assumptions.
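To make the tail-weight point concrete, the short sketch below (a univariate illustration, not part of the dissertation) compares the probability of an observation falling more than four units from the center under a few symmetric distributions standardized to unit variance; heavier tails assign extreme residuals non-negligible probability, which is why heavy-tailed error laws accommodate outliers better.

```python
import numpy as np
from scipy.stats import norm, t, laplace

# Symmetric distributions standardized to unit variance, compared on tail mass.
dists = {
    "normal": norm(),
    "Student-t (df=3)": t(df=3, scale=np.sqrt(1 / 3)),   # Var = scale^2 * df/(df-2) = 1
    "Laplace": laplace(scale=1 / np.sqrt(2)),            # Var = 2 * scale^2 = 1
}
for name, dist in dists.items():
    print(f"{name:18s} P(|X| > 4) = {2 * dist.sf(4):.2e}")
```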
|
18 |
A New Generation of Mixture-Model Cluster Analysis with Information Complexity and the Genetic EM Algorithm
Howe, John Andrew (01 May 2009)
In this dissertation, we extend several relatively new developments in statistical model selection and data mining in order to improve one of the workhorse statistical tools - mixture modeling (Pearson, 1894). The traditional mixture model assumes data come from several populations of Gaussian distributions; what remains, then, is to determine how many distributions there are, their population parameters, and the mixing proportions. However, real data often do not fit the restrictions of normality very well. It is likely that data from a single population exhibiting either asymmetrical or nonnormal tail behavior could be erroneously modeled as two populations, resulting in suboptimal decisions. To avoid these pitfalls, we develop the mixture model under a broader distributional assumption by fitting a group of multivariate elliptically-contoured distributions (Anderson and Fang, 1990; Fang et al., 1990). Special cases include the multivariate Gaussian and power exponential distributions, as well as the multivariate generalization of the Student’s t. This gives us the flexibility to model nonnormal tail and peak behavior, though the symmetry restriction still exists. The literature has many examples of research generalizing the Gaussian mixture model to other distributions (Farrell and Mersereau, 2004; Hasselblad, 1966; John, 1970a), but our effort is more general.

Further, we generalize the mixture model to be non-parametric by developing two types of kernel mixture model. First, we generalize the mixture model to use truly multivariate kernel density estimators (Wand and Jones, 1995). Additionally, we develop the power exponential product kernel mixture model, which allows the density to adjust to the shape of each dimension independently. Because kernel density estimators enforce no functional form, both of these methods can adapt to nonnormal asymmetric, kurtotic, and tail characteristics.

Over the past two decades or so, evolutionary algorithms have grown in popularity, as they have provided encouraging results in a variety of optimization problems. Several authors have applied the genetic algorithm - a subset of evolutionary algorithms - to mixture modeling, including Bhuyan et al. (1991), Krishna and Murty (1999), and Wicker (2006). These procedures have the benefit that they bypass computational issues that plague the traditional methods. We extend these initialization and optimization methods by combining them with our updated mixture models. Additionally, we “borrow” results from robust estimation theory (Ledoit and Wolf, 2003; Shurygin, 1983; Thomaz, 2004) in order to data-adaptively regularize population covariance matrices. Numerical instability of the covariance matrix can be a significant problem for mixture modeling, since estimation is typically done on a relatively small subset of the observations.

We likewise extend various information criteria (Akaike, 1973; Bozdogan, 1994b; Schwarz, 1978) to the elliptically-contoured and kernel mixture models. Information criteria guide model selection and estimation based on various approximations to the Kullback-Leibler divergence. Following Bozdogan (1994a), we use these tools to sequentially select the best mixture model, select the best subset of variables, and detect influential observations - all without making any subjective decisions. Over the course of this research, we developed a full-featured Matlab toolbox (M3) that implements all the new developments in mixture modeling presented in this dissertation.
We show results on both simulated and real-world datasets.

Keywords: mixture modeling, nonparametric estimation, subset selection, influence detection, evidence-based medical diagnostics, unsupervised classification, robust estimation.
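As a minimal point of comparison for the information-criterion-driven selection described above, the sketch below fits Gaussian mixtures with one to six components and keeps the number of components with the lowest criterion value. It uses BIC and scikit-learn's GaussianMixture rather than the dissertation's elliptically-contoured or kernel mixtures, information-complexity criteria, or genetic EM; the library choice and the toy three-cluster data are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit mixtures with k = 1..6 components and let an information criterion choose k.
rng = np.random.default_rng(6)
data = np.vstack([
    rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], 300),
    rng.multivariate_normal([5, 5], [[1.5, -0.4], [-0.4, 0.8]], 200),
    rng.multivariate_normal([0, 6], [[0.5, 0.0], [0.0, 0.5]], 150),
])
scores = {}
for k in range(1, 7):
    gm = GaussianMixture(n_components=k, covariance_type="full", random_state=0).fit(data)
    scores[k] = gm.bic(data)                      # lower is better
best_k = min(scores, key=scores.get)
print(scores, "-> selected k =", best_k)
```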
|
19 |
Analysis of independent motion detection in 3D scenes
Floren, Andrew William (30 October 2012)
In this thesis, we develop an algorithm for detecting independent motion in real time from 2D image sequences of arbitrarily complex 3D scenes. We discuss the necessary background in image formation, optical flow, multiple view geometry, robust estimation, and real-time camera and scene pose estimation for constructing and understanding the operation of our algorithm. Furthermore, we provide an overview of existing independent motion detection techniques and compare them to our proposed solution. Unfortunately, the existing techniques were not evaluated quantitatively, nor was their source code made publicly available, so direct comparisons are not possible. Instead, we constructed several comparison algorithms that should have performance comparable to these previous approaches. We developed methods for quantitatively comparing independent motion detection algorithms and found that our solution had the best performance. By establishing a method for quantitatively evaluating these algorithms and publishing our results, we hope to foster better research in this area and help future investigators more quickly advance the state of the art.
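The thesis's own algorithm is not reproduced here; as a generic baseline built from two of the ingredients it lists (multiple view geometry and robust estimation), the sketch below uses OpenCV, assumed available, to robustly fit a fundamental matrix to tracked points between two frames and flags tracks with large Sampson error as candidate independently moving points. The threshold and the Sampson-distance test are illustrative choices.

```python
import numpy as np
import cv2

def flag_independent_motion(pts_prev, pts_next, thresh=2.0):
    """Flag point tracks inconsistent with the dominant (camera-induced) motion.

    pts_prev, pts_next: (N, 2) float32 arrays of matched feature locations in
    consecutive frames. The fundamental matrix explains image motion caused by
    camera egomotion in a rigid scene; tracks violating the epipolar constraint
    are candidates for independently moving objects.
    """
    F, _inliers = cv2.findFundamentalMat(
        pts_prev, pts_next, method=cv2.FM_RANSAC,
        ransacReprojThreshold=thresh, confidence=0.99)
    if F is None:
        return np.zeros(len(pts_prev), dtype=bool)
    # Sampson distance of each correspondence to the estimated epipolar geometry
    p1 = np.column_stack([pts_prev, np.ones(len(pts_prev))])
    p2 = np.column_stack([pts_next, np.ones(len(pts_next))])
    Fp1 = p1 @ F.T                                # rows are F @ p1_i
    Ftp2 = p2 @ F                                 # rows are F.T @ p2_i
    num = np.sum(p2 * Fp1, axis=1) ** 2
    den = Fp1[:, 0] ** 2 + Fp1[:, 1] ** 2 + Ftp2[:, 0] ** 2 + Ftp2[:, 1] ** 2
    sampson = num / den
    return sampson > thresh ** 2
```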
|
20 |
Empirical Likelihood Confidence Intervals for the Population Mean Based on Incomplete Data
Valdovinos Alvarez, Jose Manuel (09 May 2015)
The use of doubly robust estimators is key to estimating the population mean response in the presence of incomplete data. Cao et al. (2009) proposed an alternative doubly robust estimator that exhibits strong performance compared to existing estimation methods. In this thesis, we apply the jackknife empirical likelihood, the jackknife empirical likelihood with nuisance parameters, the profile empirical likelihood, and an empirical likelihood method based on the influence function to make inferences about the population mean. We use these methods to construct confidence intervals for the population mean and compare the coverage probabilities and interval lengths obtained with both the "usual" doubly robust estimator and the alternative estimator proposed by Cao et al. (2009). An extensive simulation study is carried out to compare the different methods. Finally, the proposed methods are applied to two real data sets.
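For context, the sketch below computes the "usual" augmented inverse-probability-weighted (doubly robust) point estimate of the mean under missingness at random; it is not Cao et al.'s (2009) enhanced estimator and does not construct the empirical-likelihood intervals studied in the thesis. The logistic propensity model, linear outcome model, and scikit-learn dependency are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_mean(X, y, observed):
    """Augmented IPW ("usual" doubly robust) estimate of E[Y] with Y missing at random.

    mu_hat = mean( R*Y/pi_hat - (R/pi_hat - 1) * m_hat(X) ),
    consistent if either the propensity model pi_hat or the outcome model m_hat
    is correctly specified.
    """
    pi_hat = LogisticRegression().fit(X, observed).predict_proba(X)[:, 1]
    m_hat = LinearRegression().fit(X[observed], y[observed]).predict(X)
    r = observed.astype(float)
    y_filled = np.where(observed, y, 0.0)          # missing responses never enter (r = 0)
    return np.mean(r * y_filled / pi_hat - (r / pi_hat - 1.0) * m_hat)

# toy missing-at-random example: true population mean is 1.0
rng = np.random.default_rng(7)
n = 5000
X = rng.standard_normal((n, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.standard_normal(n)
p_obs = 1.0 / (1.0 + np.exp(-(0.5 + X[:, 0])))     # response probability depends on X
observed = rng.uniform(size=n) < p_obs
print("AIPW estimate:", aipw_mean(X, y, observed))
```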
|