31 |
Robust mixture modeling / Yu, Chun / January 1900
Doctor of Philosophy / Department of Statistics / Weixin Yao and Kun Chen / Ordinary least-squares (OLS) estimators for a linear model are very sensitive to unusual values in the design space or outliers among the y values. Even a single atypical value may have a large effect on the parameter estimates. In this proposal, we first review and describe some popular robust techniques, including some recently developed ones, and compare them in terms of breakdown point and efficiency. We also use a simulation study and a real data application to compare the performance of existing robust methods under different scenarios. Finite mixture models are widely applied to a variety of random phenomena. However, inference for mixture models is challenging when outliers exist in the data. The traditional maximum likelihood estimator (MLE) is sensitive to outliers. We propose Robust Mixture via Mean shift penalization (RMM) for mixture models and Robust Mixture Regression via Mean shift penalization (RMRM) for mixture regression, to achieve simultaneous outlier detection and parameter estimation. A mean shift parameter is added to the mixture model and penalized by a nonconvex penalty function. Under this model setting, we develop an iterative thresholding embedded EM algorithm to maximize the penalized objective function. Compared with other existing robust methods, the proposed methods show outstanding performance in both identifying outliers and estimating the parameters.
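The mean shift idea can be illustrated in a much simpler setting than the mixture models of the abstract. The sketch below is a deliberate simplification and not the RMM/RMRM procedure: it assumes a single regression line instead of a mixture, uses ordinary least squares instead of an EM step, and applies hard thresholding (which corresponds to an L0-type nonconvex penalty) with a hand-picked threshold `lam`. The function names `fit_line` and `mean_shift_fit` and the toy data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mean_shift_fit(x, y, lam, n_iter=50):
    """Alternate between fitting the line and thresholding the shifts.

    Model (assumed for illustration): y_i = a + b*x_i + gamma_i + noise,
    where gamma_i is nonzero only for outliers.
    """
    gamma = [0.0] * len(y)
    for _ in range(n_iter):
        # Fit the line to the responses with the current shifts removed.
        a, b = fit_line(x, [yi - gi for yi, gi in zip(y, gamma)])
        # Hard-threshold the raw residuals to update the shifts.
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        gamma = [r if abs(r) > lam else 0.0 for r in resid]
    return a, b, gamma


# Toy data: an exact line y = 1 + 2x with one gross outlier at index 25.
x = [i / 49 for i in range(50)]
y = [1.0 + 2.0 * xi for xi in x]
y[25] += 20.0

a, b, gamma = mean_shift_fit(x, y, lam=3.0)
flagged = [i for i, g in enumerate(gamma) if g != 0.0]
print(a, b, flagged)
```

Observations with a nonzero entry in `gamma` are the ones flagged as outliers, so detection and estimation happen in the same loop. In practice the threshold would need to be tuned to the noise scale, and a poor starting fit can mask outliers.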
|
32 |
Robust mixtures of regression models / Bai, Xiuqin / January 1900
Doctor of Philosophy / Department of Statistics / Kun Chen and Weixin Yao / This proposal contains two projects related to robust mixture models. In the first project, we propose a new robust mixture of regression models (Bai et al., 2012). The existing methods for fitting mixture regression models assume a normal distribution for the error and then estimate the regression parameters by the maximum likelihood estimate (MLE). In this project, we demonstrate that the MLE, like the least squares estimate, is sensitive to outliers and heavy-tailed error distributions. We propose a robust estimation procedure and an EM-type algorithm to estimate the mixture regression models. Using a Monte Carlo simulation study, we demonstrate that the proposed estimation method is robust and works much better than the MLE when there are outliers or the error distribution has heavy tails. In addition, the proposed robust method works comparably to the MLE when there are no outliers and the error is normal. In the second project, we propose a new robust mixture of linear mixed-effects models. The traditional mixture model with multiple linear mixed effects, which assumes Gaussian distributions for the random and error parts, is sensitive to outliers. We propose a mixture of multiple linear mixed t-distributions to robustify the estimation procedure. An EM algorithm is provided to find the MLE under the assumption of t-distributions for the error terms and random mixed effects. Furthermore, we propose to adaptively choose the degrees of freedom for the t-distribution using profile likelihood. In the simulation study, we demonstrate that our proposed model works comparably to the traditional estimation method when there are no outliers and the errors and random mixed effects are normally distributed, but works much better if there are outliers or the distributions of the errors and random mixed effects have heavy tails.
|
33 |
Statistical Methods for Handling Intentional Inaccurate Responders / McQuerry, Kristen J. / 01 January 2016
In self-report data, participants who deliberately provide incorrect responses are known as intentional inaccurate responders. This dissertation provides statistical methods to address intentionally inaccurate responses in the data.
Previous work on adolescent self-report labeled survey participants who intentionally provide inaccurate answers as mischievous responders. This phenomenon also occurs in clinical research. For example, pregnant women who smoke may report that they are nonsmokers. Our advantage is that we do not rely solely on self-report answers and can verify responses with lab values. Currently, there is no clear method for handling these intentional inaccurate responders when making statistical inferences.
We propose using an EM algorithm to account for the intentional behavior while retaining all responses in the data. The performance of this model is evaluated using simulated data and real data. The strengths and weaknesses of the EM algorithm approach are demonstrated.
|
34 |
Robust Diagnostics for the Logistic Regression Model With Incomplete Data / 范少華 / Unknown Date
Atkinson and Riani (2001) apply the forward search algorithm to the detection of multiple outliers in binomial data. In this thesis, we extend the same idea to identify multiple outliers in generalized linear models when part of the data is missing. The algorithm first uses an imputation method to fill in the missing observations, and then uses the forward search algorithm to identify outliers. The proposed method can overcome the masking effect, which commonly occurs when multiple outliers exist in the data. Real data are used to illustrate the procedure, and satisfactory results are obtained.
|
35 |
Modelling human immunodeficiency virus ribonucleic acid levels with finite mixtures for censored longitudinal data / Grün, Bettina; Hornik, Kurt / 01 1900
The measurement of human immunodeficiency virus ribonucleic acid levels over time leads to censored longitudinal data. Suitable models for dynamic modelling of these levels need to take this data characteristic into account. If groups of patients with different developments of the levels over time are suspected, the model class of finite mixtures of mixed effects models with censored data is required. We describe the model specification and derive the estimation with a suitable expectation-maximization algorithm. We propose a convenient implementation using closed form formulae for the expected mean and variance of the truncated multivariate distribution. Only efficient evaluation of the cumulative multivariate normal distribution function is required. Model selection as well as methods for inference are discussed. The application is demonstrated on the clinical trial ACTG 315 data.
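To give a feel for the kind of closed-form quantity an E-step for censored data needs, the sketch below computes the mean and variance of a univariate normal variable known only to lie below a limit. This is a one-dimensional simplification of the truncated multivariate case mentioned in the abstract, and it assumes left-censoring at a detection limit. The function name `left_censored_moments` is hypothetical.

```python
import math


def std_normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def left_censored_moments(mu, sigma, limit):
    """Mean and variance of X ~ N(mu, sigma^2) conditional on X < limit."""
    beta = (limit - mu) / sigma
    ratio = std_normal_pdf(beta) / std_normal_cdf(beta)
    mean = mu - sigma * ratio
    var = sigma ** 2 * (1.0 - beta * ratio - ratio ** 2)
    return mean, var


# A standard normal value known only to be below 0 (a half-normal, mirrored).
mean, var = left_censored_moments(mu=0.0, sigma=1.0, limit=0.0)
print(mean, var)
```

Only the normal density and cumulative distribution function are evaluated, which mirrors the abstract's remark that the cumulative normal distribution function is the one quantity that must be computed efficiently.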
|
36 |
LATENT VARIABLE MODELS GIVEN INCOMPLETELY OBSERVED SURROGATE OUTCOMES AND COVARIATES / Ren, Chunfeng / 01 January 2014
Latent variable models (LVMs) are commonly used when the outcome of main interest is an unobservable measure that is associated with multiple observed surrogate outcomes and affected by potential risk factors. However, statistical methodologies and computational software for efficiently analyzing LVMs when surrogate outcomes and covariates are subject to missingness are lacking. This thesis develops an approach for efficiently handling missing surrogate outcomes and covariates in two- and three-level latent variable models. We analyze two-level LVMs for longitudinal data from the National Growth and Health Study, where surrogate outcomes and covariates are subject to missingness at any of the levels. A conventional method for efficient handling of missing data is to re-express the desired model as a joint distribution of the variables that are subject to missingness, including the surrogate outcomes, conditional on all of the covariates that are completely observed. The joint model is estimated by maximum likelihood, and the estimates are then transformed back to the desired model. In general, however, the joint model identifies more parameters than desired. The over-identified joint model produces biased estimates of the LVMs, so it is necessary to describe how to impose constraints on the joint model so that it has a one-to-one correspondence with the desired model for unbiased estimation. The constrained joint model handles missing data efficiently under the assumption of ignorable missing data and is estimated by a modified application of the expectation-maximization (EM) algorithm.
|
37 |
A Normal-Mixture Model with Random-Effects for RR-Interval Data / Ketchum, Jessica McKinney / 01 January 2006
In many applications of random-effects models to longitudinal data, such as heart rate variability (HRV) data, a normal-mixture distribution seems to be more appropriate than the normal distribution assumption. While the random-effects methodology is well developed for several distributions in the exponential family, the case of the normal-mixture has not been dealt with adequately in the literature. The models and the estimation methods that have been proposed in the past assume the conditional model (fixing the random effects) to be normal and allow a mixture distribution for the random effects (Xu and Hedeker, 2001; Xu, 1995). The methods proposed in this dissertation assume the conditional model to be a normal-mixture while the random effects are assumed to be normal. This is primarily to fit the HRV data, which seem to follow a normal-mixture within subjects. Another advantage of this model is that the estimation becomes much simpler through the use of an EM algorithm. Existing methods and software such as PROC MIXED in SAS are exploited to facilitate the estimation procedure. A simulation study is performed to examine the properties of the random-effects model with normal-mixture distribution and the estimation of the parameters using the EM algorithm. The study shows that the estimates have similar properties to those of the usual normal random-effects models. The between-subject variance parameter seems to require larger numbers of subjects to achieve reasonable accuracy, which is typical in all random-effects models. The HRV data are used to illustrate the random-effects normal-mixture method. These data come from 9 subjects who completed a series of five speech tasks (Cacioppo et al., 2002). For each of the tasks, a series of RR-intervals was collected during baseline, preparation, and delivery periods. Information about their age and gender was also available.
The random-effects mixture model presented in this dissertation treats the subjects as random and models age, gender, task, type, and task × type as fixed effects. The analysis leads to the conclusion that all the fixed effects are statistically significant. The model further indicates that a two-component normal-mixture with the same mixture proportion across individuals fits the data adequately.
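The role of the EM algorithm in fitting a two-component normal mixture can be sketched in the simplest possible case. The example below is an illustration under strong assumptions, not the dissertation's model: it has no random effects or covariates, the data are a made-up list of numbers, and the starting values are chosen crudely from the data range. The function name `em_two_normals` is hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_normals(data, n_iter=100):
    """EM for p*N(mu0, s0^2) + (1-p)*N(mu1, s1^2); returns (p, means, sds)."""
    n = len(data)
    # Crude starting values taken from the range of the data.
    mu0, mu1 = min(data), max(data)
    s0 = s1 = (max(data) - min(data)) / 4.0
    p = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 0.
        resp = []
        for x in data:
            a = p * normal_pdf(x, mu0, s0)
            b = (1.0 - p) * normal_pdf(x, mu1, s1)
            resp.append(a / (a + b))
        # M-step: weighted means, standard deviations, and mixing proportion.
        n0 = sum(resp)
        n1 = n - n0
        mu0 = sum(r * x for r, x in zip(resp, data)) / n0
        mu1 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n1
        s0 = math.sqrt(sum(r * (x - mu0) ** 2 for r, x in zip(resp, data)) / n0)
        s1 = math.sqrt(sum((1.0 - r) * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1)
        p = n0 / n
    return p, (mu0, mu1), (s0, s1)


# Made-up data: 5 values centred at 0 and 10 values centred at 5.
low = [-1.0, -0.5, 0.0, 0.5, 1.0]
high = [4.0, 4.5, 5.0, 5.5, 6.0] * 2
p, means, sds = em_two_normals(low + high)
print(p, means, sds)
```

Each M-step is a set of weighted averages, which is the sense in which estimation "becomes much simpler" once the component memberships are treated as missing data. This bare version has no safeguards against a component collapsing onto a single point, which real data may require.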
|
38 |
Modely pro přežití s možností vyléčení / Cure-rate models / Drabinová, Adéla / January 2016
In this work we deal with survival models in which, with positive probability, some patients never relapse because they are cured. We focus on the two-component mixture model and on a model with biological motivation. For each model, we derive estimates of the probability of cure and of the survival function of the time to relapse of uncured patients by the maximum likelihood method. We further allow both the probability of cure and the survival time to depend on regressors. The models are then compared through a simulation study.
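For orientation, the two model types can be written in their standard textbook forms. The notation here is an assumption rather than the thesis's own: $\pi$ is the probability of cure, $S_u$ the survival function of uncured patients, $F$ a proper distribution function, and $\theta > 0$. Identifying the "model with biological motivation" with the promotion time model is also an assumption, since the abstract does not name it.

```latex
% Two-component mixture cure model:
% a cured fraction pi never relapses, the rest follow S_u.
S_{\mathrm{pop}}(t) = \pi + (1 - \pi)\, S_u(t),
\qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = \pi .

% Promotion time (bounded cumulative hazard) model:
% the cure probability is the limit of the population survival function.
S_{\mathrm{pop}}(t) = \exp\{-\theta F(t)\},
\qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = e^{-\theta} .
```

In both forms the population survival function levels off at a positive value instead of decaying to zero, and that plateau is the cure probability. Regressors can enter through $\pi$ or $\theta$ and through $S_u$ or $F$.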
|
39 |
Robust mixture regression models using t-distribution / Wei, Yan / January 1900
Master of Science / Department of Statistics / Weixin Yao / In this report, we propose a robust mixture of regression based on the t-distribution by extending the mixture of t-distributions proposed by Peel and McLachlan (2000) to the regression setting. This new mixture of regression model is robust to outliers in the y direction but not to outliers at high leverage points. To address this, we also propose a modified version of the method, which fits the mixture of regression based on the t-distribution to the data after adaptively trimming the high leverage points. We further propose to adaptively choose the degrees of freedom for the t-distribution using profile likelihood. The proposed robust mixture regression estimate has high efficiency due to the adaptive choice of degrees of freedom. We demonstrate the effectiveness of the proposed method and compare it with some of the existing methods through a simulation study.
|
40 |
Robust mixture linear EIV regression models by t-distribution / Liu, Yantong / January 1900
Master of Science / Department of Statistics / Weixing Song / A robust estimation procedure for mixture errors-in-variables linear regression models is proposed in this report by assuming that the error terms follow a t-distribution. The estimation procedure is implemented by an EM algorithm based on the fact that the t-distribution is a scale mixture of a normal distribution and a Gamma distribution. The finite sample performance of the proposed algorithm is evaluated by extensive simulation studies. A comparison is also made with the MLE procedure under the normality assumption.
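The scale-mixture fact is what makes the EM algorithm convenient here. If U follows a Gamma distribution with shape ν/2 and rate ν/2, and X given U is normal with mean μ and variance σ²/U, then X follows a t-distribution with ν degrees of freedom, and the E-step reduces to the weight E[U | x] = (ν + 1) / (ν + (x − μ)²/σ²). The sketch below applies this to the simplest case, a single location and scale with fixed ν. It is not the report's mixture errors-in-variables procedure, and the function name `t_location_scale_em` and the data are made up.

```python
import math


def t_location_scale_em(data, nu, n_iter=200):
    """EM for the location and scale of a t-distribution with fixed nu."""
    n = len(data)
    # Start from the ordinary sample mean and variance.
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n
    for _ in range(n_iter):
        # E-step: expected latent Gamma scale for each observation.
        # Points far from mu get a small weight.
        u = [(nu + 1.0) / (nu + (x - mu) ** 2 / sigma2) for x in data]
        # M-step: weighted mean and weighted scale.
        mu = sum(ui * x for ui, x in zip(u, data)) / sum(u)
        sigma2 = sum(ui * (x - mu) ** 2 for ui, x in zip(u, data)) / n
    return mu, math.sqrt(sigma2), u


# Made-up data: 41 values spread evenly over [-2, 2] plus two gross outliers.
bulk = [i / 10.0 for i in range(-20, 21)]
data = bulk + [50.0, 60.0]

sample_mean = sum(data) / len(data)
mu_hat, sigma_hat, weights = t_location_scale_em(data, nu=3.0)
print(sample_mean, mu_hat, weights[-2:])
```

The two outliers pull the ordinary sample mean well away from zero, while their E-step weights shrink toward zero and the t-based location estimate stays near the centre of the bulk. That downweighting is the mechanism behind the robustness claimed for t-based procedures.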
|