Global ETD Search

41	Robust mixture regression modeling with Pearson type VII distribution Zhang, Jingyi January 1900 Master of Science / Department of Statistics / Weixing Song / A robust estimation procedure for parametric regression models is proposed in this paper by assuming the error terms follow a Pearson type VII distribution. The estimation procedure is implemented by an EM algorithm based on the fact that the Pearson type VII distributions are a scale mixture of a normal distribution and a Gamma distribution. A trimmed version of the proposed procedure is also discussed, which can successfully trim high leverage points away from the data. Finite sample performance of the proposed algorithm is evaluated by extensive simulation studies, together with comparisons with other existing procedures in the literature. Mixture model Pearson type VII distribution EM algorithm Robust regression Statistics (0463)
42	Robust mixture regression model fitting by Laplace distribution Xing, Yanru January 1900 Master of Science / Department of Statistics / Weixing Song / A robust estimation procedure for mixture linear regression models is proposed in this report by assuming the error terms follow a Laplace distribution. An EM algorithm is implemented to conduct the estimation procedure with missing information, based on the fact that the Laplace distribution is a scale mixture of a normal and a latent distribution. Finite sample performance of the proposed algorithm is evaluated by extensive simulation studies, together with comparisons with other existing procedures in the literature. A sensitivity study is also conducted based on a real data example to illustrate the application of the proposed method. EM algorithm Laplace distribution Least absolute deviation Mixture regression model Statistics (0463)
43	Minimum Hellinger distance estimation in a semiparametric mixture model Xiang, Sijia January 1900 Master of Science / Department of Statistics / Weixin Yao / In this report, we introduce the minimum Hellinger distance (MHD) estimation method and review its history. We examine the use of Hellinger distance to obtain a new efficient and robust estimator for a class of semiparametric mixture models where one component has known distribution while the other component and the mixing proportion are unknown. Such semiparametric mixture models have been used in biology and the sequential clustering algorithm. Our new estimate is based on the MHD, which has been shown to have good efficiency and robustness properties. We use simulation studies to illustrate the finite sample performance of the proposed estimate and compare it to some other existing approaches. Our empirical studies demonstrate that the proposed minimum Hellinger distance estimator (MHDE) works at least as well as some existing estimators for most of the examples considered and outperforms the existing estimators when the data are under contamination. A real data set application is also provided to illustrate the effectiveness of our proposed methodology. Semiparametric mixture models Minimum Hellinger distance Semiparametric EM algorithm Statistics (0463)
44	Predicting Hearing Loss Using Auditory Steady-State Responses Li, Yiwen 14 January 2009 Auditory Steady-State Response (ASSR) is a promising tool for detecting hearing loss. In this project, we analyzed hearing threshold data obtained from two ASSR methods and a gold standard, pure tone audiometry (PTA), applied to both normal and hearing-impaired subjects. We constructed a repeated measures linear model to identify factors that show significant differences in the mean response. The analysis shows that there are significant differences due to hearing status (normal or impaired) and ASSR method, and that there is a significant interaction between hearing status and test signal frequency. The second task of this project was to predict the PTA threshold (gold standard) from the ASSR-A and ASSR-B thresholds separately at each frequency, in order to measure how accurate the ASSR measurements are and to obtain a "correction function" to correct the bias in the ASSR measurements. We used two approaches. In the first, we modeled the relation of the PTA responses to the ASSR values for the two hearing status groups as a mixture model and tried two prediction methods. The mixture modeling was successful, but the predictions gave disappointing results. A second approach, using logistic regression to predict group membership based on ASSR value and then using those predictions to obtain a predictor of the PTA value, gave successful results. ASSR PTA Mixed Model Repeated Measures Model EM algorithm Mixture Model Deafness Diagnosis Mathematical models
45	A Sensitivity Analysis of a Nonignorable Nonresponse Model Via EM Algorithm and Bootstrap Zong, Yujie 15 April 2011 The Slovenian Public Opinion Survey (SPOS), which was carried out in 1990, was used by the government of Slovenia as a benchmark to prepare for an upcoming plebiscite, which asked the respondents whether they supported independence from Yugoslavia. However, the sample size was large and it is quite likely that the respondents and nonrespondents had divergent viewpoints. We first develop an ignorable nonresponse model which is an extension of a bivariate binomial model. In order to accommodate the nonrespondents, we then develop a nonignorable nonresponse model which is an extension of the ignorable model. Our methodology uses an EM algorithm to fit both the ignorable and nonignorable nonresponse models, and estimation is carried out using the bootstrap mechanism. We also perform a sensitivity analysis to study different degrees of departure of the nonignorable nonresponse model from the ignorable nonresponse model. We found that the nonignorable nonresponse model is mildly sensitive to departures from the ignorable nonresponse model. In fact, our finding based on the nonignorable model is better than an earlier conclusion about another nonignorable nonresponse model fitted to these data. Bivariate binomial distribution Bootstrap EM algorithm Missing not at random Multinomial model 2X2 categorical tables
46	Influência local em modelos geoestatísticos T-Student com aplicações a dados agrícolas / Local influence in geoestatistic T-Student models applied to agricultural data Assumpção, Rosangela Aparecida Botinha 16 December 2010 The presence of inconsistent observations makes it improper to assume a Gaussian process, as is commonly done in the literature. This process should be replaced by models from the class of symmetric distributions, such as the Student-t distribution, which incorporates additional parameters to reduce the influence of inconsistent points. This work assumes that the process follows an n-variate Student-t distribution, whose additional parameter, the degrees of freedom v, is treated as fixed. Under this assumption, the EM algorithm and the NR algorithm were developed for estimating the parameters of the spatial dependence structure and of the spatial linear model. After parameter estimation, two local influence diagnostic techniques were used to assess the quality of the model fit under the assumptions made and the robustness of the estimates when there are perturbations in the model or the data. The first, "usual" technique, already used by several authors, evaluates the likelihood displacement through the log-likelihood function. The second technique, proposed here, analyzes local influence through the Q-displacement of the complete-data likelihood function. These techniques made it possible to examine, by graphical analysis, the influence on the likelihood displacement, the covariance matrix, the linear predictor, and the predicted values. The application of the usual technique and of the proposed one is illustrated through the analysis of both simulated data and real data from agricultural experiments. Geostatistics EM Algorithm Maximum Likelihood
48	Model selection criteria in the presence of missing data based on the Kullback-Leibler discrepancy Sparks, JonDavid 01 December 2009 An important challenge in statistical modeling involves determining an appropriate structural form for a model to be used in making inferences and predictions. Missing data is a very common occurrence in most research settings and can easily complicate the model selection problem. Many useful procedures have been developed to estimate parameters and standard errors in the presence of missing data; however, few methods exist for determining the actual structural form of a model when the data is incomplete. In this dissertation, we propose model selection criteria based on the Kullback-Leibler discrepancy that can be used in the presence of missing data. The criteria are developed by accounting for missing data using principles related to the expectation maximization (EM) algorithm and bootstrap methods. We formulate the criteria for three specific modeling frameworks: the normal multivariate linear regression model, a generalized linear model, and a normal longitudinal regression model. In each framework, a simulation study is presented to investigate the performance of the criteria relative to their traditional counterparts. We consider a setting where the missingness is confined to the outcome, and also a setting where the missingness may occur in the outcome and/or the covariates. The results from the simulation studies indicate that our criteria provide better protection against underfitting than their traditional analogues. We outline the implementation of our methodology for a general discrepancy measure. An application is presented where the proposed criteria are utilized in a study that evaluates the driving performance of individuals with Parkinson's disease under low contrast (fog) conditions in a driving simulator. AIC Bootstrap EM Algorithm Kullback-Leibler discrepancy Missing Data Model Selection Biostatistics
49	Mixtures-of-Regressions with Measurement Error Fang, Xiaoqiong 01 January 2018 Finite mixture models have been studied for a long time; however, traditional methods assume that the variables are measured without error. The mixtures-of-regressions model with measurement error poses challenges to statisticians, since both the mixture structure and the existence of measurement error can lead to inconsistent estimates of the regression coefficients. In order to resolve the inconsistency, we propose a series of methods to estimate the mixture likelihood of the mixtures-of-regressions model when there is measurement error, both in the responses and predictors. Different estimators of the parameters are derived and compared with respect to their relative efficiencies. The simulation results show that the proposed estimation methods work well and improve the estimating process. mixtures-of-regression measurement error EM algorithm Poisson regression Applied Statistics Statistical Models
50	A Flexible Zero-Inflated Poisson Regression Model Roemmele, Eric S. 01 January 2019 A practical problem often encountered with observed count data is the presence of excess zeros. Zero-inflation in count data can easily be handled by zero-inflated models, which are two-component mixtures of a point mass at zero and a discrete distribution for the count data. In the presence of predictors, zero-inflated Poisson (ZIP) regression models are, perhaps, the most commonly used. However, the fully parametric ZIP regression model can sometimes be restrictive, especially with respect to the mixing proportions. Taking inspiration from some of the recent literature on semiparametric mixtures-of-regressions models for flexible mixture modeling, we propose a semiparametric ZIP regression model. We present an "EM-like" algorithm for estimation and a summary of asymptotic properties of the estimators. The proposed semiparametric models are then applied to data sets involving clandestine methamphetamine laboratories and Alzheimer's disease. Bootstrap Count data EM Algorithm zero-inflation semiparametric model Statistical Models Statistical Theory
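The two-component structure described in the abstract above (a point mass at zero mixed with a Poisson count distribution) can be illustrated with a minimal sketch. This is only the standard fully parametric ZIP probability mass function without predictors, not the semiparametric model proposed in the dissertation; the function name `zip_pmf` and the parameter names `pi` (mixing proportion of the zero component) and `lam` (Poisson mean) are hypothetical choices made for illustration.

```python
import math


def zip_pmf(k: int, pi: float, lam: float) -> float:
    """Probability of count k under a zero-inflated Poisson mixture.

    pi  : assumed mixing proportion of the point mass at zero (0 <= pi <= 1)
    lam : assumed mean of the Poisson component (lam > 0)
    """
    if k < 0:
        return 0.0
    # Poisson component: exp(-lam) * lam^k / k!
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    # Point-mass component contributes only at k == 0
    point_mass = 1.0 if k == 0 else 0.0
    return pi * point_mass + (1.0 - pi) * poisson


if __name__ == "__main__":
    # Zeros are more likely than under a plain Poisson with the same mean
    print(zip_pmf(0, pi=0.3, lam=2.0), math.exp(-2.0))
```

In a regression setting, `lam` and `pi` would typically depend on predictors; the sketch keeps both fixed so the mixture structure itself is visible.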
