Global ETD Search

1	A score test of homogeneity in generalized additive models for zero-inflated count data Nian, Gaowei January 1900 Master of Science / Department of Statistics / Wei-Wen Hsu / Zero-Inflated Poisson (ZIP) models are often used to analyze count data with excess zeros. In the ZIP model, the Poisson mean and the mixing weight are often assumed to depend on covariates through regression techniques. In other words, the effect of covariates on the Poisson mean or the mixing weight is specified using a proper link function coupled with a linear predictor, which is simply a linear combination of unknown regression coefficients and covariates. However, in practice, this predictor may not be linear in the regression parameters but curvilinear or nonlinear. In such situations, a more general and flexible approach should be considered. One popular method in the literature is the Zero-Inflated Generalized Additive Model (ZIGAM), which extends zero-inflated models to incorporate Generalized Additive Models (GAM). These models can accommodate a nonlinear predictor in the link function. For ZIGAM, it is also of interest to conduct inference on the mixing weight, particularly evaluating whether the mixing weight equals zero. Many methodologies have been proposed to examine this question, but all of them were developed under classical zero-inflated models rather than ZIGAM. In this report, we propose a generalized score test to evaluate whether the mixing weight is equal to zero under the ZIGAM framework with a Poisson model. Technically, the proposed score test is developed based on a novel transformation of the mixing weight coupled with proportional constraints on ZIGAM, which assume that the smooth components of covariates in the Poisson mean and the mixing weight are proportional. 
An intensive simulation study indicates that the proposed score test outperforms the existing tests when the mixing weight and the Poisson mean truly involve a nonlinear predictor. Recreational fisheries data from the Marine Recreational Information Program (MRIP) survey conducted by the National Oceanic and Atmospheric Administration (NOAA) are used to illustrate the proposed methodology. Score test Statistics (0463)
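As a point of reference for the classical setup this abstract starts from, here is a minimal Python sketch of the ZIP probability function with a log link for the Poisson mean and a logit link for the mixing weight, both driven by linear predictors. The function names (`zip_pmf`, `linear_predictor`, `zip_loglik`) and the argument layout are assumptions made for illustration; none of this comes from the report, and it does not implement ZIGAM smooth terms or the proposed score test.

```python
import math


def zip_pmf(y, lam, omega):
    """P(Y = y) for a zero-inflated Poisson with mean lam > 0 and mixing weight omega."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    if y == 0:
        # Zeros come from two sources: the point mass at zero and the Poisson part.
        return omega + (1.0 - omega) * poisson
    return (1.0 - omega) * poisson


def linear_predictor(coefs, covariates):
    """Linear combination of regression coefficients and covariates."""
    return sum(c * x for c, x in zip(coefs, covariates))


def zip_loglik(beta, gamma, X, Z, y):
    """Log-likelihood of a ZIP regression (hypothetical helper).

    beta, X: coefficients and covariate rows for the Poisson mean (log link).
    gamma, Z: coefficients and covariate rows for the mixing weight (logit link).
    """
    total = 0.0
    for x_i, z_i, y_i in zip(X, Z, y):
        lam = math.exp(linear_predictor(beta, x_i))
        omega = 1.0 / (1.0 + math.exp(-linear_predictor(gamma, z_i)))
        total += math.log(zip_pmf(y_i, lam, omega))
    return total
```

Setting `omega` to zero recovers the ordinary Poisson probabilities, which is the homogeneity hypothesis the score test examines.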
2	Secondary Analysis of Case-Control Studies in Genomic Contexts Wei, Jiawei 2010 August 1900 This dissertation consists of five independent projects. In each project, a novel statistical method was developed to address a practical problem encountered in genomic contexts. For example, we considered testing for constant nonparametric effects in a general semiparametric regression model in genetic epidemiology; analyzed the relationship between covariates in the secondary analysis of case-control data; performed model selection in joint modeling of paired functional data; and assessed the prediction ability of genes in gene expression data generated by the CodeLink System from GE. In the first project in Chapter II we considered the problem of testing for constant nonparametric effects in a general semiparametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. We derived a generalized likelihood ratio test for this hypothesis, showed how to implement it, and gave evidence that it can improve statistical power when compared to standard partially linear models. The second project in Chapter III addressed the issue of score testing for the independence of X and Y in the secondary analysis of case-control data. The semiparametric efficient approaches can be used to construct semiparametric score tests, but they suffer from a lack of robustness to the assumed model for Y given X. We showed how to adjust the semiparametric score test to make its level/Type I error correct even if the assumed model for Y given X is incorrect, and thus the test is robust. The third project in Chapter IV took up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model. We showed how to estimate the regression parameters in a rare disease case even if the assumed model for Y given X is incorrect, and thus the estimates are model-robust. 
In the fourth project in Chapter V we developed novel AIC and BIC-type methods for estimating the smoothing parameters in a joint model of paired, hierarchical sparse functional data, and showed in our numerical work that they are many times faster than 10-fold crossvalidation while at the same time giving results that are remarkably close to the crossvalidated estimates. In the fifth project in Chapter VI we introduced a practical permutation test that uses cross-validated genetic predictors to determine if the list of genes in question has “good” prediction ability. It avoids overfitting by using cross-validation to derive the genetic predictor and determines if the count of genes that give “good” prediction could have been obtained by chance. This test was then used to explore gene expression of colonic tissue and exfoliated colonocytes in the fecal stream to discover similarities between the two. Semiparametric Regression Case-Control Score Test Model Selection Classification
3	The impact of misspecification of nuisance parameters on test for homogeneity in zero-inflated Poisson model: a simulation study Gao, Siyu January 1900 Master of Science / Department of Statistics / Wei-Wen Hsu / The zero-inflated Poisson (ZIP) model consists of a Poisson model and a degenerate distribution at zero. Under this model, zero counts are generated from two sources, representing heterogeneity in the population. In practice, it is often of interest to evaluate whether this heterogeneity is consistent with the observed data. Most existing methodologies for examining this heterogeneity assume that the Poisson mean is a function of nuisance parameters, which are simply the coefficients associated with covariates. However, these nuisance parameters can be misspecified when these methodologies are applied. As a result, the validity and the power of the test may be affected. The impact of such misspecification has not been discussed in the literature. This report primarily investigates the impact of misspecification on the performance of the score test for homogeneity in ZIP models. Through an intensive simulation study, we find that: 1) under misspecification, the limiting distribution of the score test statistic under the null no longer follows a chi-squared distribution, and a parametric bootstrap is suggested for finding the true null limiting distribution of the score test statistic; 2) the power of the test decreases as the number of covariates in the Poisson mean increases. The test with a constant Poisson mean has the highest power, even compared to the test with a well-specified mean. Finally, the findings are applied to the Wuhan Inpatient Care Insurance data, which contain excess zeros. zero-inflated Poisson model score test misspecification nuisance parameter Statistics (0463)
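For the constant-mean case mentioned in this abstract, the score test for zero inflation has a well-known closed form (commonly attributed to van den Broek, 1995), and its null distribution can be approximated by a parametric bootstrap. The sketch below is an illustration under those assumptions, not the report's own procedure: it ignores covariates entirely, and the function names, the number of bootstrap replicates, and the convention of returning 0 for an all-zero sample are choices made here.

```python
import math
import random


def zip_score_statistic(y):
    """Score statistic for H0: no zero inflation, with a constant Poisson mean."""
    n = len(y)
    ybar = sum(y) / n
    if ybar == 0:
        return 0.0  # all-zero sample: statistic undefined, 0 is a convention chosen here
    p0 = math.exp(-ybar)                  # fitted Poisson probability of a zero
    n0 = sum(1 for v in y if v == 0)      # observed number of zeros
    return (n0 - n * p0) ** 2 / (n * p0 * (1 - p0) - n * ybar * p0 ** 2)


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def bootstrap_p_value(y, n_boot=999, seed=0):
    """Parametric bootstrap p-value: resample from the fitted null Poisson model."""
    rng = random.Random(seed)
    n = len(y)
    lam_hat = sum(y) / n
    observed = zip_score_statistic(y)
    exceed = 0
    for _ in range(n_boot):
        sample = [poisson_draw(lam_hat, rng) for _ in range(n)]
        if zip_score_statistic(sample) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)
```

With a correctly specified model the statistic is usually compared against a chi-squared distribution with one degree of freedom; the bootstrap version replaces that reference distribution with one simulated from the fitted null model.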
4	Score Test and Likelihood Ratio Test for Zero-Inflated Binomial Distribution and Geometric Distribution Dai, Xiaogang 01 April 2018 The main purpose of this thesis is to compare the performance of the score test and the likelihood ratio test by computing type I and type II error rates when the tests are applied to the geometric distribution and the inflated binomial distribution. We first derive the test statistics of the score test and the likelihood ratio test for both distributions. We then use the software package R to perform a simulation to study the behavior of the two tests. We write R code to calculate the two types of error for each distribution. We generate many samples to approximate the probabilities of type I and type II errors while varying the parameter values. In the first chapter, we discuss the motivation behind the work presented in this thesis. We also introduce the definitions used throughout the thesis. In the second chapter, we derive test statistics for the likelihood ratio test and the score test for the geometric distribution. For the score test, we consider both the observed information matrix and the expected information matrix, and obtain the score test statistics zO and zI. Chapter 3 discusses the likelihood ratio test and the score test for the inflated binomial distribution. The main parameter of interest is w, so p is a nuisance parameter in this case. We derive the likelihood ratio test statistics and the score test statistics to test w. In both tests, the nuisance parameter p is estimated using the maximum likelihood estimator p̂. We also consider the score test using both the observed and the expected information matrices. Chapter 4 focuses on the score test in the inflated binomial distribution. We generate data following the zero-inflated binomial distribution using R. 
We plot the ratio of the two score test statistics for the sample data, zI/zO, against different values of n0, the number of zero values in the sample. In chapter 5, we discuss and compare the use of the score test with the two types of information matrices. We perform a simulation study to estimate the two types of errors when applying the test to the geometric distribution and the inflated binomial distribution. We plot the percentage of the two errors while fixing different parameters, such as the probability p and the number of trials m. Finally, we conclude by briefly summarizing the results in chapter 6. Rao's score test type I error type II error Applied Statistics Other Applied Mathematics Probability
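The kind of simulation described in this abstract can be sketched in a few lines. The thesis uses R; the Python below is only an illustration, and the statistic is not taken from the thesis. It is the efficient-score form with expected information that follows from a standard calculation for H0: w = 0 with p estimated by ȳ/m, namely S = ((n0 − n·q0)/q0)² / (n·[(1 − q0)/q0 − m·p̂/(1 − p̂)]) with q0 = (1 − p̂)^m. The function names, the 5% chi-squared(1) cutoff, and the conventions for degenerate samples are assumptions made here.

```python
import random

CHI2_1_CRIT_05 = 3.841458820694124  # 0.95 quantile of the chi-squared(1) distribution


def zib_score_statistic(y, m):
    """Score statistic (expected information) for H0: w = 0 in a zero-inflated binomial."""
    n = len(y)
    p_hat = sum(y) / (n * m)             # MLE of p under the null
    if m < 2 or p_hat == 0.0 or p_hat == 1.0:
        return 0.0  # degenerate cases (m = 1 is not identifiable); convention chosen here
    q0 = (1 - p_hat) ** m                # fitted binomial probability of a zero
    if q0 == 0.0:
        return 0.0  # numerical underflow guard
    n0 = sum(1 for v in y if v == 0)
    score = (n0 - n * q0) / q0
    info = n * ((1 - q0) / q0 - m * p_hat / (1 - p_hat))
    return score ** 2 / info


def draw_zib(n, m, p, w, rng):
    """Draw n observations: zero with probability w, otherwise Binomial(m, p)."""
    sample = []
    for _ in range(n):
        if rng.random() < w:
            sample.append(0)
        else:
            sample.append(sum(1 for _ in range(m) if rng.random() < p))
    return sample


def rejection_rate(n, m, p, w, n_sim=2000, seed=0):
    """Fraction of simulated samples in which the test rejects at the 5% level."""
    rng = random.Random(seed)
    rejections = sum(
        zib_score_statistic(draw_zib(n, m, p, w, rng), m) > CHI2_1_CRIT_05
        for _ in range(n_sim)
    )
    return rejections / n_sim
```

Calling `rejection_rate` with `w=0` estimates the type I error rate; with `w > 0` it estimates power, so one minus the result estimates the type II error rate.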
5	The Box-Cox Transformation: A Review 曾能芳, Zeng, Neng-Fang Unknown Date The use of a transformation can often simplify the analysis of data, especially when the original observations deviate from the underlying assumptions of the linear model. The Box-Cox transformation has received much more attention than others. In this dissertation, we review the theory of estimation and hypothesis testing for the transformation parameter, and of the sensitivity of the linear model parameters under the Box-Cox transformation. Monte Carlo simulation is used to study the performance of the transformations. We also examine whether the Box-Cox transformation actually makes the transformed observations satisfy the assumptions of the linear model. Statistics POWER TRANSFORMATION NORMALIZED TRANSFORMATION CONSTRUCTED VARIABLE SCORE TEST ANDREWS'S EXACT TEST STATISTICS SENSITIVITY
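To make the transformation and the estimation of its parameter concrete, here is a minimal Python sketch of the Box-Cox family and the profile log-likelihood for λ in the simplest case, an intercept-only normal model (no regressors). The function names and the grid search are assumptions for illustration; the dissertation's own estimators and tests are not reproduced here.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of positive data: (y^lam - 1) / lam, or log(y) when lam = 0."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def profile_loglik(y, lam):
    """Profile log-likelihood of lam (up to an additive constant), intercept-only model."""
    n = len(y)
    z = box_cox(y, lam)
    mean = sum(z) / n
    sigma2 = sum((v - mean) ** 2 for v in z) / n   # MLE of the error variance
    jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(sigma2) + jacobian


def best_lambda(y, grid):
    """Grid value of lam that maximizes the profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(y, lam))
```

For example, `best_lambda(y, [k / 4 for k in range(-8, 9)])` searches λ from -2 to 2 in steps of 0.25; a finer grid or a numerical optimizer could be substituted.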
6	Tests of random effects in linear and non-linear models Häggström Lundevaller, Erling January 2002 No description available. Random effects score test NLS heteroskedasticity measurement error Poisson regression trip frequencies
7	Generalized score tests for missing covariate data Jin, Lei 15 May 2009 In this dissertation, generalized score tests based on weighted estimating equations are proposed for missing covariate data. Their properties, including the effects of nuisance functions on the forms of the test statistics and the efficiency of the tests, are investigated. Different versions of the test statistic are defined for various parametric and semiparametric settings. Their asymptotic distributions are also derived. It is shown that when the models for the nuisance functions are correct, appropriate test statistics can be obtained by plugging the estimates of the nuisance functions into the test statistic for the case in which the nuisance functions are known. Furthermore, the optimal test is obtained using the relative efficiency measure. As an application of the proposed tests, a formal model validation procedure is developed for generalized linear models in the presence of missing covariates. The asymptotic distribution of the data-driven methods is provided. A simulation study in both linear and logistic regressions illustrates the applicability and the finite sample performance of the methodology. Our methods are also employed to analyze a coronary artery disease diagnostic dataset.
8	Goodness-of-Fit Test Issues in Generalized Linear Mixed Models Chen, Nai-Wei 2011 December 1900 Linear mixed models and generalized linear mixed models are random-effects models widely applied to analyze clustered or hierarchical data. In the context of mixed models, random effects are often assumed to be normally distributed. However, in the mixed-effects logistic model, violation of the assumption of normally distributed random effects may result in inconsistent estimates of some fixed effects and of the variance component of the random effects when the variance of the random-effects distribution is large. On the other hand, summary statistics used for assessing goodness of fit in ordinary logistic regression models may not be directly applicable to mixed-effects logistic models. In this dissertation, we present our investigations of two independent studies related to goodness-of-fit tests in generalized linear mixed models. First, we consider a semi-nonparametric density representation for the random-effects distribution and provide a formal statistical test of normality of the random-effects distribution in mixed-effects logistic models. We obtain estimates of parameters by using a non-likelihood-based estimation procedure. Additionally, we not only evaluate the type I error rate of the proposed test statistic through asymptotic results, but also carry out a bootstrap hypothesis testing procedure to control the inflation of the type I error rate and to study the power performance of the proposed test statistic. Further, the methodology is illustrated by revisiting a case study in mental health. Second, to improve assessment of model fit in mixed-effects logistic models, we apply the nonparametric local polynomial smoothed residuals over within-cluster continuous covariates to the unweighted sum of squares statistic for assessing the goodness of fit of logistic multilevel models. 
We perform a simulation study to evaluate the type I error rate and the power performance for detecting a missing quadratic or interaction term of fixed effects using the kernel smoothed unweighted sum of squares statistic based on the local polynomial smoothed residuals over x-space. We also use a real data set in clinical trials to illustrate this application. Generalized linear mixed model generalized estimating equations robust score test parametric bootstrap.
9	Identifying Influential Observations in Nonlinear Regression : a focus on parameter estimates and the score test Stål, Karin January 2015 This thesis contributes to influence analysis in nonlinear regression and in particular the detection of influential observations. The focus is on a regression model with a known mean function, which is nonlinear in its parameters and where the function is chosen according to the knowledge about the process generating the data. The error term in the regression model is assumed to be additive. The main goal of this thesis is to work out diagnostic measures for assessing the influence of observations on various results from a nonlinear regression analysis. The obtained results comprise diagnostic tools for detecting observations that, individually or jointly with some other observations, are influential on the parameter estimates. Moreover, assessing conditional influence, i.e. the influence of an observation conditional on the deletion of another observation, is of interest. This can help to identify influential observations which could be missed due to complex relationships among the observations. Novelties of the proposed diagnostic tools include the possibility to assess the influence of observations on a specific parameter estimate and to assess the influence of multiple observations. A further emphasis of this thesis is on the observations' influence on the outcome of a hypothesis testing procedure based on Rao's score test. An innovative solution obtained in this thesis to the problem of visually identifying observations that are influential on the score test statistic is the so-called added parameter plot. As a complement to the added parameter plot, new diagnostic measures are derived for assessing the influence of single and multiple observations on the score test statistic. Added parameter plot differentiation approach influential observation nonlinear regression score test
10	New Score Tests for Genetic Linkage Analysis in a Likelihood Framework Song, Yeunjoo E. 12 March 2013 No description available. Bioinformatics Biostatistics Epidemiology Genetics Statistics genetic linkage analysis score test likelihood pedigree information correlated data

Search results