91

An analysis of primary military occupational specialties on retention and promotion of mid-grade officers in the U.S. Marine Corps

Perry, Tracy A. 03 1900
The purpose of this thesis is to identify and evaluate factors that affect retention and promotion of mid-grade officers in the U.S. Marine Corps. The analysis includes evaluation of survival patterns to ten years of commissioned service (YCS) and promotion patterns to O-4 and O-5. The primary goal is to explain the effect of an officer's primary military occupational specialty (PMOS) on retention and promotion. The Marine Corps Commissioned Officer Accession Career (MCCOAC) data file contains cohort information from FY 1980 through FY 1999 and includes 27,659 observations. Using data from the MCCOAC data file, logistic regression and Cox proportional hazards models are used to estimate the effects of an officer's PMOS on survival and promotion patterns of Marine Corps officers. The findings indicate that an officer's PMOS is significantly associated with whether an officer stays until 10 YCS or is promoted to O-4 or O-5. Logistic regression results show that pilot PMOSs are positively correlated with surviving until 10 YCS, but are negatively correlated with promotion to O-4, when compared to Infantry. The results also find that the remaining PMOSs are negatively correlated with whether an officer survives until 10 YCS, when compared to Infantry. In addition, only three PMOSs (0402, 7202, and 7523) are positively correlated with whether an officer is promoted to O-4 or O-5. Finally, the Cox proportional hazards results show the effect of having a particular PMOS or occupational field on the hazards of separation and promotion.
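A minimal sketch of the two modelling steps this abstract describes, using synthetic data in place of the MCCOAC file and hypothetical column names (pmos, ycs, separated, survived_10ycs): a logistic regression for survival to 10 YCS with Infantry as the reference PMOS, and a Cox proportional hazards model for the hazard of separation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({"pmos": rng.choice(["infantry", "pilot", "logistics"], size=n)})
# Synthetic years of commissioned service (YCS), observation capped at 20 years.
df["ycs"] = rng.exponential(8, size=n).clip(0.5, 20)
df["separated"] = (df["ycs"] < 20).astype(int)
df["survived_10ycs"] = (df["ycs"] >= 10).astype(int)

# Logistic regression for survival to 10 YCS; "infantry" is the dropped
# (reference) category, matching the thesis's comparisons against Infantry.
X = pd.get_dummies(df["pmos"], drop_first=True).astype(float)
logit = sm.Logit(df["survived_10ycs"], sm.add_constant(X)).fit(disp=0)
print(logit.params)

# Cox proportional hazards model for the hazard of separation by PMOS.
cox_df = pd.concat([df[["ycs", "separated"]], X], axis=1)
cph = CoxPHFitter().fit(cox_df, duration_col="ycs", event_col="separated")
cph.print_summary()
```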
92

On optimal allocation problem in multi-group extreme value regression under censoring.

January 2006
Ka Cheuk Yin Timothy.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2006.
Includes bibliographical references (leaves 52-54).
Abstracts in English and Chinese.

Contents:
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Stress Test --- p.1
Chapter 1.2 --- Extreme Value Regression --- p.2
Chapter 1.3 --- Type II Censoring --- p.4
Chapter 1.4 --- Test Plan --- p.5
Chapter 1.5 --- The Scope of the Thesis --- p.6
Chapter 2 --- Extreme Value Regression Model --- p.7
Chapter 2.1 --- Introduction --- p.7
Chapter 2.2 --- Maximum Likelihood Estimation --- p.8
Chapter 2.3 --- Variance-Covariance Matrix --- p.9
Chapter 3 --- Optimality Criteria and Allocation Methods --- p.15
Chapter 3.1 --- Introduction --- p.15
Chapter 3.2 --- Optimality Criteria --- p.16
Chapter 3.3 --- Allocation Methods --- p.17
Chapter 4 --- Asymptotic Results --- p.21
Chapter 4.1 --- Introduction --- p.21
Chapter 4.2 --- Asymptotic Variance-Covariance Matrix --- p.22
Chapter 4.3 --- Optimality Criteria --- p.29
Chapter 5 --- Optimal Allocations --- p.32
Chapter 5.1 --- Introduction --- p.32
Chapter 5.2 --- Allocation for small sample size --- p.33
Chapter 5.2.1 --- 2-stress-level case --- p.33
Chapter 5.2.2 --- 4-stress-level case --- p.34
Chapter 5.2.3 --- Suggested Optimal Allocation --- p.39
Chapter 5.2.4 --- Comparison with the complete sample case --- p.43
Chapter 5.3 --- Asymptotic Allocations --- p.44
Chapter 6 --- Conclusions and Further Research --- p.50
Bibliography --- p.52
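The contents point to maximum likelihood estimation of an extreme value regression under Type II censoring. Below is a rough, self-contained sketch of that model under my own assumptions (two stress levels, only the r smallest log failure times observed at each level), not the thesis's code:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, y, x, delta):
    """theta = (b0, b1, log_sigma); delta = 1 if observed, 0 if censored."""
    b0, b1, log_s = theta
    s = np.exp(log_s)
    w = (y - b0 - b1 * x) / s
    log_f = w - np.exp(w) - np.log(s)   # smallest-extreme-value log density
    log_S = -np.exp(w)                  # log survival function
    return -np.sum(delta * log_f + (1 - delta) * log_S)

rng = np.random.default_rng(1)
x = np.repeat([0.0, 1.0], 50)                               # two stress levels
y = 5.0 - 1.5 * x + 0.8 * np.log(rng.weibull(1, size=100))  # Gumbel-type errors

# Type II censoring: keep the r smallest responses per stress level,
# censor the rest at the r-th order statistic.
r = 35
delta = np.zeros_like(y)
for lvl in np.unique(x):
    idx = np.where(x == lvl)[0]
    order = idx[np.argsort(y[idx])]
    delta[order[:r]] = 1
    y[order[r:]] = y[order[r - 1]]

fit = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], args=(y, x, delta))
print(fit.x)  # estimates of (b0, b1, log sigma)
```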
93

Supervised ridge regression in high dimensional linear regression / 高維線性回歸的監督嶺回歸 / CUHK electronic theses & dissertations collection

January 2013
In the field of statistical learning, we usually have a lot of features to determine the behavior of some response. For example, in gene testing problems we have lots of genes as features, and their relations with certain diseases need to be determined. Without specific knowledge available, the most simple and fundamental way to model this kind of problem would be a linear model. There are many existing methods to solve linear regression, like conventional ordinary least squares, ridge regression and LASSO (least absolute shrinkage and selection operator). Let N denote the number of samples and p the number of predictors. In ordinary settings where we have enough samples (N > p), ordinary linear regression methods like ridge regression will usually give reasonable predictions for future values of the response. In the development of modern statistical learning, we quite often meet high-dimensional problems (N << p), like document classification problems and microarray data testing problems. In high-dimensional problems it is generally quite difficult to identify the relationship between the predictors and the response without any further assumptions. Despite the fact that there are many predictors available for prediction, most of the predictors are actually spurious in a lot of real problems. A predictor being spurious means that it is not directly related to the response. For example, in microarray data testing problems, millions of genes may be available for doing prediction, but only a few hundred genes are actually related to the target disease. Conventional techniques in linear regression like LASSO and ridge regression both have their limitations in high-dimensional problems. The LASSO is one of the state-of-the-art techniques for sparsity recovery, but when applied to high-dimensional problems its performance is degraded a lot by the presence of measurement noise, which results in high-variance predictions and large prediction error. Ridge regression, on the other hand, is more robust to additive measurement noise, but has the obvious limitation of not being able to separate true predictors from spurious ones. As mentioned previously, in many high-dimensional problems a large number of the predictors can be spurious; in these cases ridge's inability to separate spurious from true predictors results in poor interpretability of the model as well as poor prediction performance. The new technique that I propose in this thesis aims to overcome the limitations of these two methods, resulting in more accurate and stable prediction in a high-dimensional linear regression problem with significant measurement noise. The idea is simple: instead of doing a single-step regression, we divide the regression procedure into two steps. In the first step we try to identify the seemingly relevant predictors, and those that are obviously spurious, by calculating the univariate correlations between the predictors and the response. We then discard those predictors that have very small or zero correlation with the response. After the first step we should have obtained a reduced predictor set. In the second step we perform a ridge regression between the reduced predictor set and the response; the result of this ridge regression is our desired output. (A sketch of this two-step procedure appears after the contents list below.) The thesis is organized as follows: first a literature review of the linear regression problem, introducing the ridge and LASSO in detail and explaining more precisely their limitations in high-dimensional problems; then the new method, called supervised ridge regression, with the reasons why it should dominate the ridge and LASSO in high-dimensional problems, supported by simulation results; finally, the possible limitations of the method and possible directions for further investigation.

Zhu, Xiangchen.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves 68-69).
Electronic reproduction. Hong Kong: Chinese University of Hong Kong, [2012]. System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts also in Chinese.

Contents:
Chapter 1 --- BASICS ABOUT LINEAR REGRESSION --- p.2
Chapter 1.1 --- Introduction --- p.2
Chapter 1.2 --- Linear Regression and Least Squares --- p.2
Chapter 1.2.1 --- Standard Notations --- p.2
Chapter 1.2.2 --- Least Squares and Its Geometric Meaning --- p.4
Chapter 2 --- PENALIZED LINEAR REGRESSION --- p.9
Chapter 2.1 --- Introduction --- p.9
Chapter 2.2 --- Deficiency of the Ordinary Least Squares Estimate --- p.9
Chapter 2.3 --- Ridge Regression --- p.12
Chapter 2.3.1 --- Introduction to Ridge Regression --- p.12
Chapter 2.3.2 --- Expected Prediction Error and Noise Variance Decomposition of Ridge Regression --- p.13
Chapter 2.3.3 --- Shrinkage effects on different principal components by ridge regression --- p.18
Chapter 2.4 --- The LASSO --- p.22
Chapter 2.4.1 --- Introduction to the LASSO --- p.22
Chapter 2.4.2 --- The Variable Selection Ability and Geometry of LASSO --- p.25
Chapter 2.4.3 --- Coordinate Descent Algorithm to solve for the LASSO --- p.28
Chapter 3 --- LINEAR REGRESSION IN HIGH-DIMENSIONAL PROBLEMS --- p.31
Chapter 3.1 --- Introduction --- p.31
Chapter 3.2 --- Spurious Predictors and Model Notations for High-dimensional Linear Regression --- p.32
Chapter 3.3 --- Ridge and LASSO in High-dimensional Linear Regression --- p.34
Chapter 4 --- THE SUPERVISED RIDGE REGRESSION --- p.39
Chapter 4.1 --- Introduction --- p.39
Chapter 4.2 --- Definition of Supervised Ridge Regression --- p.39
Chapter 4.3 --- An Underlying Latent Model --- p.43
Chapter 4.4 --- Ridge, LASSO and Supervised Ridge Regression --- p.45
Chapter 4.4.1 --- LASSO vs SRR --- p.45
Chapter 4.4.2 --- Ridge regression vs SRR --- p.46
Chapter 5 --- TESTING AND SIMULATION --- p.49
Chapter 5.1 --- A Simulation Example --- p.49
Chapter 5.2 --- More Experiments --- p.54
Chapter 5.2.1 --- Correlated Spurious and True Predictors --- p.55
Chapter 5.2.2 --- Insufficient Amount of Data Samples --- p.59
Chapter 5.2.3 --- Low Dimensional Problem --- p.62
Chapter 6 --- CONCLUSIONS AND DISCUSSIONS --- p.66
Chapter 6.1 --- Conclusions --- p.66
Chapter 6.2 --- References and Related Works --- p.68
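The promised sketch of the two-step procedure described in the abstract, with assumed values for the screening threshold and ridge penalty (both are tuning choices the thesis treats more carefully):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n, p, k = 100, 5000, 20                    # high-dimensional setting: N << p
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 1.0                             # only the first k predictors are true
y = X @ beta + rng.standard_normal(n)      # additive measurement noise

# Step 1: screen by absolute univariate correlation with the response.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
keep = corr > 0.2                          # screening threshold (a tuning choice)

# Step 2: ridge regression on the reduced predictor set.
srr = Ridge(alpha=1.0).fit(X[:, keep], y)
print(f"kept {keep.sum()} of {p} predictors")
```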
94

Flexible Regression Models for Estimating Interactions between a Treatment and Scalar/Functional Predictors

Park, Hyung January 2018
In this dissertation, we develop regression models for estimating interactions between a treatment variable and a set of baseline predictors in their effect on the outcome in a randomized trial, without restriction to a linear relationship. The proposed semiparametric/nonparametric regression approaches for representing interactions generalize the notion of an interaction between a categorical treatment variable and a set of predictors on the outcome, from a linear model context. In Chapter 2, we develop a model for determining a composite predictor from a set of baseline predictors that can have a nonlinear interaction with the treatment indicator, implying that the treatment efficacy can vary across values of such a predictor without a linearity restriction. We introduce a parsimonious generalization of the single-index models that targets the effect of the interaction between the treatment conditions and the vector of predictors on the outcome. A common approach to interrogate such treatment-by-predictor interaction is to fit a regression curve as a function of the predictors separately for each treatment group. For parsimony and insight, we propose a single-index model with multiple links that estimates a single linear combination of the predictors (i.e., a single index), with treatment-specific nonparametrically-defined link functions. The approach emphasizes a focus on the treatment-by-predictors interaction effects on the treatment outcome that are relevant for making optimal treatment decisions. Asymptotic results for the estimator are obtained under possible model misspecification. A treatment decision rule based on the derived single index is defined, and it is compared to other methods for estimating optimal treatment decision rules. An application to a clinical trial for the treatment of depression is presented to illustrate the proposed approach for deriving treatment decision rules. In Chapter 3, we allow the proposed single-index model with multiple links to have an unspecified main effect of the predictors on the outcome. This extension greatly increases the utility of the proposed regression approach for estimating the treatment-by-predictors interactions. By obviating the need to model the main effect, the proposed method extends the modified covariate approach of [Tian et al., 2014] into a semiparametric regression framework. Also, the approach extends [Tian et al., 2014] to general K treatment arms. In Chapter 4, we introduce a regularization method to deal with the potential high dimensionality of the predictor space and to simultaneously select relevant treatment effect modifiers exhibiting possibly nonlinear associations with the outcome. We present a set of extensive simulations to illustrate the performance of the treatment decision rules estimated from the proposed method. An application to a clinical trial for the treatment of depression is presented to illustrate the proposed approach for deriving treatment decision rules. In Chapter 5, we develop a novel additive regression model for estimating interactions between a treatment and a potentially large number of functional/scalar predictors. If the main effect of baseline predictors is misspecified or high-dimensional (or, infinite dimensional), any standard nonparametric or semiparametric approach for estimating the treatment-by-predictors interactions tends to be unsatisfactory, because it is prone to (possibly severe) inconsistency and poor approximation to the true treatment-by-predictors interaction effect. To deal with this problem, we impose a constraint on the model space, giving orthogonality between the main and the interaction effects. This modeling method is particularly appealing in the functional regression context, since a functional predictor, due to its infinite dimensional nature, must go through some sort of dimension reduction, which essentially involves a main effect model misspecification. The main effect and the interaction effect can be estimated separately due to the orthogonality between the two effects, which side-steps the issue of misspecification of the main effect. The proposed approach extends the modified covariate approach of [Tian et al., 2014] into an additive regression model framework. We impose a concave penalty in estimation, and the method simultaneously selects functional/scalar treatment effect modifiers that exhibit possibly nonlinear interaction effects with the treatment indicator. The dissertation concludes in Chapter 6.
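A toy sketch of the single-index-with-multiple-links idea from Chapter 2: one linear combination of the predictors with a separate link fitted per treatment arm. Cubic polynomial fits stand in for the nonparametric links, and the index is estimated by profiling the links out inside a numerical optimizer; the dissertation's actual estimator and its asymptotics are more refined than this.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, p = 500, 4
X = rng.standard_normal((n, p))
trt = rng.integers(0, 2, size=n)               # two treatment arms
a0 = np.array([1.0, -1.0, 0.5, 0.0])
alpha_true = a0 / np.linalg.norm(a0)           # the true single index
u = X @ alpha_true
y = np.where(trt == 1, np.sin(u), 0.3 * u) + 0.3 * rng.standard_normal(n)

def profile_rss(a):
    a = a / np.linalg.norm(a)                  # index identifiable only up to scale
    u = X @ a
    rss = 0.0
    for arm in (0, 1):                         # treatment-specific link functions
        m = trt == arm
        coef = np.polyfit(u[m], y[m], deg=3)   # cubic fit as a stand-in for a spline
        rss += np.sum((y[m] - np.polyval(coef, u[m])) ** 2)
    return rss

fit = minimize(profile_rss, x0=np.ones(p), method="Nelder-Mead")
alpha_hat = fit.x / np.linalg.norm(fit.x)
print(np.round(alpha_hat, 2))                  # compare with alpha_true (up to sign)
```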
95

Benchmarking non-linear series with quasi-linear regression.

January 2012
For a target socio-economic variable, two sources of data with different collecting frequencies may be available in survey data analysis. In general, due to differences in sample size or data source, the two sets of data do not agree with each other. Usually, the more frequent observations are less reliable, and the less frequent observations are much more accurate. In the benchmarking problem, the less frequent observations are treated as benchmarks and used to adjust the more frequent data. In the common benchmarking setting, the survey error and the target variable are assumed to be independent (additive case). However, in reality, they are often correlated (multiplicative case): the larger the variable, the larger the survey error. To deal with this problem, Chen and Wu (2006) proposed a regression method called quasi-linear regression for the multiplicative case. In this paper, by assuming the survey error to follow an AR(1) model, we demonstrate the benchmarking procedure using a default error model for the quasi-linear regression. An error modelling procedure using the benchmark forecast method is then proposed. Finally, we compare the performance of the default error model with that of the fitted error model. (A toy illustration of the benchmarking setup appears after the contents list below.)

Luk, Wing Pan.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2012.
Includes bibliographical references (leaves 56-57).
Abstracts also in Chinese.

Contents:
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Recent Development For Benchmarking Methods --- p.2
Chapter 1.2 --- Multiplicative Case And Benchmarking Problem --- p.3
Chapter 2 --- Benchmarking With Quasi-linear Regression --- p.8
Chapter 2.1 --- Iterative Procedure For Quasi-linear Regression --- p.9
Chapter 2.2 --- Prediction Using Default Value φ --- p.16
Chapter 2.3 --- Performance Of Using Default Error Model --- p.17
Chapter 3 --- Estimation Of φ Via BM Forecasting Method --- p.26
Chapter 3.1 --- Benchmark Forecasting Method --- p.26
Chapter 3.2 --- Performance Of Benchmark Forecasting Method --- p.28
Chapter 4 --- Benchmarking By The Estimated Value --- p.34
Chapter 4.1 --- Benchmarking With The Estimated Error Model --- p.35
Chapter 4.2 --- Performance Of Using Estimated Error Model --- p.36
Chapter 4.3 --- Suggestions For Selecting Error Model --- p.45
Chapter 5 --- Fitting AR(1) Model For Non-AR(1) Error --- p.47
Chapter 5.1 --- Settings For Non-AR(1) Model --- p.47
Chapter 5.2 --- Simulation Studies --- p.48
Chapter 6 --- An Illustrative Example: The Canada Total Retail Trade Series --- p.50
Chapter 7 --- Conclusion --- p.54
Bibliography --- p.56
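The toy illustration promised above: a quarterly indicator with multiplicative survey error is forced to agree with accurate annual benchmarks by simple pro-rating. This is only the naive baseline that motivates the benchmarking problem, not the quasi-linear regression estimator of Chen and Wu (2006):

```python
import numpy as np

rng = np.random.default_rng(4)
years, per_year = 5, 4
true = 100 + np.cumsum(rng.normal(1.0, 2.0, size=years * per_year))
survey = true * (1 + 0.05 * rng.standard_normal(true.size))  # multiplicative error
benchmarks = true.reshape(years, per_year).sum(axis=1)       # accurate annual totals

# Pro-rate each year's quarterly values so they sum to the benchmark.
adjusted = survey.reshape(years, per_year).copy()
ratios = benchmarks / adjusted.sum(axis=1)                   # one ratio per year
adjusted *= ratios[:, None]
print(np.allclose(adjusted.sum(axis=1), benchmarks))         # True
```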
96

Influence measures for Weibull regression in survival analysis.

January 2003
Tsui Yuen-Yee.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2003.
Includes bibliographical references (leaves 53-56).
Abstracts in English and Chinese.

Contents:
Chapter 1 --- Introduction --- p.1
Chapter 2 --- Parametric Regressions in Survival Analysis --- p.6
Chapter 2.1 --- Introduction --- p.6
Chapter 2.2 --- Exponential Regression --- p.7
Chapter 2.3 --- Weibull Regression --- p.8
Chapter 2.4 --- Maximum Likelihood Method --- p.9
Chapter 2.5 --- Diagnostic --- p.10
Chapter 3 --- Local Influence --- p.13
Chapter 3.1 --- Introduction --- p.13
Chapter 3.2 --- Development --- p.14
Chapter 3.2.1 --- Normal Curvature --- p.14
Chapter 3.2.2 --- Conformal Normal Curvature --- p.15
Chapter 3.2.3 --- Q-displacement Function --- p.16
Chapter 3.3 --- Perturbation Scheme --- p.17
Chapter 4 --- Examples --- p.21
Chapter 4.1 --- Halibut Data --- p.21
Chapter 4.1.1 --- The Data --- p.22
Chapter 4.1.2 --- Initial Analysis --- p.23
Chapter 4.1.3 --- Perturbations of σ around 1 --- p.23
Chapter 4.2 --- Diabetic Data --- p.30
Chapter 4.2.1 --- The Data --- p.30
Chapter 4.2.2 --- Initial Analysis --- p.31
Chapter 4.2.3 --- Perturbations of σ around σ --- p.31
Chapter 5 --- Conclusion Remarks and Further Research Topic --- p.35
Appendix A --- p.38
Appendix B --- p.47
Bibliography --- p.53
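The thesis develops local influence via curvature of a perturbed likelihood. As a simpler stand-in for that machinery, the sketch below computes the case-deletion analogue for a Weibull regression, refitting a Weibull AFT model (via the lifelines package) with each observation removed and tracking the change in the covariate effect:

```python
import numpy as np
import pandas as pd
from lifelines import WeibullAFTFitter

rng = np.random.default_rng(5)
n = 80
df = pd.DataFrame({"x": rng.standard_normal(n)})
df["T"] = rng.weibull(1.5, size=n) * np.exp(0.5 * df["x"])  # Weibull survival times
df["event"] = 1                                             # no censoring, for simplicity

full = WeibullAFTFitter().fit(df, duration_col="T", event_col="event")
beta_full = full.params_.loc[("lambda_", "x")]

# Case-deletion influence: change in the coefficient when case i is dropped.
influence = []
for i in range(n):
    fit_i = WeibullAFTFitter().fit(df.drop(index=i), duration_col="T", event_col="event")
    influence.append(beta_full - fit_i.params_.loc[("lambda_", "x")])
print(pd.Series(influence).abs().nlargest(5))  # the most influential cases
```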
97

Personality and the prediction of work performance: artificial neural networks versus linear regression

Minbashian, Amirali, Psychology, Faculty of Science, UNSW January 2006
Previous research that has evaluated the effectiveness of personality variables for predicting work performance has predominantly relied on methods designed to detect simple relationships. The research reported in this thesis employed artificial neural networks, a method that is capable of capturing complex nonlinear and configural relationships among variables, and the findings were compared to those obtained by the more traditional method of linear regression. Six datasets that comprise a range of occupations, personality inventories, and work performance measures were used as the basis of the analyses. A series of studies were conducted to compare the predictive performance of prediction equations that a) were developed using either artificial neural networks or linear regression, and b) differed with respect to the type and number of personality variables that were used as predictors of work performance. Studies 1 and 2 compared the two methods using individual personality variables that assess the broad constructs of the five-factor model of personality. Studies 3 and 4 used combinations of these broad variables as the predictors. Study 5 employed narrow personality variables that assess specific facets of the broad constructs. Additional methodological contributions include the use of a resampling procedure, the use of multiple measures of predictive performance, and the comparison of two procedures for developing neural networks. Across the studies, it was generally found that the neural networks were rarely able to outperform the simpler linear regression equations, and this was attributed to the lack of reliable nonlinearity and configurality in personality-work performance relationships. However, the neural networks were able to outperform linear regression in the few instances where there was some independent evidence of nonlinear or configural relationships. Consequently, although the findings do not support the usefulness of neural networks for specifically improving the effectiveness of personality variables as predictors of work performance, in a broader sense they provide some grounds for optimism for organisational researchers interested in applying this method to investigate and exploit complex relationships among variables.
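A compact sketch of the kind of comparison the study performs, assuming generic synthetic predictors in place of actual personality scores: a small neural network versus linear regression, scored out of sample over repeated train/test splits as a simple stand-in for the thesis's resampling procedure.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Five predictors standing in for Big Five scores; noisy linear ground truth.
X, y = make_regression(n_samples=400, n_features=5, noise=20.0, random_state=0)

scores = {"linear": [], "ann": []}
for seed in range(20):                               # repeated resampling
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=seed)
    scores["linear"].append(LinearRegression().fit(Xtr, ytr).score(Xte, yte))
    ann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=seed)
    scores["ann"].append(ann.fit(Xtr, ytr).score(Xte, yte))

for name, vals in scores.items():
    print(name, round(np.mean(vals), 3))             # mean out-of-sample R^2
```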
98

Scale parameter modelling of the t-distribution

Taylor, Julian January 2005
This thesis considers location and scale parameter modelling of the heteroscedastic t-distribution. This new distribution is an extension of the heteroscedastic Gaussian and provides robust analysis in the presence of outliers, as well as accommodating possible heteroscedasticity by flexibly modelling the scale parameter using covariates existing in the data. To motivate components of work in this thesis, the Gaussian linear mixed model is reviewed. The mixed model equations are derived for the location fixed and random effects, and this model is then used to introduce Restricted Maximum Likelihood (REML). From this, an algorithmic scheme to estimate the scale parameters is developed. A review of location and scale parameter modelling of the heteroscedastic Gaussian distribution is presented. In this thesis, the scale parameters are restricted to be a function of covariates existing in the data. Maximum Likelihood (ML) and REML estimation of the location and scale parameters is derived, and an efficient computational algorithm and software are presented. The Gaussian model is then extended by considering the heteroscedastic t-distribution. Initially, the heteroscedastic t is restricted to known degrees of freedom. Scoring equations for the location and scale parameters are derived, and their intimate connection to the prediction of the random scale effects is discussed. Tools for detecting and testing heteroscedasticity are also derived, and a computational algorithm is presented. A mini software package "hett" using this algorithm is also discussed. To derive a REML equivalent for the heteroscedastic t, asymptotic likelihood theory is discussed. In this thesis an integral approximation, the Laplace approximation, is presented, and two examples, with the inclusion of ML for the heteroscedastic t, are discussed. A new approximate integral technique called Partial Laplace is also discussed and is exemplified with linear mixed models. Approximate marginal likelihood techniques using Modified Profile Likelihood (MPL), Conditional Profile Likelihood (CPL) and Stably Adjusted Profile Likelihood (SAPL) are also presented and offer an alternative to the approximate integration techniques. The asymptotic techniques are then applied to the heteroscedastic t with known degrees of freedom to form two distinct REMLs for the scale parameters. The first approximation uses the Partial Laplace approximation to form a REML for the scale parameters, whereas the second uses the approximate marginal likelihood technique MPL. For each, the estimation of the location and scale parameters is discussed and computational algorithms are presented. For comparison, the heteroscedastic t for known degrees of freedom using ML and the two new REML equivalents are illustrated with an example and a comparative simulation study. The model is then extended to incorporate the estimation of the degrees of freedom parameter. The estimating equations for the location and scale parameters under ML are preserved, and the estimation of the degrees of freedom parameter is integrated into the algorithm. The approximate REML techniques are also extended. For the Partial Laplace approximation, the degrees of freedom parameter is estimated simultaneously with the scale parameters, and therefore the algorithm differs only slightly. The second approximation uses SAPL to estimate the parameters and produces approximate marginal likelihoods for the location, scale and degrees of freedom parameters. Computational algorithms for each of the techniques are also presented. Several extensive examples, as well as a comparative simulation study, are used to illustrate ML and the two REML equivalents for the heteroscedastic t with unknown degrees of freedom. The thesis concludes with a discussion of the new techniques derived for the heteroscedastic t-distribution, along with their advantages and disadvantages. Topics of further research are also discussed. / Thesis (Ph.D.)--School of Agriculture and Wine, 2005.
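A bare-bones sketch of ML estimation for the heteroscedastic t described above, with the location linear in one covariate, the log scale linear in another, and the degrees of freedom held fixed; the REML variants and degrees-of-freedom estimation developed in the thesis are not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import t as t_dist

rng = np.random.default_rng(6)
n = 300
x = rng.standard_normal(n)                    # location covariate
z = rng.standard_normal(n)                    # scale covariate
sigma = np.exp(0.2 + 0.5 * z)                 # heteroscedastic scale
y = 1.0 + 2.0 * x + sigma * rng.standard_t(df=5, size=n)

def neg_loglik(theta, df=5.0):
    """theta = (location intercept/slope, log-scale intercept/slope)."""
    b0, b1, g0, g1 = theta
    s = np.exp(g0 + g1 * z)
    return -np.sum(t_dist.logpdf((y - b0 - b1 * x) / s, df) - np.log(s))

fit = minimize(neg_loglik, x0=np.zeros(4), method="BFGS")
print(np.round(fit.x, 2))  # estimates of (b0, b1, g0, g1)
```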
99

Application and computation of likelihood methods for regression with measurement error

Higdon, Roger 23 September 1998
This thesis advocates the use of maximum likelihood analysis for generalized regression models with measurement error in a single explanatory variable. This will be done first by presenting a computational algorithm and the numerical details for carrying out this algorithm on a wide variety of models. The computational methods will be based on the EM algorithm in conjunction with the use of Gauss-Hermite quadrature to approximate integrals in the E-step. Second, this thesis will demonstrate the relative superiority of likelihood-ratio tests and confidence intervals over those based on asymptotic normality of estimates and standard errors, and that likelihood methods may be more robust in these situations than previously thought. The ability to carry out likelihood analysis under a wide range of distributional assumptions, along with the advantages of likelihood ratio inference and the encouraging robustness results make likelihood analysis a practical option worth considering in regression problems with explanatory variable measurement error. / Graduation date: 1999
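A small sketch of the computational core mentioned above: Gauss-Hermite quadrature for a normal-mixing integral of the kind that appears in the E-step when the true covariate is unobserved. The helper name gh_expectation is my own.

```python
import numpy as np

# Gauss-Hermite nodes and weights for integrals of exp(-t^2) * h(t).
nodes, weights = np.polynomial.hermite.hermgauss(20)

def gh_expectation(g, mu, s):
    """Approximate E[g(U)] for U ~ N(mu, s^2) by Gauss-Hermite quadrature."""
    # Change of variables u = mu + sqrt(2)*s*t turns the normal expectation
    # into the Gauss-Hermite form, up to a factor 1/sqrt(pi).
    u = mu + np.sqrt(2.0) * s * nodes
    return np.sum(weights * g(u)) / np.sqrt(np.pi)

# Check: E[exp(U)] for U ~ N(0, 1) equals exp(1/2).
print(gh_expectation(np.exp, 0.0, 1.0), np.exp(0.5))
```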
100

Diagnostic tools for overdispersion in generalized linear models

Ganio-Gibbons, Lisa M. 18 August 1989
Data in the form of counts or proportions often exhibit more variability than that predicted by a Poisson or binomial distribution. Many different models have been proposed to account for extra-Poisson or extra-binomial variation. A simple model includes a single heterogeneity factor (dispersion parameter) in the variance. Other models that allow the dispersion parameter to vary between groups or according to a continuous covariate also exist but require a more complicated analysis. This thesis is concerned with (1) understanding the consequences of using an oversimplified model for overdispersion, (2) presenting diagnostic tools for detecting the dependence of overdispersion on covariates in regression settings for counts and proportions and (3) presenting diagnostic tools for distinguishing between some commonly used models for overdispersed data. The double exponential family of distributions is used as a foundation for this work. A double binomial or double Poisson density is constructed from a binomial or Poisson density and an additional dispersion parameter. This provides a completely parametric framework for modeling overdispersed counts and proportions. The first issue above is addressed by exploring the properties of maximum likelihood estimates obtained from incorrectly specified likelihoods. The diagnostic tools are based on a score test in the double exponential family. An attractive feature of this test is that it can be computed from the components of the deviance in the standard generalized linear model fit. A graphical display is suggested by the score test. For the normal linear model, which is a special case of the double exponential family, the diagnostics reduce to those for heteroscedasticity presented by Cook and Weisberg (1983). / Graduation date: 1990
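In the spirit of the diagnostics described above, the sketch below fits a Poisson GLM to overdispersed counts and computes a score-type statistic for overdispersion from the fitted means and residuals. This is the standard Dean-Lawless form, not necessarily the thesis's double-exponential-family statistic, which the abstract says can be computed from the deviance components of the GLM fit.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500
x = rng.standard_normal(n)
mu = np.exp(0.5 + 0.3 * x)
# Negative binomial counts: mean mu, variance larger than mu (overdispersed).
y = rng.negative_binomial(n=5, p=5 / (5 + mu))

fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit()
mu_hat = fit.mu

# Score statistic for extra-Poisson variation; compare with N(0, 1).
score = np.sum((y - mu_hat) ** 2 - y) / np.sqrt(2 * np.sum(mu_hat ** 2))
print(score)  # large positive values indicate overdispersion
```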
