  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Two-Sample Testing of High-Dimensional Covariance Matrices

Sun, Nan, 0000-0003-0278-5254 January 2021 (has links)
Testing the equality of two high-dimensional covariance matrices is challenging. As the most efficient way to measure evidential discrepancies in observed data, the likelihood ratio test is expected to be powerful when the null hypothesis is violated. However, when the data dimensionality becomes large and potentially exceeds the sample size by a substantial margin, likelihood ratio-based approaches face practical and theoretical challenges. To solve this problem, this study proposes a method in which we first randomly project the original high-dimensional data into a lower-dimensional space, and then apply corrected likelihood ratio tests developed with random matrix theory. We show that testing with a single random projection is consistent under the null hypothesis. Through evaluating the power function, which is challenging in this context, we provide evidence that the test based on a single random projection matrix with a reasonable number of columns is more powerful when the two covariance matrices are unequal but the component-wise discrepancies may be small (a weak, dense signal setting). To use the data more efficiently, we propose combined tests built from multiple random projections in the style of meta-analysis. We establish the foundation of the combined tests through our theoretical analysis showing that the p-values from multiple random projections are asymptotically independent in this high-dimensional covariance testing problem. We then show that the combined tests from multiple random projections are consistent under the null hypothesis. In addition, our theory identifies the merit of certain meta-analysis approaches over testing with a single random projection. Numerical evaluation of the power function of the combined tests is also provided, building on the numerical evaluation for testing with a single random projection.
Extensive simulations and two real genetic data analyses confirm the merits and potential applications of our test. / Statistics
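The combined tests rest on the asymptotic independence of the p-values obtained from different random projections. A minimal sketch of the meta-analysis combination step, assuming the per-projection p-values are already computed and independent (the function names are illustrative, not from the thesis; Fisher's method is one member of the class of meta-analyses mentioned above):

```python
import math

def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square with even df,
    via the exact closed-form Poisson sum (df = 2m)."""
    m = df // 2
    term, total = 1.0, 1.0
    for k in range(1, m):
        term *= (x / 2.0) / k
        total += term
    return math.exp(-x / 2.0) * total

def fisher_combine(pvals):
    """Fisher's method: -2 * sum(log p_i) is chi-square with 2m df
    when the m p-values are independent and uniform under H0."""
    stat = -2.0 * sum(math.log(p) for p in pvals)
    return chi2_sf_even_df(stat, 2 * len(pvals))

# Two moderately small per-projection p-values combine to stronger evidence
p = fisher_combine([0.04, 0.07])
```

Other combination rules (e.g., minimum-p with a Bonferroni correction) fit the same pattern; the asymptotic-independence result is what licenses treating the p-values as independent inputs.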
42

Parameter Estimation and Hypothesis Testing for the Truncated Normal Distribution with Applications to Introductory Statistics Grades

Hattaway, James T. 09 March 2010 (has links) (PDF)
The normal distribution is commonly seen in nature, education, and business, and mounded or bell-shaped data are easily found across various fields of study. Although the normal distribution has high utility, the full range often cannot be observed. The truncated normal distribution accounts for the inability to observe the full range and allows inference back to the original population. Depending on the amount of truncation, the truncated normal has several distinct shapes. A simulation study evaluating the performance of the maximum likelihood estimators and method of moments estimators is conducted and a comparison of performance is made. The α-level likelihood ratio test (LRT) is derived for testing the null hypothesis of equal population means for truncated normal data. A simulation study evaluating the power of the LRT to detect absolute standardized differences between the two population means with small sample sizes was conducted and the power curves were approximated. Another simulation study evaluated the power of the LRT to detect absolute differences when testing the hypothesis with large, unequal sample sizes. The α-level LRT was extended to a k-population hypothesis test for equal population means. A simulation study examined the power of the k-population LRT for detecting absolute standardized differences when one of the population means differs from the others, and the power curve was approximated. Stat 221 is the largest introductory statistics course at BYU, serving about 4,500 students a year. Every section of Stat 221 shares common homework assignments and tests, which controls for confounding when making comparisons between sections. Historically, grades have been thought to be bell shaped, but with grade inflation and other factors, the upper tail is lost because of the truncation at 100. It is therefore reasonable to assume that grades follow a truncated normal distribution. Inference using the final grades should be done recognizing the truncation. Performance of the different Stat 221 sections was evaluated using the derived LRTs.
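A rough sketch of maximum likelihood estimation for a normal distribution truncated above (as with grades capped at 100). The data, the parameter grid, and the crude grid search are illustrative stand-ins for the thesis's estimators; a real fit would use a numerical optimizer:

```python
import math
import random

B = 100.0  # upper truncation point (e.g., the maximum possible grade)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik(data, mu, sigma):
    """Log-likelihood of N(mu, sigma^2) truncated above at B."""
    n = len(data)
    ll = -n * math.log(sigma) - 0.5 * n * math.log(2.0 * math.pi)
    ll -= sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)
    ll -= n * math.log(norm_cdf((B - mu) / sigma))  # truncation correction
    return ll

def grid_mle(data, mus, sigmas):
    """Crude grid-search MLE over candidate (mu, sigma) pairs."""
    return max((loglik(data, m, s), m, s) for m in mus for s in sigmas)

# Simulate grades from N(85, 10^2) truncated at 100 by rejection sampling
random.seed(1)
grades = []
while len(grades) < 300:
    x = random.gauss(85.0, 10.0)
    if x <= B:
        grades.append(x)

mus = [80.0 + 0.5 * k for k in range(21)]    # 80.0 .. 90.0
sigmas = [7.0 + 0.5 * k for k in range(11)]  # 7.0 .. 12.0
best_ll, mu_hat, sigma_hat = grid_mle(grades, mus, sigmas)
```

The truncation correction term is what distinguishes this from an ordinary normal fit: ignoring it biases the mean estimate downward when the lost upper tail is substantial.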
43

GLR Control Charts for Monitoring the Mean Vector or the Dispersion of a Multivariate Normal Process

Wang, Sai 28 February 2012 (has links)
In many applications, the quality of process outputs is described by more than one characteristic variable. These quality variables usually follow a multivariate normal (MN) distribution. This dissertation discusses the monitoring of the mean vector and the covariance matrix of MN processes. The first part of this dissertation develops a statistical process control (SPC) chart based on a generalized likelihood ratio (GLR) statistic to monitor the mean vector. The performance of the GLR chart is compared to that of the Hotelling χ² chart, the multivariate exponentially weighted moving average (MEWMA) chart, and a multi-MEWMA combination. Results show that the Hotelling χ² chart and the MEWMA chart are only effective for a small range of shift sizes in the mean vector, while the GLR chart and some carefully designed multi-MEWMA combinations give similarly good overall performance across a wide range of shift magnitudes. Unlike most of these other options, the GLR chart does not require the user to specify tuning parameter values. The GLR chart also has an advantage in process diagnostics: at the time of a signal, estimates of the change point and the out-of-control mean vector are immediately available to the user. These advantages make the GLR chart a favorable option for practitioners. For the design of the GLR chart, a series of easy-to-use equations is provided for calculating the control limit to achieve the desired in-control performance. The use of this GLR chart with a variable sampling interval (VSI) scheme is also evaluated and discussed. The rest of the dissertation considers the problem of monitoring the covariance matrix. Three GLR charts with different covariance matrix estimators are discussed. Results show that the GLR chart with a multivariate exponentially weighted moving covariance (MEWMC) matrix estimator is slightly better than the existing method for detecting general changes in the covariance matrix, and that the GLR chart with a constrained maximum likelihood estimator (CMLE) gives much better overall performance across a wide range of shift sizes than the best available options for detecting only variance increases. / Ph. D.
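For intuition, a bare-bones sketch of a GLR statistic for a shift in a bivariate mean vector with known in-control mean and covariance: the likelihood ratio is maximized over candidate change points, with the post-change mean replaced by its estimate. This is an illustration of the statistic only, not the dissertation's full chart design (control-limit equations, VSI scheme, and the p > 2 case are omitted):

```python
def inv2x2(S):
    """Inverse of a 2x2 matrix given as nested lists."""
    a, b = S[0]
    c, d = S[1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def glr_mean_stat(xs, mu0, Sigma):
    """max over tau of (n - tau)/2 * (xbar - mu0)' Sigma^{-1} (xbar - mu0),
    where xbar is the mean of the observations after tau."""
    Sinv = inv2x2(Sigma)
    n = len(xs)
    best = 0.0
    for tau in range(n):
        m = n - tau
        d0 = sum(x[0] for x in xs[tau:]) / m - mu0[0]
        d1 = sum(x[1] for x in xs[tau:]) / m - mu0[1]
        quad = (d0 * (Sinv[0][0] * d0 + Sinv[0][1] * d1)
                + d1 * (Sinv[1][0] * d0 + Sinv[1][1] * d1))
        best = max(best, 0.5 * m * quad)
    return best
```

In use, the chart signals when the statistic exceeds a control limit chosen to give the desired in-control average run length; the maximizing tau then serves directly as the change-point estimate mentioned above.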
44

GLR Control Charts for Process Monitoring with Sequential Sampling

Peng, Yiming 06 November 2014 (has links)
The objective of this dissertation is to investigate GLR control charts based on a sequential sampling scheme (SS GLR charts). Phase II monitoring is considered and the goal is to quickly detect a wide range of changes in the univariate normal process mean parameter and/or the variance parameter. The performance of the SS GLR charts is evaluated and design guidelines for SS GLR charts are provided so that practitioners can easily apply the SS GLR charts in applications. More specifically, the structure of this dissertation is as follows: We first develop a two-sided SS GLR chart for monitoring the mean μ of a normal process. The performance of the SS GLR chart is evaluated and compared with other control charts. The SS GLR chart has much better performance than that of the fixed sampling rate GLR chart. It is also shown that the overall performance of the SS GLR chart is better than that of the variable sampling interval (VSI) GLR chart and the variable sampling rate (VSR) CUSUM chart. The SS GLR chart has the additional advantage that it requires fewer parameters to be specified than other VSR charts. The optimal parameter choices are given, and regression equations are provided to find the limits for the SS GLR chart. If detecting one-sided shifts in μ is of interest, the above SS GLR chart can be modified to be a one-sided chart. The performance of this modified SS GLR chart is investigated. Next we develop an SS GLR chart for simultaneously monitoring the mean μ and the variance 𝜎² of a normal process. The performance and properties of this chart are evaluated. The design methodology and some illustrative examples are provided so that the SS GLR chart can be easily used in applications. The optimal parameter choices are given, and the performance of the SS GLR chart remains very good as long as the parameter choices are not too far away from the optimized choices. / Ph. D.
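A hedged sketch of the statistic underlying joint monitoring of the mean and variance of a normal process: the generalized log-likelihood ratio for a change at an unknown time tau, with the post-change mean and variance replaced by their MLEs. The in-control distribution is taken to be N(0,1) for simplicity, and the sequential-sampling machinery of the SS GLR chart is omitted here:

```python
import math

def glr_joint_stat(xs, min_seg=2):
    """GLR statistic for a change at tau in both the mean and the variance
    of N(0,1) data, maximized over candidate change points tau."""
    n = len(xs)
    best = 0.0
    for tau in range(0, n - min_seg + 1):
        seg = xs[tau:]
        m = len(seg)
        mu = sum(seg) / m
        var = max(sum((x - mu) ** 2 for x in seg) / m, 1e-12)  # guard var > 0
        # log LR = sum log N(x; mu_hat, var_hat) - sum log N(x; 0, 1) over seg
        llr = -0.5 * m * math.log(var) - 0.5 * m + 0.5 * sum(x * x for x in seg)
        best = max(best, llr)
    return best
```

A sequential-sampling version would, after each observation, use the current statistic to decide how much data to take next; the statistic itself is unchanged.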
45

The Monitoring of Linear Profiles and the Inertial Properties of Control Charts

Mahmoud, Mahmoud A. 17 November 2004 (has links)
The Phase I analysis of data when the quality of a process or product is characterized by a linear function is studied in this dissertation. It is assumed that each sample collected over time in the historical data set consists of several bivariate observations for which a simple linear regression model is appropriate, a situation common in calibration applications. Using a simulation study, the performance of several recommended approaches for assessing the stability of the process is compared. A method based on using indicator variables in a multiple regression model is also proposed. This dissertation further proposes a change point approach, based on the segmented regression technique, for testing the constancy of the regression parameters in a linear profile data set. The performance of the proposed change point method is compared to that of the most effective Phase I linear profile control chart approaches using a simulation study. The advantage of the proposed change point method over the existing methods is greatly improved detection of sustained step changes in the process parameters. Any control chart that combines sample information over time, e.g., the cumulative sum (CUSUM) chart and the exponentially weighted moving average (EWMA) chart, has an ability to detect process changes that varies over time depending on the past data observed. The chart statistics can take values such that some shifts in the parameters of the underlying probability distribution of the quality characteristic are more difficult to detect; this is referred to as the "inertia problem" in the literature. This dissertation shows, under realistic assumptions, that the worst-case run length performance of control charts becomes as informative as the steady-state performance. This study also proposes a simple new measure of the inertial properties of control charts, namely the signal resistance. The conclusions of this study support the recommendation that Shewhart limits should be used with EWMA charts, especially when the smoothing parameter is small. This study also shows that some charts proposed by Pignatiello and Runger (1990) and Domangue and Patch (1991) have serious disadvantages with respect to inertial properties. / Ph. D.
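The inertia problem, and the recommendation to pair EWMA charts with Shewhart limits, can be illustrated with a small sketch. The parameter values (λ = 0.1, L = 2.7, Shewhart limit 3.5) are illustrative, not the study's recommended design:

```python
import math

def ewma_with_shewhart(xs, lam=0.1, L=2.7, shewhart=3.5):
    """EWMA chart on N(0,1) data with an added Shewhart limit on the raw
    observation. Returns (index, reason) of the first signal, or None."""
    sigma_z = math.sqrt(lam / (2.0 - lam))  # asymptotic EWMA standard deviation
    z = 0.0
    for i, x in enumerate(xs):
        z = lam * x + (1.0 - lam) * z
        if abs(x) > shewhart:
            return (i, "shewhart")
        if abs(z) > L * sigma_z:
            return (i, "ewma")
    return None
```

After a run of negative observations the EWMA statistic sits below zero, so a single large positive observation goes undetected by the EWMA limit alone (inertia), while the Shewhart limit catches it immediately; the tests below exhibit exactly this case.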
46

Likelihood-based testing and model selection for hazard functions with unknown change-points

Williams, Matthew Richard 03 May 2011 (has links)
The focus of this work is the development of testing procedures for the existence of change points in parametric hazard models of various types. Hazard functions and the related survival functions are common units of analysis for survival and reliability modeling. We develop a methodology to test the alternative of a two-piece hazard against a simpler one-piece hazard. The location of the change is unknown, and the tests are irregular because the change point is present only under the alternative hypothesis. Our approach is to consider the profile log-likelihood ratio test statistic as a process with respect to the unknown change point. We then derive its limiting process and find the supremum distribution of the limiting process to obtain critical values for the test statistic. We first reexamine existing work based on Taylor series expansions for abrupt changes in exponential data. We generalize these results to include Weibull data with known shape parameter. We then develop new tests for two-piece continuous hazard functions using local asymptotic normality (LAN). Finally, we generalize our earlier results for abrupt changes to include covariate information using the LAN techniques. While we focus on the cases of no censoring, simple right censoring, and censoring generated by staggered entry, our derivations reveal that the framework should apply to much broader censoring scenarios. / Ph. D.
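A hedged sketch of the profile log-likelihood ratio for the simplest case above: an abrupt change in a constant (exponential) hazard at an unknown point, with uncensored data. The statistic profiles out the two hazard rates at each candidate change point and compares against the one-piece fit; the limiting-process critical values derived in the work are not reproduced here:

```python
import math

def two_piece_loglik(times, tau):
    """Profile log-likelihood at change point tau for the hazard
    h(t) = lam1 for t <= tau, lam2 for t > tau, with lam1, lam2 at their MLEs
    (lam1_hat = d1/T1, lam2_hat = d2/T2; exposures T1, T2, event counts d1, d2)."""
    d1 = sum(1 for t in times if t <= tau)
    d2 = len(times) - d1
    T1 = sum(min(t, tau) for t in times)        # exposure time before tau
    T2 = sum(max(t - tau, 0.0) for t in times)  # exposure time after tau
    ll = 0.0
    if d1 > 0:
        ll += d1 * math.log(d1 / T1) - d1
    if d2 > 0:
        ll += d2 * math.log(d2 / T2) - d2
    return ll

def profile_lrt(times):
    """2 * (sup over tau of the two-piece loglik - one-piece loglik),
    with tau ranging over the observed event times (largest excluded)."""
    n = len(times)
    lam_hat = n / sum(times)
    ll0 = n * math.log(lam_hat) - n
    grid = sorted(times)[:-1]
    ll1 = max(two_piece_loglik(times, tau) for tau in grid)
    return 2.0 * (ll1 - ll0)
```

Because the change point vanishes under the null, this supremum statistic is not chi-square distributed, which is precisely why the work derives the supremum distribution of the limiting process.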
47

On a turbo decoder design for low power dissipation

Fei, Jia 21 July 2000 (has links)
A new coding scheme called "turbo coding" has generated tremendous interest in channel coding for digital communication systems due to its high error-correcting capability. Two key innovations in turbo coding are parallel concatenated encoding and iterative decoding. A soft-in soft-out component decoder can be implemented using the maximum a posteriori (MAP) or the maximum likelihood (ML) decoding algorithm. While the MAP algorithm offers better performance than the ML algorithm, its computation is complex and not suitable for hardware implementation. The log-MAP algorithm, which performs the necessary computations in the logarithm domain, greatly reduces the hardware complexity. With the proliferation of battery-powered devices, power dissipation, along with speed and area, is a major concern in VLSI design. In this thesis, we investigate a low-power design of a turbo decoder based on the log-MAP algorithm. Our turbo decoder has two component log-MAP decoders, which perform the decoding process alternately. Two major ideas for low-power design are the employment of a variable number of iterations during the decoding process and the shutdown of inactive component decoders. The number of iterations during decoding is determined dynamically according to the channel condition to save power. When a component decoder is inactive, the clocks and spurious inputs to the decoder are blocked to reduce power dissipation. We followed the standard cell design approach to design the proposed turbo decoder. The decoder was described in VHDL and then synthesized to measure the performance of the circuit in area, speed, and power. Our decoder achieves good performance in terms of bit error rate, and the two proposed methods significantly reduce power dissipation and energy consumption. / Master of Science
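The core log-domain primitive behind the log-MAP algorithm is the Jacobian logarithm ("max-star"), which replaces a sum of exponentials by a maximum plus a small correction term; the max-log-MAP variant simply drops the correction. A small sketch (in software, for illustration; the thesis implements this in VHDL hardware):

```python
import math

def max_star(a, b):
    """Jacobian logarithm: log(exp(a) + exp(b)), computed stably as
    max(a, b) + log(1 + exp(-|a - b|))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_star_approx(a, b):
    """Max-log-MAP approximation: drop the correction term entirely."""
    return max(a, b)
```

The correction term is bounded by log 2 and decays quickly as |a − b| grows, which is why hardware implementations often replace it with a small lookup table.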
48

Parametric estimation and hypothesis testing for models with multiple change-points of a Poisson process

Top, Alioune 20 June 2016 (has links)
This work is devoted to parametric estimation, hypothesis testing, and goodness-of-fit testing for nonhomogeneous Poisson processes. First we consider two models, each having two jumps located by an unknown parameter. For the first model the sum of the jumps is positive. The second is a model of switching intensity, piecewise constant, in which the sum of the jumps is zero. For each model we study the asymptotic properties of the Bayesian estimator (BE) and the maximum likelihood estimator (MLE): consistency, convergence in distribution, and convergence of moments are established. In particular, the BE is shown to be asymptotically efficient. For the second model we also consider testing a simple hypothesis against a one-sided alternative, and we describe the asymptotic properties (choice of the threshold and power) of the Wald test (WT) and the generalized likelihood ratio test (GLRT). The proofs use the method of Ibragimov and Khasminskii, which is based on the weak convergence of the normalized likelihood ratio in the Skorohod space under tightness criteria for the corresponding families of measures. Numerical simulations of the limiting variances allow us to conclude that the BE outperforms the MLE. In the situation where the sum of the jumps is zero, we develop a numerical approach to obtain the MLE. Finally, we consider the construction of a goodness-of-fit test for a model with a scale parameter, and show that the Cramér-von Mises type test is asymptotically parameter-free and consistent.
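A hedged sketch of the estimation problem for the first kind of model: locating the jump of a piecewise-constant Poisson intensity by maximizing the log-likelihood over a grid of candidate change points. The rates, the observation window, and the event times below are illustrative, and the known-rates simplification is mine, not the thesis's general setting:

```python
import math

def loglik_theta(events, theta, lam1, lam2, T):
    """Log-likelihood of a Poisson process on [0, T] with intensity
    lam1 on [0, theta] and lam2 on (theta, T]."""
    n1 = sum(1 for t in events if t <= theta)
    n2 = len(events) - n1
    return (n1 * math.log(lam1) + n2 * math.log(lam2)
            - lam1 * theta - lam2 * (T - theta))

def mle_change_point(events, lam1, lam2, T, steps=40):
    """Grid-search MLE of the change-point location theta."""
    grid = [k * T / steps for k in range(1, steps)]
    return max(grid, key=lambda th: loglik_theta(events, th, lam1, lam2, T))
```

The log-likelihood is discontinuous in theta (it jumps at each event time), which is the source of the nonstandard asymptotics studied in the thesis: the MLE and the BE have different limiting behavior, and the BE turns out to be the efficient one.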
49

Cluster analysis of manual planters according to the distribution of the number of seeds

Araripe, Patricia Peres 10 December 2015 (has links)
The manual planter is a tool that still plays an important role in several countries around the world where family and conservation agriculture is practiced. Its use is important because it minimizes soil disturbance, reduces labor requirements in the field, and supports more sustainable productivity, among other factors. To evaluate and/or compare the manual planters available on the market, several studies have been conducted, but considering only measures of position and dispersion. This work presents an alternative methodology for comparing the performance of manual planters. The probabilities associated with each response category are estimated, and the hypothesis that these probabilities do not vary between planters compared in pairs is evaluated using the likelihood ratio test and the Bayes factor, in the classical and Bayesian paradigms respectively. Finally, the planters are grouped using the J-divergence as the distance measure in a cluster analysis. As an illustration of this methodology, we consider data on fifteen manual planters of different manufacturers, analyzed by Molin, Menegatti and Gimenez (2001), in which the planters were adjusted to deposit exactly two seeds per stroke. Initially, in the classical approach, the planters with no zero counts in the response categories were compared, and planters 3, 8, and 14 showed the best behavior. Then all planters were compared in pairs, grouping categories and adding the constants 0.5 or 1 to each response category. When grouping categories it was difficult to draw conclusions from the likelihood ratio test, which only highlighted that planter 15 differs from the others. Adding 0.5 or 1 to each category apparently did not yield distinct groups; planter 1, which differed from the others by the test and most frequently deposited two seeds, as required by the agronomic experiment, is the planter recommended in this work. In the Bayesian approach, the Bayes factor was used to compare the planters in pairs, and the conclusions were similar to those of the classical approach. Finally, the cluster analysis gave a better view of the groups of mutually similar planters in both approaches, confirming the earlier results.
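A hedged sketch of the two ingredients described above: a likelihood ratio (G) test comparing the category counts of two planters, and the symmetric J-divergence used as the clustering distance. The counts and probability vectors below are illustrative, not the Molin, Menegatti and Gimenez (2001) data:

```python
import math

def g_statistic(c1, c2):
    """Likelihood ratio (G) statistic for H0: two multinomial samples
    share the same category probabilities."""
    n1, n2 = sum(c1), sum(c2)
    g = 0.0
    for o1, o2 in zip(c1, c2):
        pooled = (o1 + o2) / (n1 + n2)  # pooled probability under H0
        for o, n in ((o1, n1), (o2, n2)):
            if o > 0:
                g += 2.0 * o * math.log(o / (n * pooled))
    return g

def j_divergence(p, q):
    """Symmetrized Kullback-Leibler (J) divergence between two
    probability vectors with strictly positive entries."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))
```

Under H0 the G statistic is approximately chi-square with (number of categories − 1) degrees of freedom; the J-divergence is symmetric, so it can be fed directly into a standard hierarchical clustering routine as a dissimilarity.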
50

Latent variable models for longitudinal twin data

Dominicus, Annica January 2006 (has links)
Longitudinal twin data provide important information for exploring sources of variation in human traits. In statistical models for twin data, unobserved genetic and environmental factors influencing the trait are represented by latent variables. In this way, trait variation can be decomposed into genetic and environmental components. With repeated measurements on twins, latent variables can be used to describe individual trajectories, and the genetic and environmental variance components are assessed as functions of age. This thesis contributes to statistical methodology for analysing longitudinal twin data by (i) exploring the use of random change point models for modelling variance as a function of age, (ii) assessing how nonresponse in twin studies may affect estimates of genetic and environmental influences, and (iii) providing a method for hypothesis testing of genetic and environmental variance components. The random change point model, in contrast to linear and quadratic random effects models, is shown to be very flexible in capturing variability as a function of age. Approximate maximum likelihood inference through first-order linearization of the random change point model is contrasted with Bayesian inference based on Markov chain Monte Carlo simulation. In a set of simulations based on a twin model for informative nonresponse, it is demonstrated how the effect of nonresponse on estimates of genetic and environmental variance components depends on the underlying nonresponse mechanism. This thesis also reveals that the standard procedure for testing variance components is inadequate, since the null hypothesis places the variance components on the boundary of the parameter space. The asymptotic distribution of the likelihood ratio statistic for testing variance components in classical twin models is derived, resulting in a mixture of chi-square distributions. Statistical methodology is illustrated with applications to empirical data on cognitive function from a longitudinal twin study of aging.
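The boundary issue has a simple practical consequence in the best-known special case: when testing a single variance component, the null distribution of the LRT statistic is the 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom, not a plain chi-square. A sketch of the corrected p-value (this is the standard one-component result under the usual regularity assumptions; the thesis derives mixtures for the multi-component twin-model case):

```python
import math

def chi2_1_sf(x):
    """P(chi2_1 > x) = erfc(sqrt(x/2)), since chi2_1 is a squared N(0,1)."""
    return math.erfc(math.sqrt(x / 2.0))

def boundary_lrt_pvalue(stat):
    """p-value under the 0.5*chi2_0 + 0.5*chi2_1 mixture null distribution."""
    if stat <= 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(stat)
```

The naive chi-square(1) p-value is exactly twice the mixture p-value, so ignoring the boundary makes the test conservative and costs power, which is the inadequacy of the standard procedure noted above.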
