1 |
Hypothesis Testing in Finite Mixture Models. Li, Pengfei. 11 December 2007.
Mixture models provide a natural framework for modeling
unobserved heterogeneity in a population.
They are widely applied in astronomy, biology,
engineering, finance, genetics, medicine, social sciences,
and other areas.
An important first step for using mixture models is the test
of homogeneity. Before one tries to fit a mixture model,
it is valuable to know whether the data arise from a
homogeneous or a heterogeneous population. If the data are
homogeneous, there is no need for mixture modeling at all.
The rejection of the homogeneous model may also have scientific implications.
For example, in classical statistical genetics,
it is often suspected that only a subgroup of patients has a
disease gene that is linked to the marker. Detecting
the existence of this subgroup amounts to the rejection of
a homogeneous null model in favour of a two-component
mixture model. This problem has attracted intensive
research recently. This thesis makes substantial contributions
in this area of research.
Due to partial loss of identifiability, classic inference methods
such as the likelihood ratio test (LRT) lose their usual elegant
statistical properties. The limiting distribution of the LRT
often involves complex Gaussian processes,
which can be difficult to use in data analysis.
The modified likelihood ratio test (MLRT) has been found to be a useful
alternative to the LRT. It restores identifiability by introducing
a penalty to the log-likelihood function.
Under some mild conditions,
the limiting distribution of the MLRT is
0.5\chi^2_0 + 0.5\chi^2_1,
where \chi^2_0 denotes a point mass at 0.
This limiting distribution is convenient to use in real data analysis.
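For example, converting an observed statistic into a p-value, or finding the critical value of a level-alpha test, under this limit involves only the \chi^2_1 tail. A minimal SciPy-based sketch (not code from the thesis):

```python
from scipy.stats import chi2

def pvalue_half_half(t):
    """P-value under the 0.5*chi2_0 + 0.5*chi2_1 limiting distribution.

    chi2_0 is a point mass at zero, so for a positive statistic only the
    chi2_1 component contributes to the tail probability.
    """
    return 1.0 if t <= 0 else 0.5 * chi2.sf(t, df=1)

def critical_value_half_half(alpha):
    """Critical value c solving 0.5 * P(chi2_1 > c) = alpha, for alpha < 0.5."""
    return chi2.ppf(1.0 - 2.0 * alpha, df=1)

print(pvalue_half_half(3.2))           # p-value for an observed statistic of 3.2
print(critical_value_half_half(0.05))  # about 2.706
```

At the 5% level, the critical value is the 90th percentile of \chi^2_1, approximately 2.706.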
The choice of the penalty functions in the MLRT is very flexible.
A good choice of the penalty enhances the power of the MLRT.
In this thesis, we first introduce a new class of penalty functions,
with which the MLRT enjoys significantly improved power for testing
homogeneity.
The main contribution of this thesis is to propose a new class of
methods for testing homogeneity. Most existing methods in the
literature for testing homogeneity are derived, explicitly or
implicitly, under the condition of finite Fisher information and a
compactness assumption on the space of the mixing parameters. The
finite Fisher information condition prevents their application to many
important mixture models, such as the mixture of geometric
distributions, the mixture of exponential distributions and more
generally mixture models in scale distribution families. The
compactness assumption often forces practitioners to set artificial
bounds for the parameters of interest and makes the resulting
limiting distribution dependent on these bounds. Consequently,
developing a method without such restrictions has long been a goal of
many researchers. As will be seen, the EM-test proposed in this thesis
is free of these shortcomings.
The EM-test combines the merits of the classic LRT and score test.
The properties of the EM-test are particularly easy to investigate
under single-parameter mixture models.
It has a simple limiting distribution
0.5\chi^2_0+0.5\chi^2_1, the same as the MLRT.
This result is applicable to mixture models without requiring
the restrictive regularity conditions described earlier.
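To make the structure concrete, the following is a schematic sketch of an EM-test-type statistic for homogeneity in a two-component Poisson mixture. The grid of initial mixing proportions, the penalty log(1 - |1 - 2*beta|), the three EM iterations, and the choice to hold beta fixed at its starting value are simplifying assumptions made here for illustration only; they are not the exact procedure developed in the thesis.

```python
import numpy as np
from scipy.stats import poisson, chi2

def em_test_statistic(x, betas=(0.1, 0.3, 0.5), n_iter=3):
    """Schematic EM-test-style statistic for H0: a homogeneous Poisson model.

    For each fixed mixing proportion beta, run a few EM iterations over the
    two component means, add a penalty on beta, and keep the largest
    penalized likelihood-ratio statistic.
    """
    x = np.asarray(x)
    theta0 = x.mean()                              # MLE under the homogeneous null
    loglik0 = poisson.logpmf(x, theta0).sum()
    stats = []
    for beta in betas:
        th1, th2 = 0.9 * theta0, 1.1 * theta0      # crude starting values
        for _ in range(n_iter):
            # E-step: posterior probability that each observation comes from component 2
            f1, f2 = poisson.pmf(x, th1), poisson.pmf(x, th2)
            w = beta * f2 / ((1 - beta) * f1 + beta * f2)
            # M-step for the component means (beta is held fixed in this sketch)
            th1 = np.sum((1 - w) * x) / np.sum(1 - w)
            th2 = np.sum(w * x) / np.sum(w)
        loglik1 = np.log((1 - beta) * poisson.pmf(x, th1)
                         + beta * poisson.pmf(x, th2)).sum()
        penalty = np.log(1 - abs(1 - 2 * beta))    # assumed penalty; zero at beta = 0.5
        stats.append(2 * (loglik1 + penalty - loglik0))
    return max(stats)

rng = np.random.default_rng(1)
x = rng.poisson(2.0, size=200)                     # homogeneous data, so H0 is true
t = em_test_statistic(x)
p = 1.0 if t <= 0 else 0.5 * chi2.sf(t, df=1)      # p-value from the 0.5chi2_0 + 0.5chi2_1 limit
print(t, p)
```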
The normal mixture model is a very popular model in applications.
However, it does not satisfy the strong identifiability condition,
which creates substantial technical difficulties in the study of its
asymptotic properties. Most existing methods do not apply directly
to normal mixture models, so the asymptotic properties have to
be developed separately. We investigate the use of the EM-test for
normal mixture models and derive its limiting distributions.
For the homogeneity test in the presence of the structural
parameter, the limiting distribution is a simple function of the
0.5\chi^2_0+0.5\chi^2_1 and \chi^2_1 distributions. The test
with this limiting distribution is still very convenient to
implement. For normal mixtures in both the mean and variance parameters,
the limiting distribution of the EM-test is found to be \chi^2_2.
Mixture models are also widely used in the analysis of
directional data. The von Mises distribution is often regarded as
the circular normal model. Interestingly, it satisfies the strong
identifiability condition and the parameter space of the mean
direction is compact. However, the theoretical results for
single-parameter mixture models cannot be applied directly to von Mises
mixture models. Because of this, we also study the application of
the EM-test to von Mises mixture models in the presence of the
structural parameter. The limiting distribution of the EM-test is
also found to be 0.5\chi^2_0+0.5\chi^2_1.
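For illustration only (the parameter values are arbitrary and not from the thesis), a two-component von Mises mixture with a common concentration parameter kappa, which plays the role of the structural parameter, can be evaluated with SciPy:

```python
import numpy as np
from scipy.stats import vonmises

def vm_mixture_pdf(x, beta, mu1, mu2, kappa):
    """Density of a two-component von Mises mixture with a common kappa.

    kappa is the structural parameter shared by both components; beta is
    the mixing proportion of the second component.
    """
    return ((1 - beta) * vonmises.pdf(x, kappa, loc=mu1)
            + beta * vonmises.pdf(x, kappa, loc=mu2))

angles = np.linspace(-np.pi, np.pi, 5)
print(vm_mixture_pdf(angles, beta=0.3, mu1=0.0, mu2=1.5, kappa=2.0))
```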
Extensive simulations are conducted to examine how well the limiting
distributions approximate the finite-sample distributions of the
EM-test. The type I errors obtained with critical values determined by
the limiting distributions are found to be close to the nominal
values. In addition, we propose several precision-enhancing methods,
which are found to work well.
Real data examples are used to illustrate the use of the EM-test.
|
2 |
Homogeneity Test on Error Rates from Ordinal Scores and Application to Forensic Science. Nguyen, Ngoc Ty. 01 January 2023.
The Receiver Operating Characteristic (ROC) curve is used to measure the classification accuracy of tests that yield ordinal or continuous scores. Ordinal scores are common in medical imaging studies and, more recently, in black-box studies on forensic identification accuracy (Phillips et al., 2018). To assess the accuracy of radiologists in medical imaging studies or the accuracy of forensic examiners in biometric studies, one needs to estimate the ROC curves from the ordinal scores and account for the covariates related to the radiologists or forensic examiners. In this thesis, we propose a homogeneity test to compare the performance of raters. We derive the asymptotic properties of estimated ROC curves and their corresponding Area Under the Curve (AUC) within an ordinal regression framework. Moreover, we investigate differences in ROC curves (and AUCs) among examiners in detail. We construct confidence intervals for the difference in AUCs and confidence bands for the difference in ROC curves for performance comparison purposes. First, we conduct simulations on data where scores are assumed to be normally distributed, and the features include both categorical and continuous covariates. Then, we apply our procedure to facial recognition data to compare forensic examiners.
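For background (a generic estimator, not the procedure or data of this thesis), the empirical AUC for ordinal scores is a Mann-Whitney-type statistic in which tied scores count as one half:

```python
import numpy as np

def empirical_auc(scores_pos, scores_neg):
    """Mann-Whitney estimate of the AUC from ordinal scores.

    Each (positive, negative) pair contributes 1 if the positive score is
    higher, 0.5 if the scores are tied, and 0 otherwise.
    """
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return np.mean((pos > neg) + 0.5 * (pos == neg))

# Toy ordinal scores on a 1-5 scale: mated (same-source) versus non-mated pairs
mated = [5, 4, 4, 3, 5, 2]
non_mated = [1, 2, 3, 1, 2, 4]
print(empirical_auc(mated, non_mated))
```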
The second part of this thesis addresses the correlation of decision scores among raters. In medical imaging studies and facial recognition, multiple raters assess the same subject pairs, leading to potential score correlations. Because of these correlated scores, standard methods for generalized linear models cannot be directly applied to estimate accuracy. In this thesis, we employ the generalized estimating equation to estimate covariate-specific and covariate-adjusted AUC values when correlations are present in ordinal scores. We conduct homogeneity tests on both covariate-specific and covariate-adjusted AUCs, investigating their statistical properties. To assess the finite sample properties of the test, we conduct simulation studies. Furthermore, we apply this test to real facial recognition data.
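As a rough sketch of the kind of GEE fit involved, simplified to a binary rather than ordinal outcome, with simulated data and an arbitrary rater covariate (none of which reproduces the model in this thesis):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_raters, n_pairs = 20, 30

# One row per (rater, subject pair); all raters score the same pairs,
# so decisions within a pair are correlated.
pair_id = np.repeat(np.arange(n_pairs), n_raters)
pair_effect = np.repeat(rng.normal(0, 1, n_pairs), n_raters)
experience = np.tile(rng.normal(0, 1, n_raters), n_pairs)   # rater covariate
y = (pair_effect + 0.5 * experience
     + rng.logistic(size=n_raters * n_pairs) > 0).astype(int)

X = sm.add_constant(experience)
model = sm.GEE(y, X, groups=pair_id,
               family=sm.families.Binomial(),
               cov_struct=sm.cov_struct.Exchangeable())
result = model.fit()
print(result.summary())
```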
|
3 |
Development in Normal Mixture and Mixture of Experts Modeling. Qi, Meng. 01 January 2016.
In this dissertation, we first consider the problem of testing homogeneity and order in a contaminated normal model when the data are correlated under a known covariance structure. To address this problem, we develop a moment-based homogeneity and order test and design weights for the test statistics to increase the power of the homogeneity test. We apply our test to microarray data on Down's syndrome. This dissertation also studies a singular Bayesian information criterion (sBIC) for a bivariate hierarchical mixture model with varying weights and develops a new data-dependent information criterion (sFLIC). We apply our model and criteria to birth-weight and gestational-age data; the purpose of both criteria is to select the model complexity from the data.
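As a generic illustration of the moment-based idea only (this is not the test developed in the dissertation): for independent data, a homogeneous normal model has zero excess kurtosis, whereas a contaminated normal inflates it, so a standardized sample excess kurtosis provides a crude homogeneity check.

```python
import numpy as np
from scipy.stats import norm

def kurtosis_homogeneity_check(x):
    """Standardized sample excess kurtosis as a crude homogeneity check.

    Under an i.i.d. normal (homogeneous) model the excess kurtosis is 0
    with asymptotic variance 24/n; a contaminated normal inflates it.
    Returns the statistic and a one-sided p-value.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    z = (x - x.mean()) / x.std()
    excess_kurtosis = np.mean(z ** 4) - 3.0
    stat = excess_kurtosis / np.sqrt(24.0 / n)
    return stat, norm.sf(stat)

rng = np.random.default_rng(2)
homogeneous = rng.normal(size=500)
contaminated = np.where(rng.random(500) < 0.1,
                        rng.normal(0.0, 3.0, 500),    # contaminating component
                        rng.normal(size=500))
print(kurtosis_homogeneity_check(homogeneous))
print(kurtosis_homogeneity_check(contaminated))
```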
|
4 |
Investigation of the Predictive Validity of Concentration Tests: A Chronometric Approach to Examining the Role of Item Difficulty, Test Length, and Test Diversification. Schumann, Frank. 06 June 2016.
This thesis investigates the validity of attention and concentration tests. The focus is on how several critical variables influence the predictive validity of these tests, in particular item difficulty and item homogeneity, test length and test progression, test diversification, and validity in the context of a real personnel selection. In a total of five studies, these variables were systematically varied and analysed with respect to their predictive validity for the (retrograde and concurrent) prediction of school and academic performance (Realschule, Abitur, Vordiplom/Bachelor). Because of the student (i.e. relatively performance-homogeneous) sample, the correlations were expected to be somewhat underestimated. Since validity in this work was determined comparatively for particular tests and experimental conditions, however, this should not matter. Study 1 (N = 106) first examined how difficult the items of an arithmetic concentration test should be in order to ensure good predictions. To this end, easy and more difficult items were compared with respect to their correlation with the criterion. As a result, easy and more difficult test versions were roughly equally predictive. Study 2 (N = 103) examined the role of test length by comparing the predictive validity of a short version and a long version of an arithmetic concentration test. The results showed that the short version was more valid than the long version and that the validity of the long version declined over the course of the test. Study 3 (N = 388) focused on test diversification, examining whether intelligence should be measured with a single matrices test (Wiener Matrizen-Test, WMT) or with a test battery (Intelligenz-Struktur-Test, I-S-T 2000 R) in order to ensure good predictive validity. The results clearly favour the matrices test, which was roughly as valid as the test battery but more economical to administer. Studies 4 (N = 105) and 5 (N = 97) examined predictive validity for school performance in the context of a real personnel selection situation. Whereas the large test batteries, the Wilde-Intelligenz-Test 2 (WIT-2) and the Intelligenz-Struktur-Test 2000 R (I-S-T 2000 R), predicted only moderately well, the Komplexer Konzentrationstest (KKT), in particular the KKT arithmetic test, was an excellent predictor of school and academic performance. Based on these findings, recommendations and practical guidance for the strategic use of test instruments in diagnostic practice are provided.
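The predictive validity coefficients compared across these studies are, at their core, correlations between test scores and a criterion such as grades. A minimal illustration with simulated data (the data and effect sizes are invented, not taken from the thesis):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
n = 106                                              # sample size as in Study 1 (illustrative)
criterion = rng.normal(size=n)                       # standardized criterion, e.g. grades
short_test = 0.45 * criterion + rng.normal(size=n)   # simulated concentration-test scores
long_test = 0.35 * criterion + rng.normal(size=n)

for name, scores in [("short version", short_test), ("long version", long_test)]:
    r, p = pearsonr(scores, criterion)
    print(f"{name}: validity r = {r:.2f} (p = {p:.3f})")
```

A formal comparison of two such validity coefficients obtained on the same sample would additionally require a test for dependent correlations, which this sketch omits.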
1 Introduction and Aims
2 Assessment of Concentration Ability
2.1 Historical Background
2.2 Cognitive Modelling
2.3 Psychometric Modelling
3 Predictive Validity of Concentration Tests
3.1 Reliability, Construct Validity, Criterion Validity
3.2 Construction and Validation Strategies
3.3 Derivation of the Research Questions
4 Description of the Questionnaires and Tests
5 Empirical Part
5.1 Study 1 - Item Difficulty
5.1.1 Method
5.1.2 Results
5.1.3 Discussion
5.2 Study 2 - Test Lengthening and Test Progression
5.2.1 Method
5.2.2 Results
5.2.3 Discussion
5.3 Study 3 - Test Diversification
5.3.1 Method
5.3.2 Results
5.3.3 Discussion
5.4 Study 4 - Validity in a Real Selection Situation (I-S-T 2000 R)
5.4.1 Method
5.4.2 Results
5.4.3 Discussion
5.5 Study 5 - Validity in a Real Selection Situation (WIT-2)
5.5.1 Method
5.5.2 Results
5.5.3 Discussion
6 Discussion
6.1 Are Difficult Tests Better Than Easy Tests?
6.2 Are Long Tests Better Than Short Tests?
6.3 Are Test Batteries Better Than Single Tests?
6.4 Are Tests Also Valid Under "Real" Conditions?
6.5 Validity Under Real Conditions - Generalisation
7 Theoretical Implications
8 Practical Consequences
9 References
Appendix
|
5 |
Development of a flood-frequency model for the river basins of the Central Region of Malawi as a tool for engineering design and disaster preparedness in flood-prone areas. Laisi, Elton. 02 1900.
Since 1971, a number of flood frequency models have been developed for river basins in
Malawi for use in the design of hydraulic structures, but the varied nature of their results
has often presented the design engineer with a dilemma because of differences in the magnitudes
of calculated floods for given return periods. None of the flood frequency
analysis methods developed in the country so far has used a homogeneity test for the river basins
from which the hydrological data were obtained. This study was therefore conducted with a
view to resolving this problem and hence improving the design of hydraulic structures such
as culverts, bridges, water intake points for irrigation schemes, and flood protection dykes.
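For context, a common flood-frequency calculation, shown here only as a generic sketch and not as the model developed in this study, fits an extreme-value distribution such as the Gumbel (EV1) distribution to annual maximum flows and reads off the design flood for a chosen return period:

```python
import numpy as np
from scipy.stats import gumbel_r

# Hypothetical annual maximum flows (m^3/s) at a gauging station
annual_max_flow = np.array([120., 95., 210., 160., 140., 180., 250., 130.,
                            175., 205., 110., 190., 220., 150., 165.])

loc, scale = gumbel_r.fit(annual_max_flow)            # fit Gumbel (EV1) by maximum likelihood
for T in (10, 50, 100):                               # return periods in years
    design_flood = gumbel_r.ppf(1.0 - 1.0 / T, loc=loc, scale=scale)
    print(f"{T}-year design flood: {design_flood:.0f} m^3/s")
```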
In light of the above, during the course of this study the applicability of existing methods
in the design of hydraulic structures was assessed. Also, the study investigated how land
use and land cover change influence the frequency and magnitude of floods in the study
area, and how their deleterious impacts on the socio-economic and natural environment in
the river basins could be mitigated. / Environmental Sciences / M. Sc. (Environmental Management)
|