41

A study of the calibration-inverse prediction problem in a mixed model setting

Yang, Celeste January 1900
Master of Science / Department of Statistics / Paul I. Nelson / The calibration-inverse prediction problem was investigated in a mixed model setting. Two methods were used to construct inverse prediction intervals. Method 1 ignores the random block effect in the mixed model and constructs the inverse prediction interval in the standard manner using quantiles from an F distribution. Method 2 uses a bootstrap to estimate quantiles of an approximate pivotal and then follows essentially the same procedure as in Method 1. A simulation study was carried out to compare how the intervals created by the two methods performed in terms of coverage rate and mean interval length. Results from our simulation study suggest that when the variance component of the block is large relative to the location variance component, the coverage rates of the intervals produced by the two methods differ significantly. Method 2 appears to yield intervals that have a slightly higher coverage rate and greater length than those from Method 1. Both methods yielded intervals with coverage rates below nominal for approximately 1/3 of the simulation settings.
42

Tests for unequal treatment variances in crossover designs

Jung, Yoonsung January 1900
Doctor of Philosophy / Department of Statistics / John E. Boyer Jr., Dallas E. Johnson / A crossover design is an experimental design in which each experimental unit receives a series of experimental treatments over time. The order in which an experimental unit receives its treatments is called a sequence (for example, the sequence AB means that treatment A is given first, followed by treatment B). A period is the time interval during which a treatment is administered to the experimental unit. A period could range from a few minutes to several months depending on the study. Sequences usually involve subjects receiving a different treatment in each successive period. However, treatments may occur more than once in any sequence (for example, ABAB). Treatments and periods are compared within subjects, i.e., each subject serves as his/her own control. Therefore, any effect that is related to subject differences is removed from treatment and period comparisons. Carryover effects are residual effects from a previous treatment manifesting themselves in subsequent periods. Crossover designs both with and without carryover are traditionally analyzed assuming that the responses due to different treatments have equal variances. The effects of unequal variances on traditional tests for treatment and carryover differences were recently considered in crossover designs assuming that the responses due to treatments have unequal variances with a compound symmetry correlation structure. The likelihood function for the two treatment/two sequence crossover design has closed form maximum likelihood solutions for the parameters under both the null hypothesis, $H_0: \sigma_A^2 = \sigma_B^2$, and the alternative hypothesis, $H_A$: not $H_0$. Under $H_A$, the method of moments estimators and the maximum likelihood estimators of $\sigma_A^2$, $\sigma_B^2$ and $\rho$ are identical. The dual balanced design, ABA=BAB, which is balanced for carryover effects, is also considered.
The dual balanced design has a closed form solution that maximizes the likelihood function under the null hypothesis, $H_0: \sigma_A^2 = \sigma_B^2$, but not under the alternative hypothesis, $H_A$: not $H_0$. Similarly, the three treatment/three sequence crossover design, ABC=BCA=CAB, has a closed form solution that maximizes the likelihood function under the null hypothesis, $H_0: \sigma_A^2 = \sigma_B^2 = \sigma_C^2$, but not under the alternative hypothesis, $H_A$: not $H_0$. An iterative procedure is introduced to estimate the parameters for the two and three treatment crossover designs. To check the performance of the likelihood ratio tests, Type I error rates and power comparisons are explored using simulations.
43

Multiple-trait multiple-interval mapping of quantitative-trait loci

Joehanes, Roby January 1900
Master of Science / Department of Statistics / Gary L. Gadbury / QTL (quantitative-trait locus) analysis aims to locate and estimate the effects of genes that are responsible for quantitative traits, such as grain protein content and yield, by means of statistical methods that evaluate the association of genetic variation with trait (phenotypic) variation. Quantitative traits are typically polygenic, i.e., controlled by multiple genes, with varying degrees of influence on the phenotype. Several methods have been developed to increase the accuracy of QTL location and effect estimates. One of them, multiple interval mapping (MIM) (Kao et al. 1999), has been shown to be more accurate than conventional methods such as composite interval mapping (CIM) (Zeng 1994). Other QTL analysis methods have been developed to perform additional analyses that might be useful for breeders, such as of pleiotropy and QTL-by-environment (QxE) interaction. It has been shown (Jiang and Zeng 1995) that these analyses can be carried out with a multivariate extension of CIM (MT-CIM) that exploits the correlation structure in a set of traits. In doing so, this method also improves the accuracy of QTL location detection. This thesis describes the multivariate extension of MIM (MT-MIM) using ideas from MT-CIM. The development of additional multivariate tests, such as of pleiotropy and QxE interaction, and several methods pertinent to the development of MT-MIM are also described. A small simulation study shows that MT-MIM is more accurate than MT-CIM and univariate MIM. Results for real data show that MT-MIM is able to provide a more accurate and precise estimate of QTL location.
44

A simulation study of the robustness of Hotelling’s T² test for the mean of a multivariate distribution when sampling from a multivariate skew-normal distribution

Wu, Yun January 1900
Master of Science / Department of Statistics / Paul I. Nelson / Hotelling’s T² test is the standard tool for inference about the mean of a multivariate normal population. However, this test may perform poorly when used on samples from multivariate distributions with highly skewed marginal distributions. The goal of our study was to investigate the type I error rate and power properties of Hotelling’s one-sample test when sampling from a class of multivariate skew-normal (SN) distributions, which includes the multivariate normal distribution and, in addition to location and scale parameters, has a shape parameter to regulate skewness. Simulation results of tests carried out at nominal type I error rate 0.05 obtained from various levels of shape parameters, sample sizes, number of variables and fixed correlation matrix showed that Hotelling’s one-sample test provides adequate control of type I error rates over the entire range of conditions studied. The test also produces suitable power levels for detecting departures from hypothesized values of a multivariate mean vector when data result from a random sample from a multivariate SN. The shape parameter of the SN family appears not to have much of an effect on the robustness of Hotelling’s test. However, surprisingly, it does have a positive impact on power.
45

An exploratory method for identifying reactant-product lipid pairs from lipidomic profiles of wild-type and mutant leaves of Arabidopsis thaliana

Fan, Lixia January 1900
Master of Science / Department of Statistics / Gary L. Gadbury / Discerning the metabolic or enzymatic role of a particular gene product, in the absence of information indicating sequence homology to known gene products, is a difficult task. One approach is to compare the levels of metabolites in a wild-type organism to those in an organism with a mutation that causes loss of function of the gene. The goal of this project was to develop an approach to analyze metabolite data on wild-type and mutant organisms for the purpose of identifying the function of a mutated gene. To develop and test statistical approaches to analysis of metabolite data for identification of gene function, levels of 141 lipid metabolites were measured in leaves of wild-type Arabidopsis thaliana plants and in leaves of Arabidopsis thaliana plants with known mutations in genes involved in lipid metabolism. The mutations were primarily in fatty acid desaturases, which are enzymes that catalyze reactions in which double bonds are added to fatty acids. When these enzymes are mutated, leaf lipid composition is altered, and the altered levels of specific lipid metabolites can be detected by mass spectrometry. A randomization p-value and other metrics were calculated for all potential reactant-product pairs, which included all lipid metabolite pairs. An algorithm was developed to combine these data and rank the results for each pair as to likelihood of being the actual reactant-product pair. This method was designed and tested on data collected on mutants in genes with known functions, fad2 (Okuley et al., 1994), fad3 (Arondel et al., 1992), fad4, fad5 (Mekhedov et al., 2000), fad6 (Falcone et al., 1994), and fad7 (Iba et al., 1993 and Gibson et al., 1994). Application of the method to three additional genes produced by random mutagenesis, sfd1, sfd2, and sfd3, indicated that the significant pairs for fad6 and sfd3 were similar.
Consistent with this, genetic evidence has indicated that sfd3 is a mutation in the FAD6 gene. The methods provide a list of putative reactions for an enzyme encoded by an unknown mutant gene. The output lists for unknown genes and known genes can be compared to provide evidence for similar biochemical activities. However, the strength of the current method is that the list of candidate chemical reactions for an enzyme encoded by a mutant gene can be produced without data other than the metabolite profile of the wild-type and mutant organisms, i.e., known gene analysis is not a requirement to obtain the candidate reaction list.
46

A simulation study of the size and power of Cochran’s Q versus the standard Chi-square test for testing the equality of correlated proportions

Gayle, Suelen S. January 1900
Master of Science / Department of Statistics / Paul I. Nelson / The standard Chi-square test for the equality of proportions of positive responses to c specified binary questions is valid when the observed responses arise from independent random samples of units. When the responses to all c questions are recorded on the same unit, a situation called correlated proportions, the assumptions under which this test is derived are no longer valid. Under the additional assumption of compound symmetry, the Cochran-Q test is a valid test for the equality of proportions of positive responses. The purpose of this report is to use simulation to examine and compare the performance of the Cochran-Q test and the standard Chi-square test when testing for the equality of correlated proportions. It is found that the Cochran-Q test is superior to the Chi-square test in terms of size and power, especially when the common correlation among the binary responses is large.
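As an illustration of the statistic discussed in this abstract, the following is a minimal sketch of Cochran's Q in its textbook form for an n-by-c table of 0/1 responses, with one row per unit and one column per question. It is not taken from the report; the function name `cochran_q` and the toy data are hypothetical. Under the null hypothesis of equal proportions, Q is usually referred to a Chi-square distribution with c - 1 degrees of freedom.

```python
def cochran_q(responses):
    """Cochran's Q for an n-by-c table of 0/1 responses (rows are units)."""
    n_questions = len(responses[0])
    # Column totals: number of positive responses to each question.
    col_totals = [sum(row[j] for row in responses) for j in range(n_questions)]
    # Row totals: number of positive responses given by each unit.
    row_totals = [sum(row) for row in responses]
    grand_total = sum(row_totals)
    numerator = (n_questions - 1) * (
        n_questions * sum(c * c for c in col_totals) - grand_total ** 2
    )
    denominator = n_questions * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Happens only when every unit answers all questions identically.
        raise ValueError("Q is undefined: no unit has mixed responses")
    return numerator / denominator


# Hypothetical toy data: 6 units, c = 3 binary questions.
toy = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0], [1, 0, 0], [1, 1, 0]]
print(cochran_q(toy))  # 6.0
```

Units whose responses are all 0 or all 1 contribute nothing to either the numerator or the denominator, so only units with mixed responses carry information about differences between the proportions.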
47

A comparison of type I error and power of the aligned rank method using means and medians for alignment

Yates, Heath Landon January 1900
Master of Science / Department of Statistics / James J. Higgins / A simulation study was done to compare the Type I error and power of standard analysis of variance (ANOVA), the aligned rank transform procedure (ART), and the aligned rank transform procedure where alignment is done using medians (ART + Median). The methods were compared in the context of a balanced two-way factorial design with interaction when errors have a normal distribution and outliers are present in the data and when errors have the Cauchy distribution. The simulation results suggest that the nonparametric methods are more outlier-resistant and valid when errors have heavy tails in comparison to ANOVA. The ART + Median method appears to provide greater resistance to outliers and is less affected by heavy-tailed distributions than the ART method and ANOVA.
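The alignment step compared in this abstract can be sketched as follows for the interaction in a two-way layout: estimated row and column effects are removed from each observation, and the aligned values are then ranked with midranks for ties. This is an assumption-laden sketch and not the thesis code. The abstract does not spell out how medians enter the alignment, so the `center` argument (mean or median, applied to rows, columns, and the whole data set) is a hypothetical choice, as is the function name `aligned_ranks`.

```python
from statistics import mean, median


def aligned_ranks(cells, center=mean):
    """Align observations for the interaction effect, then rank them.

    cells[i][j] is the list of observations at row level i and column level j.
    Returns (i, j, rank) triples in the original observation order.
    """
    n_cols = len(cells[0])
    grand = center([y for row in cells for cell in row for y in cell])
    row_centers = [center([y for cell in row for y in cell]) for row in cells]
    col_centers = [center([y for row in cells for y in row[j]]) for j in range(n_cols)]
    # Remove the estimated main effects, leaving interaction plus error.
    aligned = [
        (i, j, y - row_centers[i] - col_centers[j] + grand)
        for i, row in enumerate(cells)
        for j, cell in enumerate(row)
        for y in cell
    ]
    # Assign midranks: tied aligned values share the average of their positions.
    order = sorted(range(len(aligned)), key=lambda idx: aligned[idx][2])
    ranks = [0.0] * len(aligned)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and aligned[order[end + 1]][2] == aligned[order[pos]][2]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return [(i, j, r) for (i, j, _), r in zip(aligned, ranks)]


# Hypothetical 2x2 layout with one observation per cell.
print(aligned_ranks([[[1], [2]], [[3], [10]]], center=mean))
print(aligned_ranks([[[1], [2]], [[3], [10]]], center=median))
```

In the aligned rank transform procedure, the resulting ranks are then analyzed with the usual ANOVA F test for interaction; that step is omitted here.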
48

The performance and robustness of confidence intervals for the median of a symmetric distribution constructed assuming sampling from a Cauchy distribution

Cao, Jennifer Yue January 1900
Master of Science / Department of Statistics / Paul Nelson / Trimmed means are robust estimators of location for distributions having heavy tails. Theory and simulation indicate that little efficiency is lost under normality when using appropriately trimmed means and that their use with data from distributions with heavy tails can result in improved performance. This report uses the principle of equivariance applied to trimmed means sampled from a Cauchy distribution to form a discrepancy function of the data and parameters whose distribution is free of the unknown median and scale parameter. Quantiles of this discrepancy function are estimated via asymptotic normality and simulation and used to construct confidence intervals for the median of a Cauchy distribution. A nonparametric approach based on the distribution of order statistics is also used to construct confidence intervals. The performance of these intervals in terms of coverage rate and average length is investigated via simulation when the data are actually sampled from a Cauchy distribution and when sampling is from normal and logistic distributions. The intervals based on simulation estimation of the quantiles of the discrepancy function are shown to perform well across a range of sample sizes and trimming proportions when the data are actually sampled from a Cauchy distribution and to be relatively robust when sampling is from the normal and logistic distributions.
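The nonparametric approach based on order statistics mentioned in this abstract can be sketched with the standard sign-test interval: for a continuous distribution, the interval from the d-th smallest to the d-th largest observation covers the median with probability 1 - 2 P(B <= d - 1), where B is Binomial(n, 1/2). The sketch below picks the largest d that still meets the requested level. It is a generic construction and may differ from the one used in the report; the function name `median_ci` is hypothetical.

```python
from math import comb


def median_ci(sample, level=0.95):
    """Distribution-free confidence interval for the median from order statistics.

    Returns (lower, upper, exact_coverage), using the narrowest symmetric pair
    of order statistics whose coverage is at least `level`.
    """
    n = len(sample)
    ordered = sorted(sample)
    best = None
    for d in range(1, n // 2 + 1):
        # P(Binomial(n, 1/2) <= d - 1), the probability of missing on one side.
        tail = sum(comb(n, k) for k in range(d)) / 2 ** n
        coverage = 1 - 2 * tail
        if coverage >= level:
            best = (d, coverage)
        else:
            break  # coverage only decreases as d grows
    if best is None:
        raise ValueError("sample too small for the requested confidence level")
    d, coverage = best
    # d-th smallest and d-th largest observations (1-based order statistics).
    return ordered[d - 1], ordered[n - d], coverage


print(median_ci([7, 3, 9, 1, 5, 10, 2, 8, 4, 6]))  # (2, 9, 0.978515625)
```

Because the attainable coverages are discrete, the exact coverage is generally higher than the nominal level, which is one reason such intervals can be longer than those built from a parametric model.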
49

Semiparametric mixture models

Xiang, Sijia January 1900
Doctor of Philosophy / Department of Statistics / Weixin Yao / This dissertation consists of three parts that are related to semiparametric mixture models. In Part I, we construct the minimum profile Hellinger distance (MPHD) estimator for a class of semiparametric mixture models where one component has known distribution with possibly unknown parameters while the other component density and the mixing proportion are unknown. Such semiparametric mixture models have often been used in biology and the sequential clustering algorithm. In Part II, we propose a new class of semiparametric mixture of regression models, where the mixing proportions and variances are constants, but the component regression functions are smooth functions of a covariate. A one-step backfitting estimate and two EM-type algorithms have been proposed to achieve the optimal convergence rate for both the global parameters and nonparametric regression functions. We derive the asymptotic property of the proposed estimates and show that both proposed EM-type algorithms preserve the asymptotic ascent property. In Part III, we apply the idea of single-index model to the mixture of regression models and propose three new classes of models: the mixture of single-index models (MSIM), the mixture of regression models with varying single-index proportions (MRSIP), and the mixture of regression models with varying single-index proportions and variances (MRSIPV). Backfitting estimates and the corresponding algorithms have been proposed for the new models to achieve the optimal convergence rate for both the parameters and the nonparametric functions. We show that the nonparametric functions can be estimated as if the parameters were known and the parameters can be estimated with the same rate of convergence, $n^{-1/2}$, that is achieved in a parametric model.
50

Using Markov chain to describe the progression of chronic disease

Davis, Sijia January 1900
Master of Science / Department of Statistics / Abigail Jager / A discrete-time Markov chain with stationary transition probabilities is often used for the purpose of investigating treatment programs and health care protocols for chronic disease. Suppose patients with a certain chronic disease are observed over equally spaced time intervals. If we classify the chronic disease into n distinct health states, the movement through these health states over time then represents a patient’s disease history. We can use a discrete-time Markov chain to describe such movement using the transition probabilities between the health states. The purpose of this study was to investigate the case when the observation interval coincided with the cycle length of the Markov chain as well as the case when the observation interval and the cycle length did not coincide. In particular, we are interested in how the estimated transition matrix behaves as the ratio of observation interval to cycle length changes. Our results suggest that more estimation problems arose for small sample sizes as the length of the observation interval increased, and that the deviation from the known transition probability matrix got larger as the length of the observation interval increased. With increasing sample size, there were fewer estimation problems and the deviation from the known transition probability matrix was reduced.
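A minimal sketch of the setting described in this abstract, assuming health states are coded 0 to n - 1: the transition matrix is estimated by row-normalized transition counts, and when patients are observed only every k cycles the observed transitions follow the k-step matrix, which for a chain with stationary transition probabilities is the k-th power of the one-cycle matrix. The function names and toy values are hypothetical, and this is not the estimation procedure used in the report.

```python
def estimate_transition_matrix(sequences, n_states):
    """Estimate transition probabilities by row-normalizing transition counts.

    A row is None when its state was never observed as a starting state, one
    simple way an estimation problem can show up in small samples.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total for c in row] if total else None)
    return matrix


def matrix_power(p, k):
    """k-step transition matrix P^k by repeated multiplication (k >= 0)."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = [
            [sum(result[i][m] * p[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return result


# Hypothetical two-state histories observed once per cycle.
histories = [[0, 1, 1, 0], [1, 1, 0, 0]]
print(estimate_transition_matrix(histories, 2))

# Hypothetical one-cycle matrix; observing every 2 cycles gives P squared.
p = [[0.9, 0.1], [0.2, 0.8]]
print(matrix_power(p, 2))
```

Recovering the one-cycle matrix from data observed every k cycles amounts to taking a k-th root of the estimated matrix, which need not exist as a valid transition matrix.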
