Global ETD Search

21	A Penalized Approach to Mixed Model Selection Via Cross Validation Xiong, Jingwei 05 December 2017 (has links) No description available. Statistics linear mixed models penalized approaches variable selection cross validation
22	Adaptive LASSO For Mixed Model Selection via Profile Log-Likelihood Pan, Juming 18 July 2016 (has links) No description available. Statistics model selection linear mixed models oracle properties
23	Bayesian variable selection for linear mixed models when p is much larger than n with applications in genome wide association studies Williams, Jacob Robert Michael 05 June 2023 (has links) Genome-wide association studies (GWAS) seek to identify single nucleotide polymorphisms (SNPs) causing phenotypic responses in individuals. Commonly, GWAS analyses are done by using single marker association testing (SMA), which investigates the effect of a single SNP at a time and selects a candidate set of SNPs using a strict multiple correction penalty. As SNPs are not independent but instead strongly correlated, SMA methods lead to such high false discovery rates (FDR) that the results are difficult to use by wet lab scientists. To address this, this dissertation proposes three different novel Bayesian methods: BICOSS, BGWAS, and IEB. From a Bayesian modeling point of view, SNP search can be seen as a variable selection problem in linear mixed models (LMMs) where $p$ is much larger than $n$. To deal with the $p \gg n$ issue, our three proposed methods use novel Bayesian approaches based on two steps: a screening step and a model selection step. To control false discoveries, we link the screening and model selection steps through a common probability of a null SNP. To deal with model selection, we propose novel priors that are extensions for LMMs of nonlocal priors, Zellner-g prior, unit information prior, and Zellner-Siow prior. For each method, extensive simulation studies and case studies show that these methods improve the recall of true causal SNPs and, more importantly, drastically decrease FDR. Because our Bayesian methods provide more focused and precise results, they may speed up discovery of important SNPs and significantly contribute to scientific progress in the areas of biology, agricultural productivity, and human health.
/ Doctor of Philosophy / Genome-wide association studies (GWAS) seek to identify locations in DNA known as single nucleotide polymorphisms (SNPs) that are the underlying cause of observable traits such as height or breast cancer. Commonly, GWAS analyses are performed by investigating each SNP individually and seeing which SNPs are highly correlated with the response. However, as the SNPs themselves are highly correlated, investigating each one individually leads to a high number of false positives. To address this, this dissertation proposes three different advanced statistical methods: BICOSS, BGWAS, and IEB. Through extensive simulations, our methods are shown to not only drastically reduce the number of falsely detected SNPs but also increase the detection rate of true causal SNPs. Because our novel methods provide more focused and precise results, they may speed up discovery of important SNPs and significantly contribute to scientific progress in the areas of biology, agricultural productivity, and human health. Bayesian methods GWAS Linear Mixed Models Model Selection
24	The Illusion That Is Multiplayer Games : Position disparities in Client-server structured multiplayer games Carlsson, Robert January 2014 (has links) The goal of this study is to research the disparities in character positions between clients and server when playing an online game. The data needed was gathered by letting three players play a game made by me against each other, using extrapolation methods like the Kalman Filter on the characters’ positions. During the play-through each client saved all characters’ positions together with the input made by the players. The clients logged the information every network update, in sync with the server. When the time came, all clients sent their information to the server, where it was collected, analyzed and compared with the information the server had registered. By calculating the difference in position between the server’s and the clients’ characters, a disparity value could be extracted. This value is what was used to calculate a disparity value between the server characters and all clients’ counterparts. The same value is also what was used to answer the questions of how much impact the different extrapolation methods have on a game, as well as how large an impact player input has on the delay of the game. An important part of the study was to make sure that the information gathered was collected at the same time on the clients and the server, as well as to be able to enable and disable parts of the methods. Therefore the whole game used in this dissertation was built with a focus on this study. All extrapolation methods are toggle-able and the information gathered is synchronized using time.windows.com. Player Characters Network deficiencies Server - Client communication Mixed models Position Disparities Spelarkaraktärer Nätverksbrister Server - Klient kommunikation Mixed models Positionsskillnader
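The disparity calculation described in entry 24 can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes 2D positions, logs keyed by network-update tick, and Euclidean distance as the disparity measure. The names `position_disparity`, `mean_disparity`, `server_log` and `client_log` are hypothetical.

```python
import math


def position_disparity(server_pos, client_pos):
    """Euclidean distance between the server's and a client's position
    for the same character at the same network update (assumed measure)."""
    return math.dist(server_pos, client_pos)


def mean_disparity(server_log, client_log):
    """Average disparity over the ticks that both sides logged.

    Each log is a dict mapping a network-update tick to an (x, y) position.
    Ticks present on only one side are ignored.
    """
    shared_ticks = sorted(server_log.keys() & client_log.keys())
    if not shared_ticks:
        raise ValueError("server and client logs share no ticks")
    total = sum(
        position_disparity(server_log[t], client_log[t]) for t in shared_ticks
    )
    return total / len(shared_ticks)


# Hypothetical logs for one character: the client drifts away from the server.
server_log = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}
client_log = {0: (0.0, 0.0), 1: (1.0, 3.0), 2: (2.0, 4.0), 3: (5.0, 5.0)}

print(mean_disparity(server_log, client_log))
```

Matching samples by tick reflects the abstract's point that the comparison is only meaningful when client and server data are collected at the same time.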
25	Treatment heterogeneity and potential outcomes in linear mixed effects models Richardson, Troy E. January 1900 (has links) Doctor of Philosophy / Department of Statistics / Gary L. Gadbury / Studies commonly focus on estimating a mean treatment effect in a population. However, in some applications the variability of treatment effects across individual units may help to characterize the overall effect of a treatment across the population. Consider a set of treatments, $\{T, C\}$, where $T$ denotes some treatment that might be applied to an experimental unit and $C$ denotes a control. For each of $N$ experimental units, the duplet $\{r_{Ti}, r_{Ci}\}$, $i = 1, 2, \ldots, N$, represents the potential response of the $i$th experimental unit if treatment were applied and the response of the experimental unit if control were applied, respectively. The causal effect of $T$ compared to $C$ is the difference between the two potential responses, $r_{Ti} - r_{Ci}$. Much work has been done to elucidate the statistical properties of a causal effect, given a set of particular assumptions. Gadbury and others have reported on this for some simple designs and primarily focused on finite population randomization based inference. When designs become more complicated, the randomization based approach becomes increasingly difficult. Since linear mixed effects models are particularly useful for modeling data from complex designs, their role in modeling treatment heterogeneity is investigated. It is shown that an individual treatment effect can be conceptualized as a linear combination of fixed treatment effects and random effects. The random effects are assumed to have variance components specified in a mixed effects “potential outcomes” model when both potential outcomes, $r_T, r_C$, are variables in the model. The variance of the individual causal effect is used to quantify treatment heterogeneity.
Post treatment assignment, however, only one of the two potential outcomes is observable for a unit. It is then shown that the variance component for treatment heterogeneity becomes non-estimable in an analysis of observed data. Furthermore, estimable variance components in the observed data model are demonstrated to arise from linear combinations of the non-estimable variance components in the potential outcomes model. Mixed effects models are considered in context of a particular design in an effort to illuminate the loss of information incurred when moving from a potential outcomes framework to an observed data analysis. Causal inference Counterfactual Generalized linear mixed models Subject-treatment interaction What would Fisher do Statistics (0463)
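As a hedged illustration of why the heterogeneity variance in entry 25 cannot be estimated from observed data, consider the simplest setting, which is an assumption made here and not the dissertation's mixed model: the two potential outcomes of a unit have variances $\sigma_T^2$ and $\sigma_C^2$ and correlation $\rho$ (notation introduced for this sketch). The variance of the individual causal effect is then

```latex
\operatorname{Var}(r_{Ti} - r_{Ci})
  = \sigma_T^2 + \sigma_C^2 - 2\rho\,\sigma_T\sigma_C ,
\qquad
(\sigma_T - \sigma_C)^2
  \le \operatorname{Var}(r_{Ti} - r_{Ci})
  \le (\sigma_T + \sigma_C)^2 .
```

The treated and control groups inform $\sigma_T^2$ and $\sigma_C^2$ separately, but no unit reveals both outcomes, so $\rho$ is not identified and only the bounds (obtained at $\rho = 1$ and $\rho = -1$) are available without further assumptions.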
26	Robust mixtures of regression models Bai, Xiuqin January 1900 (has links) Doctor of Philosophy / Department of Statistics / Kun Chen and Weixin Yao / This proposal contains two projects that are related to robust mixture models. In the first project, we propose a new robust mixture of regression models (Bai et al., 2012). The existing methods for fitting mixture regression models assume a normal distribution for error and then estimate the regression parameters by the maximum likelihood estimate (MLE). In this project, we demonstrate that the MLE, like the least squares estimate, is sensitive to outliers and heavy-tailed error distributions. We propose a robust estimation procedure and an EM-type algorithm to estimate the mixture regression models. Using a Monte Carlo simulation study, we demonstrate that the proposed new estimation method is robust and works much better than the MLE when there are outliers or the error distribution has heavy tails. In addition, the proposed robust method works comparably to the MLE when there are no outliers and the error is normal. In the second project, we propose a new robust mixture of linear mixed-effects models. The traditional mixture model with multiple linear mixed effects, assuming Gaussian distribution for random and error parts, is sensitive to outliers. We will propose a mixture of multiple linear mixed t-distributions to robustify the estimation procedure. An EM algorithm is provided to find the MLE under the assumption of t-distributions for error terms and random mixed effects. Furthermore, we propose to adaptively choose the degrees of freedom for the t-distribution using profile likelihood.
In the simulation study, we demonstrate that our proposed model works comparably to the traditional estimation method when there are no outliers and the errors and random mixed effects are normally distributed, but works much better if there are outliers or the distributions of the errors and random mixed effects have heavy tails. Least square estimation EM algorithm Linear mixed models Mixture models Multivariate distribution Robust estimation Statistics (0463)
27	Using Linear Mixed Models to Analyze Native and Non-Native Species Abundances in Coastal Sage Scrub Anderson, Kaylee 01 January 2016 (has links) Coastal Sage Scrub (CSS) is a low scrubland plant community native to the coasts of California, housing many threatened and endangered species. Due to the invasion of non-native plants, many areas of CSS have type converted to annual grasslands and the fire frequency has accelerated; fire, in turn, may facilitate further invasion, leading to a loss of biodiversity. While many studies document post-fire succession in these communities, pre-fire data are rarely available for comparison, especially data on seedling emergence. I analyzed post-fire recovery of a type-converted grassland community, comparing seedling emergence data for the first and third year after fire to the three years preceding the fire. Non-native species abundances declined more after the fire than did native abundances. This pattern was still present in 2015, three years post-fire. Two native species, Amsinckia menziesii var. intermedia (Amsinckia) and Phacelia distans (Phacelia), were subjects of seed addition treatments pre-fire, but I found no evidence that past seeding increased their abundances post-fire. Amsinckia did recover to its pre-fire density three years after the fire, while the density of Phacelia declined over 75% in both the year immediately post-fire and three years after the fire. However, a third native species, Lupinus bicolor (Lupinus), was both much more abundant and also more spatially widespread both immediately after the fire and two years later. This supports the hypothesis that Lupinus is stored in the soil seed bank and the fire may have given this species the opportunity to recover by lowering abundances of non-native competitors. This analysis will inform future conservation efforts by improving our understanding of how seed banks impact the post-fire recovery of native species.
Coastal Sage Scrub Conservation Biology Type-Converted Grasslands Invasive species Mixed Models Biodiversity Biology Plant Sciences
28	Disfluency as ... er ... delay : an investigation into the immediate and lasting consequences of disfluency and temporal delay using EEG and mixed-effects modelling Bouwsema, Jennifer A. E. January 2014 (has links) Difficulties in speech production are often marked by disfluency: fillers, hesitations, prolongations, repetitions and repairs. In recent years a body of work has emerged that demonstrates that listeners are sensitive to disfluency, and that this affects their expectations for upcoming speech, as well as their attention to the speech stream. This thesis investigates the extent to which delay may be responsible for triggering these effects. The experiments reported in this thesis build on an Event Related Potential (ERP) paradigm developed by Corley et al. (2007), in which participants listened to sentences manipulated by both fluency and predictability. Corley et al. reported an attenuated N400 effect for words following disfluent ers, and interpreted this as indicating that the extent to which listeners made predictions was reduced following an er. In the current set of experiments, various noisy interruptions were added to Corley et al.'s paradigm, time matched to the disfluent fillers. These manipulations allowed investigation of whether the same effects could be triggered by delay alone, in the absence of a cue indicating that the speaker was experiencing difficulty. The first experiment, which contrasted disfluent ers with artificial beeps, revealed a small but significant reduction in N400 effect amplitude for words affected by ers but not by beeps. The second experiment, in which ers were contrasted with speaker generated coughs, revealed no fluency effects on the N400 effect.
A third experiment combined the designs of Experiments 1 and 2 to verify whether the difference between them could be characterised as a context effect; one potential explanation for the difference between the outcomes of Experiments 1 and 2 is that the interpretation of an er is affected by the surrounding stimuli. However, in Experiment 3, once again no effect of fluency on the magnitude of the N400 effect was found. Taken together, the results of these three studies lead to the question of whether er's attenuation effect on the N400 is robust. In a second part to each study, listeners took part in a surprise recognition memory test, comprising words which had been the critical words in the previous task intermixed with new words which had not appeared anywhere in the sentences previously heard. Participants were significantly more successful at recognising words which had been unpredictable in their contexts, and, importantly, for Experiments 1 and 2, were significantly more successful at recognising words which had featured in disfluent or interrupted sentences. There was no difference between the recognition rates of words which had been disfluent and those which were affected by a noisy interruption. Collard et al. (2008) demonstrated that disfluency could raise attention to the speech stream, and the finding that interrupted words are equally well remembered leads to the suggestion that any noisy interruption can raise attention. Overall, the finding of memory benefits in response to disfluency, in the absence of attenuated N400 effects, leads to the suggestion that different elements of disfluencies may be responsible for triggering these effects. The studies in this thesis also extend previous work by being designed to yield enough trials in the memory test portion of each experiment to permit ERP analysis of the memory data.
Whilst clear ERP memory effects remained elusive, important progress was made in that memory ERPs were generated from a disfluency paradigm, and this provided a testing ground on which to demonstrate the use of linear mixed-effects modelling as an alternative to ANOVA analysis for ERPs. Mixed-effects models allow the analysis of unbalanced datasets, such as those generated in many memory experiments. Additionally, we demonstrate the ability to include crossed random effects for subjects and items, and when this is applied to the ERPs from the listening section of Experiment 1, the effect of fluency on N400 amplitude is no longer significant. Taken together, the results from the studies reported in this thesis suggest that temporal delay or disruption in speech can trigger raised attention, but do not necessarily trigger changes in listeners' expectations. 616.85
29	NONLINEAR MODELS IN MULTIVARIATE POPULATION BIOEQUIVALENCE TESTING Dahman, Bassam 17 November 2009 (has links) In this dissertation a methodology is proposed for simultaneously evaluating the population bioequivalence (PBE) of a generic drug to a pre-licensed drug, or the bioequivalence of two formulations of a drug using multiple correlated pharmacokinetic metrics. The univariate criterion that is accepted by the Food and Drug Administration (FDA) for testing population bioequivalence is generalized. Very few approaches for testing multivariate extensions of PBE have appeared in the literature. One method uses the trace of the covariance matrix as a measure of total variability, and another uses a pooled variance instead of the reference variance. The former ignores the correlation between the measurements while the latter is not equivalent to the criterion proposed by the FDA in the univariate case, unless the variances of the test and reference are identical, which reduces the PBE to the average bioequivalence. The confidence interval approach is used to test the multivariate population bioequivalence by using a parametric bootstrap method to evaluate the $100(1-\alpha)\%$ confidence interval. The performance of the multivariate criterion is evaluated by a simulation study. The size and power of testing for bioequivalence using this multivariate criterion are evaluated in a simulation study by altering the mean differences, the variances, correlations between pharmacokinetic variables and sample size. A comparison between the two published approaches and the proposed criterion is demonstrated. Using nonlinear models and nonlinear mixed effects models, the multivariate population bioequivalence is examined. Finally, the proposed methods are illustrated by simultaneously testing the population bioequivalence for AUC and Cmax in two datasets.
Bioequivalence Nonlinear Mixed Models Population Bioequivalence Multivariate Biostatistics Physical Sciences and Mathematics Statistics and Probability
30	An Empirical Approach to Evaluating Sufficient Similarity: Utilization of Euclidean Distance As A Similarity Measure Marshall, Scott 27 May 2010 (has links) Individuals are exposed to chemical mixtures while carrying out everyday tasks, with unknown risk associated with exposure. Given the number of resulting mixtures it is not economically feasible to identify or characterize all possible mixtures. When complete dose-response data are not available on a (candidate) mixture of concern, EPA guidelines define a similar mixture based on chemical composition, component proportions and expert biological judgment (EPA, 1986, 2000). Current work in this literature is by Feder et al. (2009), evaluating sufficient similarity in exposure to disinfection by-products of water purification using multivariate statistical techniques and traditional hypothesis testing. The work of Stork et al. (2008) introduced the idea of sufficient similarity in dose-response (making a connection between exposure and effect). They developed methods to evaluate sufficient similarity of a fully characterized reference mixture, with dose-response data available, and a candidate mixture with only mixing proportions available. A limitation of the approach is that the two mixtures must contain the same components. It is of interest to determine whether a fully characterized reference mixture (representative of the random process) is sufficiently similar in dose-response to a candidate mixture resulting from a random process. Four similarity measures based on Euclidean distance are developed to aid in the evaluation of sufficient similarity in dose-response, allowing for mixtures to be subsets of each other. If a reference and candidate mixture are concluded to be sufficiently similar in dose-response, inference about the candidate mixture can be based on the reference mixture. 
An example is presented demonstrating that the benchmark dose (BMD) of the reference mixture can be used as a surrogate measure of BMD for the candidate mixture when the two mixtures are determined to be sufficiently similar in dose-response. Guidelines are developed that enable the researcher to evaluate the performance of the proposed similarity measures. Classification Data Reduction Environmental Risk Assessment Equivalence Testing Mixed Models Biostatistics Physical Sciences and Mathematics Statistics and Probability
