1 |
Linear Mixed Model Robust Regression
Waterman, Megan Janet Tuttle, 21 May 2002
Mixed models are powerful tools for the analysis of clustered data and many extensions of the classical linear mixed model with normally distributed response have been established. As with all parametric models, correctness of the assumed model is critical for the validity of the ensuing inference. Model robust regression techniques predict mean response as a convex combination of a parametric and a nonparametric model fit to the data. It is a semiparametric method by which incompletely or incorrectly specified parametric models can be improved through adding an appropriate amount of a nonparametric fit. We apply this idea of model robustness in the framework of the linear mixed model. The mixed model robust regression (MMRR) predictions we propose are convex combinations of predictions obtained from a standard normal-theory linear mixed model, which serves as the parametric model component, and a locally weighted maximum likelihood fit which serves as the nonparametric component. An application of this technique with real data is provided. / Ph. D.
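The convex-combination idea described in this abstract can be illustrated with a minimal sketch. It is not the MMRR procedure itself: a straight-line least-squares fit stands in for the linear mixed model, a Gaussian-kernel smoother stands in for the locally weighted maximum likelihood fit, and the function names, the mixing weight `lam`, and the `bandwidth` default are all assumptions made for illustration.

```python
import numpy as np


def ols_line_fit(x, y):
    """Parametric stand-in: fitted values from a straight-line least-squares fit."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta


def kernel_smooth(x, y, bandwidth):
    """Nonparametric stand-in: Nadaraya-Watson smoother with a Gaussian kernel."""
    scaled = (x[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled ** 2)
    return (weights @ y) / weights.sum(axis=1)


def model_robust_predict(x, y, lam, bandwidth=0.5):
    """Convex combination: (1 - lam) * parametric fit + lam * nonparametric fit."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    parametric = ols_line_fit(x, y)
    nonparametric = kernel_smooth(x, y, bandwidth)
    return (1.0 - lam) * parametric + lam * nonparametric


# Illustrative data only: a curved mean that a straight line misspecifies.
x = np.linspace(0.0, 3.0, 20)
y = np.sin(x)
blended = model_robust_predict(x, y, lam=0.5)
```

With `lam = 0` the prediction is purely parametric and with `lam = 1` purely nonparametric; how the mixing weight should be chosen is not described in the abstract and is left open here.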
|
2 |
Accounting for Correlation in the Analysis of Randomized Controlled Trials with Multiple Layers of Clustering
Baumgardner, Adam, 17 May 2016
A common goal in medical research is to determine the effect that a treatment has on subjects over time. Unfortunately, the analysis of data from such clinical trials often omits several aspects of the study design, leading to incorrect or misleading conclusions. In this paper, a major objective is to show via case studies that randomized controlled trials with longitudinal designs must account for correlation and clustering among observations in order to make proper statistical inference. Further, the effects of outliers in a multi-center, randomized controlled trial with multiple layers of clustering are examined and strategies for detecting and dealing with outlying observations and clusters are discussed. / McAnulty College and Graduate School of Liberal Arts; / Computational Mathematics / MS; / Thesis;
|
3 |
Examination of Mixed-Effects Models with Nonparametrically Generated Data
January 2019
Previous research has shown functional mixed-effects models and traditional mixed-effects models perform similarly when recovering mean and individual trajectories (Fine, Suk, & Grimm, 2019). However, Fine et al. (2019) showed traditional mixed-effects models were able to recover the underlying mean curves more accurately than functional mixed-effects models. That project generated data following a parametric structure. This paper extends that work by comparing nonlinear mixed-effects models and functional mixed-effects models on their ability to recover underlying trajectories generated from an inherently nonparametric process. The paper first introduces readers to nonlinear mixed-effects models and functional mixed-effects models. A simulation study is then presented in which the mean and random effects structure of the simulated data were generated using B-splines. The accuracy of recovered curves was examined under various conditions including sample size, number of time points per curve, and measurement design. Results showed the functional mixed-effects models recovered the underlying mean curve more accurately than the nonlinear mixed-effects models. In general, the functional mixed-effects models also recovered the underlying individual curves more accurately than the nonlinear mixed-effects models. Progesterone cycle data from Brumback and Rice (1998) were then analyzed to demonstrate the utility of both models. Both models were shown to perform similarly when analyzing the progesterone data. / Dissertation/Thesis / Doctoral Dissertation Psychology 2019
|
4 |
The influence of teat wash failure on milk yield in dairy cows
Lilja, Mathias, Keteris Eckerstedt, Ilse, January 2016
Data for the period 2015-04 to 2015-09 were analyzed in order to examine the possible relationship between teat wash failure and milk yield in dairy cows. Data provided by Sveriges Lantbruksuniversitet covering 49 093 specific milking events were used. Two linear mixed-effects models and one basic OLS model were estimated. Performing the analysis also required substantial data manipulation. The data analysis was divided into two parts. First, the variable of interest (teatwash) was examined by constructing two versions of each model: an unrestricted version and a restricted version where teatwash had been excluded. Because of the large sample and the use of linear mixed-effects models, an out-of-sample forecasting method was used as the primary evaluation criterion. The prediction errors were evaluated on the basis of root mean squared error (RMSE) and mean squared error (MSE). The difference between the unrestricted and restricted models was very small, and no indication of a relationship between teat wash failure and milk yield was found. The second part involved the comparison of prediction errors between the two mixed-effects models and the OLS model. Surprisingly, the basic OLS model resulted in the lowest prediction error despite an obvious breach of its assumptions.
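The out-of-sample comparison described in this abstract rests on two error measures, MSE and RMSE. The sketch below shows how they might be computed for an unrestricted and a restricted model on held-out observations. The numbers are invented for illustration and are not taken from the thesis; the variable names are likewise hypothetical.

```python
import numpy as np


def mse(actual, predicted):
    """Mean squared error of forecasts against held-out observations."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return float(np.sqrt(mse(actual, predicted)))


# Made-up hold-out milk yields and forecasts (illustration only).
held_out_yield = [10.0, 12.0, 14.0]
forecast_unrestricted = [11.0, 12.0, 13.0]  # model including the teatwash variable
forecast_restricted = [11.0, 13.0, 13.0]    # model with teatwash excluded

errors = {
    "unrestricted": rmse(held_out_yield, forecast_unrestricted),
    "restricted": rmse(held_out_yield, forecast_restricted),
}
```

A negligible gap between the two RMSE values would be read, as in the abstract, as no evidence that the excluded variable improves prediction.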
|
5 |
Predicting the Winner of the EURO 2008. A statistical investigation of bookmakers odds.
Leitner, Christoph, Zeileis, Achim, Hornik, Kurt, January 2008
In June 2008 one of the biggest and most popular sports tournaments took place in Austria and Switzerland, the European football championship 2008 (UEFA EURO 2008). Before the tournament started, millions of football supporters throughout the world were asking themselves, just as we did: "Who is going to win the EURO 2008?". We investigate a method for forecasting the tournament outcome that is based not on historical data (such as scores in previous matches) but on quoted winning odds for each of the 16 teams as provided by 45 international bookmakers. Using a mixed-effects model with a team-specific random effect and fixed effects for the bookmaker and the preliminary group, we model the unknown "true" log-odds for winning the championship. The final of the EURO 2008 was played by Germany and Spain. This was exactly the fixture that our method forecast, with a probability of about 20.2%. Furthermore, estimated winning probabilities can be derived from our model: Germany, the runner-up, had the highest probability (17.6%) of winning the title, and Spain, the winner of the tournament, had the second-best chance (12.3%). To adjust for effects of the tournament schedule, including the group draw, we recovered the latent team strength (underlying the bookmakers' expectations) to answer the question: Will the "best" team win? An ex post analysis of the tournament showed that our method yields good predictions of the tournament outcome and outperforms the FIFA/Coca Cola World rating and the Elo rating. / Series: Research Report Series / Department of Statistics and Mathematics
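As a rough illustration of how quoted odds relate to the log-odds that the model works with, the sketch below converts one bookmaker's decimal odds into winning probabilities and then into log-odds. The abstract does not say how the authors adjusted the raw quotes, so the proportional normalization used here is an assumption, and the team labels and odds are invented.

```python
import math


def implied_probabilities(decimal_odds):
    """Turn decimal odds into probabilities that sum to one.

    Assumption: the bookmaker's margin is removed by simple proportional
    normalization; the paper's own adjustment may differ.
    """
    raw = {team: 1.0 / odds for team, odds in decimal_odds.items()}
    total = sum(raw.values())
    return {team: p / total for team, p in raw.items()}


def log_odds(probability):
    """Logit transform: log(p / (1 - p))."""
    return math.log(probability / (1.0 - probability))


# Hypothetical quotes from a single bookmaker for a three-team contest.
quoted = {"Team A": 1.8, "Team B": 3.6, "Team C": 3.6}
probs = implied_probabilities(quoted)
logits = {team: log_odds(p) for team, p in probs.items()}
```

In the setting of the abstract, such log-odds from many bookmakers would form the response of the mixed-effects model, with a random effect per team.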
|
6 |
Incorporating chromatin interaction data to improve prediction accuracy of gene expression
Li, Xue, 30 April 2015
Genome structure can be classified into three categories: primary structure, secondary structure and tertiary structure, and they are all important for gene transcription regulation. In this research, we utilize the structural information to characterize the correlations and interactions among genes, and incorporate such information into the Linear Mixed-Effects (LME) model to improve the accuracy of gene expression prediction. In particular, we use chromatin features as predictors and each gene is an observation. Before model training and testing, genes are grouped according to the genome structural information. We use four gene grouping methods: 1) grouping genes according to sliding windows on primary structure; 2) grouping anchor genes in chromatin loop structure; 3) grouping genes in the CTCF-anchored domain; and 4) grouping genes in the chromatin domains obtained from Hi-C experiments. We compare the prediction accuracy of the LME model with that of the linear regression model. If all chromatin feature predictors are included in the models, based on the primary structure only (Method 1), the LME models improve prediction accuracy by up to 1%. Based on the tertiary structure only (Methods 2-4), for the genes that can be grouped according to the tertiary interaction data, LME models improve prediction accuracy by up to 2.1%. For individual chromatin feature predictors, the LME models improve prediction accuracy by 2% to 26%, and the improvement is larger for chromatin features that have lower original predictive ability. For future research we propose a model that combines the primary and tertiary structure to infer the correlations among genes to further improve the prediction.
|
7 |
Informational efficiency of the real estate market: A meta-analysis
Herath, Shanaka, Maier, Gunther, 16 April 2015
The growing empirical literature testing informational efficiency of real estate markets uses data from various contexts and at different levels of aggregation. The results of these studies are mixed. We use a distinctive meta-analysis to examine whether some of these study characteristics and contexts lead to a significantly higher chance for identification of an efficient real estate market. The results generated through meta-regression suggest that use of stock market data and individual level data, rather than aggregate data, significantly improves the probability of a study concluding efficiency. Additionally, the findings neither provide support for the suspicion that the view of market efficiency has significantly changed over the years nor do they indicate a publication bias resulting from such a view. The statistical insignificance of other study characteristics suggests that the outcome concerning efficiency is a context-specific random manifestation for the most part. (authors' abstract)
|
8 |
Estimating the above-ground biomass of mangrove forests in Kenya
Cohen, Rachel, January 2014
Robust estimates of forest above-ground biomass (AGB) are needed in order to constrain the uncertainty in regional and global carbon budgets, predictions of global climate change and remote sensing efforts to monitor large scale changes in forest cover and biomass. Estimates of AGB and their associated uncertainty are also essential for international forest-based climate change mitigation strategies such as REDD+. Mangrove forests are widely recognised as globally important carbon stores. Continuing high rates of global mangrove deforestation represent a loss of future carbon sequestration potential and could result in significant release into the atmosphere of the carbon currently being stored within mangroves. The main aims of this thesis are 1) to provide information on the current AGB stocks of mangrove forests in Kenya at spatial scales relevant for climate change research, forest management and REDD+ and 2) to evaluate and constrain the uncertainty associated with these AGB estimates. This thesis adopted both a ground-based statistical approach and a remote sensing based approach to estimating mangrove AGB in Kenya. Allometric equations were developed for Kenyan mangroves using mixed-effects regression analysis and uncertainties were fully propagated (using a Monte Carlo based approach) to estimates of AGB at all spatial scales (tree, plot, region and landscape). In this study, species and site effects accounted for a large proportion (41%) of the total variability in mangrove AGB. The generic biomass equation produced for Kenyan mangroves has the potential for broad application as it can be used to estimate the AGB of new trees where there is no pre-existing knowledge of the specific species-site allometric relationship. The 95% prediction intervals for landscape scale estimates of total AGB suggest that between 5.4 and 7.2 megatonnes (Mt) of AGB is currently held in Kenyan mangrove forests. 
An in-depth evaluation of the relative contribution of various components of uncertainty (measurement, parameter and residual uncertainty) to the magnitude of the total uncertainty of AGB estimates was carried out. This evaluation was undertaken using both the mixed-effects regression model and a standard ordinary least squares (OLS) regression model. The exclusion of measurement uncertainty during the biomass estimation process had negligible impact on the magnitude of the uncertainty regardless of spatial scale or tree size. Excluding the uncertainty due to species and site effects (from the mixed-effects model) consistently resulted in a large reduction (~70%) in the overall uncertainty. Estimates of the uncertainty produced by the OLS model were unrealistically low which is illustrative of the general need to account for group effects in biomass regression models. L-band Synthetic Aperture Radar (SAR) was used to estimate the AGB of Kenyan mangroves. There was an observable relationship (R² = 0.45) between L-band HH and AGB with HH backscatter found to decrease as a function of increasing AGB. There was no significant relationship found between L-band HV and AGB. The negative relationship between HH and AGB in this study can possibly be attributed to enhanced backscatter at lower AGB due to strong double-bounce and direct surface scattering from short stature/open forests and attenuation of the SAR signal at higher AGB. The SAR-derived estimate of total AGB for Kenyan mangroves was 5.32 Mt ± 18.6%. However, due to the unexpected nature of the HH-AGB relationship found in this study the SAR-derived estimates of mangrove AGB in this study should be considered with caution.
|
9 |
Bayesian Variable Selection for Logistic Models Using Auxiliary Mixture Sampling
Tüchler, Regina, January 2006
The paper presents a Markov chain Monte Carlo algorithm for both variable and covariance selection in the context of logistic mixed effects models. This algorithm allows us to sample solely from standard densities, with no additional tuning being needed. We apply a stochastic search variable approach to select explanatory variables as well as to determine the structure of the random effects covariance matrix. For logistic mixed effects models, prior determination of explanatory variables and random effects is no longer a prerequisite, since the definite structure is chosen in a data-driven manner in the course of the modeling procedure. As an illustration, two real-data examples from finance and tourism studies are given. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
|
10 |
Nonlinear mixed effects models for longitudinal DATA
Mahbouba, Raid, January 2015
The main objective of this master's thesis is to explore the effectiveness of nonlinear mixed effects models for longitudinal data. Mixed effects models make it possible to investigate the nature of the relationship between the time-varying covariates and the response while also capturing the variation between subjects. I investigate the robustness of the longitudinal models by building up the complexity of the models, starting from multiple linear models and ending with additive nonlinear mixed models. I use a dataset where firms' leverage is explained by four explanatory variables in addition to a grouping factor, the firm. The models are compared using comparison statistics such as AIC and BIC and by a visual inspection of residuals. A likelihood ratio test was used only for some nested models. The models are estimated by maximum likelihood and restricted maximum likelihood estimation. The most efficient model is the nonlinear mixed effects model, which has the lowest AIC and BIC. The multiple linear regression model failed to explain the relation and produced unrealistic statistics.
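The AIC and BIC comparison mentioned in this abstract follows the standard definitions, AIC = 2k - 2 log L and BIC = k log(n) - 2 log L, where k is the number of estimated parameters, n the number of observations, and log L the maximized log-likelihood. A minimal sketch follows; the log-likelihoods, parameter counts, and sample size are invented for illustration and do not come from the thesis.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fitted models: (maximized log-likelihood, number of parameters).
candidates = {
    "multiple linear regression": (-120.0, 5),
    "nonlinear mixed effects": (-100.0, 8),
}
n_obs = 100
scores = {
    name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
```

Both criteria penalize extra parameters, BIC more heavily for all but very small samples, so a richer model is preferred only when its gain in log-likelihood outweighs the penalty.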
|