Global ETD Search

51	Item and person parameter estimation using hierarchical generalized linear models and polytomous item response theory models. Williams, Natasha Jayne. January 2003. Thesis (Ph.D.)--University of Texas at Austin, 2003. / Vita. Includes bibliographical references. Available also from UMI Company.
52	Modeling and forecast of Brazilian reservoir inflows via dynamic linear models under climate change scenarios. Lima, Luana Medeiros Marangon. 06 February 2012. The hydrothermal scheduling problem aims to determine an operation strategy that produces generation targets for each power plant at each stage of the planning horizon. This strategy aims to minimize the expected value of the operation cost over the planning horizon, composed of fuel costs to operate thermal plants plus penalties for failure in load supply. The system state at each stage is highly dependent on the water inflow at each hydropower generator reservoir. This work focuses on developing a probabilistic model for the inflows that is suitable for a multistage stochastic algorithm that solves the hydrothermal scheduling problem. The probabilistic model that governs the inflows is based on a dynamic linear model. Due to the cyclical behavior of the inflows, the model incorporates seasonal and regression components. We also incorporate climate variables such as precipitation, El Niño, and other ocean indexes as predictive variables when relevant. The model is tested for the power generation system in Brazil with about 140 hydro plants, which are responsible for more than 80% of the electricity generation in the country. At first, these plants are gathered by basin and classified into 15 groups. Each group has a different probabilistic model that describes its seasonality and specific characteristics. The inflow forecast derived with the probabilistic model at each stage of the planning horizon is a continuous distribution, instead of a single point forecast. We describe an algorithm to form a finite scenario tree by sampling from the inflow forecasting distribution with interstage dependency; that is, the inflow realization at a specific stage depends on the inflow realization of previous stages. / text / Reservoir inflow forecasting; Dynamic linear models; Climate predictors
53	Examining the invariance of item and person parameters estimated from multilevel measurement models when distribution of person abilities are non-normal. Moyer, Eric. 24 September 2013. Multilevel measurement models (MMM), an application of hierarchical generalized linear models (HGLM), model the relationship between ability level estimates and item difficulty parameters, based on examinee responses to items. A benefit of using MMM is the ability to include additional levels in the model to represent a nested data structure, which is common in educational contexts, by using the multilevel framework. Previous research has demonstrated the ability of the one-parameter MMM to accurately recover both item difficulty parameters and examinee ability levels, when using both 2- and 3-level models, under various sample size and test length conditions (Kamata, 1999; Brune, 2011). Parameter invariance of measurement models, meaning that parameter estimates are equivalent regardless of the distribution of the ability levels, is important when the typical assumption of a normal distribution of ability levels in the population may not be correct. An assumption of MMM is that the distribution of examinee abilities, which is represented by the level-2 residuals in the HGLM, is normal. If the distribution of abilities in the population is not normal, as suggested by Micceri (1989), this assumption of MMM is violated, which has been shown to affect the estimation of the level-2 residuals. The current study investigated the parameter invariance of the 2-level 1P-MMM by examining the accuracy of item difficulty parameter estimates and examinee ability level estimates. Study conditions included the standard normal distribution, as a baseline, and three non-normal distributions having various degrees of skew, in addition to various test lengths and sample sizes, to simulate various testing conditions.
The study's results provide evidence for overall parameter invariance of the 2-level 1P-MMM, when accounting for scale indeterminacy from the estimation process, for the study conditions included. Although the errors in the item difficulty parameter and examinee ability level estimates in the study were not of practical importance, there was some evidence that ability distributions may affect the accuracy of parameter estimates for items with difficulties greater than those represented in this study. Also, the accuracy of ability estimates for non-normal distributions seemed lower for conditions with greater test lengths and sample sizes, indicating possible increased difficulty in estimating abilities from non-normal distributions. / text / Multilevel measurement models; Hierarchical generalized linear models; Psychometrics
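The abstract above refers to a one-parameter model and to scale indeterminacy. As a minimal sketch, assuming the standard one-parameter logistic (Rasch-type) form in which the log-odds of a correct response equal ability minus item difficulty (the function name `p_correct` is hypothetical, not taken from the thesis), the indeterminacy can be seen by shifting every ability and difficulty by the same constant:

```python
import math


def p_correct(theta: float, b: float) -> float:
    """Probability of a correct response for ability theta and item difficulty b,
    assuming the one-parameter logistic form: logit(p) = theta - b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Scale indeterminacy: adding the same constant to ability and difficulty
# leaves the response probability unchanged, so only differences are identified.
shift = 0.3
original = p_correct(1.0, 0.5)
shifted = p_correct(1.0 + shift, 0.5 + shift)
print(abs(original - shifted) < 1e-12)
```

Because only the difference between ability and difficulty enters the model, estimated and true parameters have to be placed on a common scale before their accuracy is compared.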
54	Item and person parameter estimation using hierarchical generalized linear models and polytomous item response theory models. Williams, Natasha Jayne. 27 July 2011. Not available / text / Parameter estimation; Linear models (Statistics)
55	Analyzing the Behavior of Rats by Repeated Measurements. Hall, Kenita A. 03 May 2007. The use of longitudinal data, also known as repeated measures, has grown in recent years because of its ability to monitor change both within and between subjects. Statisticians in many fields of study have chosen this way of collecting data because it is cost effective and it minimizes the number of subjects required to produce a meaningful outcome. This thesis will explore the world of longitudinal studies to gain a thorough understanding of why this method of collecting data has grown so rapidly. This study will also describe several methods to analyze repeated measures using data collected on the behavior of both adolescent and adult rats. The question of interest is whether there is a change in the mean response over time and whether the covariates (age, bodyweight, gender, and time) influence those changes. After much testing, we found that our data set shows a positive nonlinear change in the mean response over time within the age and gender groups. Using a model that included random effects proved to be a better method than models that did not use any random effects. Taking the log of the response variable and using day as the random effect was overall a better fit for our dataset. The transformed model also showed all covariates except for age as being significant. / Longitudinal Data; Repeated Measurements; Mixed Models; Non-Linear Models; Mathematics
56	Optimal designs for linear mixed models. Debusho, Legesse Kassa. January 2004. The research of this thesis deals with the derivation of optimum designs for linear mixed models. The problem of constructing optimal designs for linear mixed models is very broad. Thus the thesis is mainly focused on the design theory for random coefficient regression models, which are a special case of the linear mixed model. Specifically, the major objective of the thesis is to construct optimal designs for the simple linear and the quadratic regression models with a random intercept algebraically. A second objective is to investigate the nature of optimal designs for the simple linear random coefficient regression model numerically. In all models time is considered as an explanatory variable and its values are assumed to belong to the set {0, 1, ..., k}. Two sets of individual designs, designs with non-repeated time points comprising up to k + 1 distinct time points and designs with repeated time points comprising up to k + 1 time points not necessarily distinct, are used in the thesis. In the first case there are $2^{k+1} - 1$ individual designs, while in the second case there are $\binom{2k+2}{k+1} - 1$ such designs. The problems of constructing population designs, which allocate weights to the individual designs in such a way that the information associated with the model parameters is in some sense maximized and the variances associated with the mean responses at a given vector of time points are in some sense minimized, are addressed. In particular D- and V-optimal designs are discussed. A geometric approach is introduced to confirm the global optimality of D- and V-optimal designs for the simple linear regression model with a random intercept. It is shown that for the simple linear regression model with a random intercept these optimal designs are robust to the choice of the variance ratio.
A comparison of these optimal designs over the sets of individual designs with repeated and non-repeated points for that model is also made and indicates that the D- and V-optimal population designs based on the individual designs with repeated points are more efficient than the corresponding optimal population designs with non-repeated points. Except for the one-point case, D- and V-optimal population designs change with the values of the variance ratio for the quadratic regression model with a random intercept. Further numerical results show that the D-optimal designs for the random coefficient models are dependent on the choice of variance components. / Thesis (Ph.D.) - University of KwaZulu-Natal, Pietermaritzburg, 2004. / Regression Analysis; Linear Models (Statistics); Random Variables; Theses--Mathematics
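The two design counts quoted in this abstract can be checked by brute force. The sketch below assumes that the time points are 0, 1, ..., k, that a design with non-repeated points is a non-empty subset of them, and that a design with repeated points is a non-empty multiset of at most k + 1 of them; under those assumptions the counts are 2^(k+1) - 1 and C(2k+2, k+1) - 1. The function name `count_designs` is hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import comb
from typing import Tuple


def count_designs(k: int) -> Tuple[int, int]:
    """Count individual designs on time points 0..k with 1 to k+1 observations.

    Returns (designs with distinct points, designs allowing repeated points).
    """
    points = range(k + 1)
    sizes = range(1, k + 2)
    non_repeated = sum(1 for m in sizes for _ in combinations(points, m))
    repeated = sum(
        1 for m in sizes for _ in combinations_with_replacement(points, m)
    )
    return non_repeated, repeated


# Compare the enumeration with the closed forms for a few small k.
for k in range(1, 6):
    non_repeated, repeated = count_designs(k)
    assert non_repeated == 2 ** (k + 1) - 1
    assert repeated == comb(2 * k + 2, k + 1) - 1
```

The enumeration grows quickly with k, so it is meant only as a check on the closed forms for small cases.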
57	Use of statistical modelling and analyses of malaria rapid diagnostic test outcome in Ethiopia. Ayele, Dawit Getnet. 12 December 2013. The transmission of malaria is among the leading public health problems in Ethiopia. More than 75% of the total area of Ethiopia is malarious. Identifying the infectiousness of malaria by socio-economic, demographic and geographic risk factors based on the malaria rapid diagnosis test (RDT) survey results has several advantages for planning, monitoring and controlling, and for eventual malaria eradication efforts. Such a study requires a thorough understanding of the disease process and associated factors. However, such studies are limited. Therefore, the aim of this study was to use different statistical tools suitable to identify socioeconomic, demographic and geographic risk factors of malaria based on the malaria rapid diagnosis test (RDT) survey results in Ethiopia. A total of 224 clusters of about 25 households were selected from the Amhara, Oromiya and Southern Nation Nationalities and People (SNNP) regions of Ethiopia. Accordingly, a number of binary response statistical analysis models were used. Multiple correspondence analysis was carried out to identify the association among socioeconomic, demographic and geographic factors. Moreover, a number of binary response models such as survey logistic, GLMM, GLMM with spatial correlation, joint models and semi-parametric models were applied. To test and investigate how well the observed malaria RDT result, use of mosquito nets and use of indoor residual spray data fit the expectations of the model, a Rasch model was used. The fitted models have their own strengths and weaknesses. Application of these models was carried out by analysing data on malaria RDT result.
The data used in this study come from the baseline malaria indicator survey conducted by The Carter Center from December 2006 to January 2007 in the Amhara, Oromiya and Southern Nation Nationalities and People (SNNP) regions of Ethiopia. Correspondence analysis and the survey logistic regression model were used to identify predictors which affect malaria RDT results. The effects of the identified socioeconomic, demographic and geographic factors were subsequently explored by fitting a generalized linear mixed model (GLMM), i.e., to assess the covariance structures of the random components (the association structure of the data). To examine whether the data displayed any spatial autocorrelation, i.e., whether surveys that are near in space have more similar malaria prevalence or incidence than surveys that are far apart, spatial statistics analysis was performed. This was done by introducing a spatial autocorrelation structure in the GLMM. Moreover, the customary two-variable joint modelling approach was extended to a three-variable joint model by exploring the joint effect of malaria RDT result, use of mosquito nets and indoor residual spray in the last twelve months. Assessing the association between these outcomes was also of interest. Furthermore, the relationships between the response and some confounding covariates may have unknown functional form. This led to proposing the use of semiparametric additive models, which are less restrictive in their specification. Therefore, generalized additive mixed models were used to model nonparametrically the effects of age, family size, number of rooms per person, number of nets per person, altitude and number of months the room was sprayed. The results of the study suggest that the correct use of mosquito nets, indoor residual spraying and other preventative measures, coupled with factors such as the number of rooms in a house, is associated with a decrease in the incidence of malaria as determined by the RDT.
However, the study also suggests that the poor are less likely to use these preventative measures to effectively counteract the spread of malaria. In order to determine whether or not the limited number of respondents had undue influence on the malaria RDT result, a Rasch model was used. The result shows that none of the responses had such influence. Therefore, application of the Rasch model has supported the viability of all sixteen (socio-economic, demographic and geographic) items for measuring malaria RDT result, use of indoor residual spray and use of mosquito nets. From the analysis it can be seen that the scale shows high reliability. Hence, the result from the Rasch model supports the analysis carried out in previous models. / Thesis (Ph.D.)-University of KwaZulu-Natal, Pietermaritzburg, 2013. / Mathematical statistics; Probabilities; Linear models (Statistics)
58	Statistical modelling of availability of major food cereals in Lesotho : application of regression models and diagnostics. Khoeli, Makhala Bernice. January 2012. Oftentimes, application of regression models to analyse cereals data is limited to estimating and predicting crop production or yield. The general approach has been to fit the model without much consideration of the problems that accompany application of regression models to real life data, such as collinearity, models not fitting the data correctly and violation of assumptions. These problems may interfere with applicability and usefulness of the models, and compromise validity of results if they are not corrected when fitting the model. We applied regression models and diagnostics on national and household data to model availability of main cereals in Lesotho, namely, maize, sorghum and wheat. The application includes the linear regression model, regression and collinearity diagnostics, Box-Cox transformation, ridge regression, quantile regression, logistic regression and its extensions with multiple nominal and ordinal responses. A linear model with a first-order autoregressive process, AR(1), was used to determine factors that affected availability of cereals at the national level. Case deletion diagnostics were used to identify extreme observations with influence on different quantities of the fitted regression model, such as estimated parameters, predicted values, and the covariance matrix of the estimates. Collinearity diagnostics detected the presence of more than one collinear relationship coexisting in the data set. They also determined the variables involved in each relationship, and assessed the potential negative impact of collinearity on estimated parameters. Ridge regression remedied collinearity problems by controlling inflation and instability of estimates. The Box-Cox transformation corrected non-constant variance and the longer and heavier tails of the distribution of the data.
These steps increased the applicability and usefulness of the linear models in modeling availability of cereals. Quantile regression, as a robust regression, was applied to the household data as an alternative to classical regression. Classical regression estimates from the ordinary least squares method are sensitive to distributions with longer and heavier tails than the normal distribution, as well as to outliers. Quantile regression estimates appear to be more efficient than least squares estimates for a wide range of error term distributions. We studied availability of cereals further by categorizing households according to availability of different cereals, and applied the logistic regression model and its extensions. Logistic regression was applied to model availability and non-availability of cereals. Multinomial logistic regression was applied to model availability with nominal multiple categories. Ordinal logistic regression was applied to model availability with ordinal categories and this made full use of available information. The three variants of the logistic regression model gave results that are in agreement, which are also in agreement with the results from the linear regression model and quantile regression model. / Thesis (Ph.D.)-University of KwaZulu-Natal, Durban, 2012. / Mathematical statistics; Probabilities; Linear models (Statistics)
59	A comparison of Bayesian variable selection approaches for linear models. Rahman, Husneara. 03 May 2014. Bayesian variable selection approaches are more powerful in discriminating among models regardless of whether the models under investigation are hierarchical or not. Although Bayesian approaches require complex computation, the use of Markov Chain Monte Carlo (MCMC) methods, such as the Gibbs sampler and the Metropolis-Hastings algorithm, makes computations easier. In this study we investigated the effectiveness of Bayesian variable selection approaches in comparison to non-Bayesian or classical approaches. For this purpose, we compared the performance of Bayesian versus non-Bayesian variable selection approaches for linear models. Among the Bayesian approaches, we studied the Conditional Predictive Ordinate (CPO) and the Bayes factor. Among the non-Bayesian or classical approaches, we implemented adjusted R-square, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) for model selection. We performed a simulation study to examine how Bayesian and non-Bayesian approaches perform in selecting variables. We also applied these methods to real data and compared their performances. We observed that for linear models, Bayesian variable selection approaches perform consistently with non-Bayesian approaches. / Bayesian inference -- Bayesian inference for normally distributed likelihood -- Model adequacy -- Simulation approach -- Application to wage data. / Department of Mathematical Sciences / Bayesian statistical decision theory; Linear models (Statistics); Regression analysis
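The classical criteria named in this abstract can be illustrated on a toy problem. This is a minimal sketch, assuming Gaussian errors (so that AIC and BIC reduce, up to an additive constant, to n ln(RSS/n) plus a penalty) and using made-up data; none of the numbers or function names come from the thesis. It compares an intercept-only model with a simple linear regression.

```python
import math
from statistics import mean

# Made-up data, roughly y = 2x + 1 with small deviations.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2, 16.8, 19.1]
n = len(y)


def rss_intercept_only(y):
    """Residual sum of squares when every observation is predicted by the mean."""
    m = mean(y)
    return sum((yi - m) ** 2 for yi in y)


def rss_simple_regression(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def aic(rss, n, p):
    """Gaussian AIC up to an additive constant; p = number of coefficients."""
    return n * math.log(rss / n) + 2 * p


def bic(rss, n, p):
    """Gaussian BIC up to an additive constant; p = number of coefficients."""
    return n * math.log(rss / n) + p * math.log(n)


def adjusted_r2(rss, tss, n, p):
    """Adjusted R-square; p counts all coefficients including the intercept."""
    return 1.0 - (rss / (n - p)) / (tss / (n - 1))


rss_mean = rss_intercept_only(y)
rss_line = rss_simple_regression(x, y)
print("AIC:", aic(rss_mean, n, 1), "vs", aic(rss_line, n, 2))
print("BIC:", bic(rss_mean, n, 1), "vs", bic(rss_line, n, 2))
print("adjusted R-square of the line:", adjusted_r2(rss_line, rss_mean, n, 2))
```

Lower AIC or BIC and higher adjusted R-square favour a model; BIC penalizes each extra coefficient by ln(n) rather than 2, so it tends to select smaller models as n grows.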
60	An investigation into the use of combined linear and neural network models for time series data / A.S. Kruger. Kruger, Albertus Stephanus. January 2009. Time series forecasting is an important area of forecasting in which past observations of the same variable are collected and analyzed to develop a model describing the underlying relationship. The model is then used to extrapolate the time series into the future. This modeling approach is particularly useful when little knowledge is available on the underlying data generating process or when there is no satisfactory explanatory model that relates the prediction variable to other explanatory variables. Time series can be modeled in a variety of ways, e.g. using exponential smoothing techniques, regression models, autoregressive (AR) techniques, moving averages (MA), etc. Recent research activities in forecasting have also suggested that artificial neural networks can be used as an alternative to traditional linear forecasting models. This study will, along the lines of an existing study in the literature, investigate the use of a hybrid approach to time series forecasting using both linear and neural network models. The proposed methodology consists of two basic steps. In the first step, a linear model is used to analyze the linear part of the problem, and in the second step a neural network model is developed to model the residuals from the linear model. The results from the neural network can then be used to predict the error terms for the linear model. This means that the combined forecast of the time series will depend on both models. Following an overview of the models, empirical tests on real world data will be performed to determine the forecasting performance of such a hybrid model. Results have indicated that, depending on the forecasting period, it might be worthwhile to consider the use of a hybrid model. / Thesis (M.Sc. (Computer Science))--North-West University, Vaal Triangle Campus, 2010.
Time series; Forecasting; Linear models; Neural networks; Hybrid models
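The two-step hybrid procedure described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the linear model is taken to be an AR(1) fitted by least squares, the neural network is a tiny one-hidden-layer tanh network trained by stochastic gradient descent on lagged residuals, and the series is synthetic. All function names and hyperparameters are hypothetical.

```python
import math
import random


def fit_ar1(series):
    """Step 1: least-squares fit of y[t] = a + b * y[t-1]."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b


def train_tiny_net(inputs, targets, hidden=4, epochs=200, lr=0.05, seed=0):
    """Step 2: one-input, one-hidden-layer tanh network trained by SGD."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - t
            for j in range(hidden):
                # Gradient w.r.t. the hidden pre-activation, using the old w2[j].
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2

    return predict


def hybrid_forecast(series):
    """One-step-ahead forecast: linear part plus predicted residual."""
    a, b = fit_ar1(series)
    # Residuals of the linear model for t = 1 .. n-1.
    residuals = [series[t] - (a + b * series[t - 1]) for t in range(1, len(series))]
    # The network learns to predict each residual from the previous one.
    net = train_tiny_net(residuals[:-1], residuals[1:])
    linear_part = a + b * series[-1]
    residual_part = net(residuals[-1])
    return linear_part, residual_part, linear_part + residual_part


# Synthetic series used only to exercise the code.
series = [math.sin(0.4 * t) + 0.5 * math.cos(0.9 * t) for t in range(60)]
linear_part, residual_part, combined = hybrid_forecast(series)
print(linear_part, residual_part, combined)
```

The combined forecast is the sum of the two parts, so the network only has to capture whatever structure the linear model leaves in its residuals.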
