Global ETD Search

241	Semiparametric Regression Methods with Covariate Measurement Error Johnson, Nels Gordon 06 December 2012 In public health, biomedical, epidemiological, and other applications, data are often measured with error. When mismeasured data are used in a regression analysis, not accounting for the measurement error can lead to incorrect inference about the relationships between the covariates and the response. We investigate measurement error in the covariates of two types of regression models. For each we propose a fully Bayesian approach that treats the variable measured with error as a latent variable to be integrated over, and a semi-Bayesian approach that uses a first-order Laplace approximation to marginalize the variable measured with error out of the likelihood. The first model is the matched case-control study for analyzing clustered binary outcomes. We develop low-rank thin plate splines for the case where a variable measured with error has an unknown, nonlinear relationship with the response. In addition to the semi- and fully Bayesian approaches, we propose another using expectation-maximization to detect both parametric and nonparametric relationships between the covariates and the binary outcome. We assess the performance of each method via simulation in terms of mean squared error and mean bias. We illustrate each method on a perturbed example of a 1-4 matched case-control study. The second regression model is the generalized linear model (GLM) with unknown link function. Usually, the link function is chosen by the user based on the distribution of the response variable, often to be the canonical link. However, when covariates are measured with error, incorrect inference as a result of the error can be compounded by incorrect choice of link function. We assess performance via simulation of the semi- and fully Bayesian methods in terms of mean squared error. We illustrate each method on the Framingham Heart Study dataset. 
The simulation results for both regression models support that the fully Bayesian approach is at least as good as the semi-Bayesian approach for adjusting for measurement error, particularly when the distribution of the variable measured with error and the distribution of the measurement error are misspecified. / Ph. D. Bayesian methods error-in-covariates generalized linear models matched case-control studies mixed models semiparametric reg
242	Long-term Benefits of Extracurricular Activities on Socioeconomic Outcomes and Their Trends in 1988-2012 Long, Thomas Carl 09 November 2015 Across the country, budget cuts to education have resulted in decreased funds available for extracurricular activities. This trend in policy may have a significant impact on future outcomes, as reflected in student success measures. Using two datasets collected over the last two decades, the present study assessed the relationship between participation in extracurricular activities and future socioeconomic outcomes in respondents' lives, including post-secondary education, full-time employment status, and income. Two existing large-scale longitudinal studies of U.S. secondary students, the National Education Longitudinal Study of 1988 (NELS: 88) and the Education Longitudinal Study of 2002 (ELS: 2002), served as data sources. As these surveys were conducted about a decade apart, the information they yielded was suitable for meeting the study aims. Generalized linear models, such as multiple regression and logistic regression, were fitted with sample weights to examine the impacts of extracurricular activity participation on the aforementioned outcome measures. The implications of the study findings, including the comparison of the results from two different datasets collected at different time points, were interpreted with respect to school budget policy. Results from the NELS: 88 and ELS: 2002 were also compared to evaluate the trends in the characteristics and performance of U.S. high school students during the 1988-2012 period. / Ph. D. Extracurricular Activities Socioeconomic Outcomes NELS: 88 ELS: 2002 Generalized Linear Models
243	Randomization analysis of experimental designs under non standard conditions Morris, David Dry January 1987 Often the basic assumptions of the ANOVA for an experimental design are not met or the statistical model is incorrectly specified. Randomization of treatments to experimental units is expected to protect against such shortcomings. This paper uses randomization theory to examine the impact on the expectations of mean squares, treatment means, and treatment differences for two model misspecifications: systematic response shifts and correlated experimental units. Systematic response shifts are presented in the context of the randomized complete block design (RCBD). In particular, fixed shifts are added to the responses of experimental units in the initial and final positions of each block. The fixed shifts are called border shifts. It is shown that the RCBD is an unbiased design under randomization theory when border shifts are present. Treatment means are biased but treatment differences are unbiased. However, the estimate of error is biased upwards and the power of the F test is reduced. Alternative designs to the RCBD under border shifts are the Latin square, semi-Latin square, and two-column designs. Randomization analysis demonstrates that the Latin square is an unbiased design with an unbiased estimate of error and of treatment differences. The semi-Latin square has each of the t treatments occurring only once per row and column, but t is a multiple of the number of rows or columns. Thus each row-column combination contains more than one experimental unit. The semi-Latin square is a biased design with a biased estimate of error even when no border shifts are present. Row-column interaction is responsible for the bias. Border shifts do not contaminate the expected mean squares or treatment differences, and thus the semi-Latin square is a viable alternative when the border shift overwhelms the row-column interaction. 
The two columns of the two-column design correspond to the border and interior experimental units respectively. Results similar to those for the semi-Latin square are obtained. Simulation studies for the RCBD and its alternatives indicate that the power of the F test is reduced for the RCBD when border shifts are present. When no row-column interaction is present, the semi-Latin square and two-column designs provide good alternatives to the RCBD. Similar results are found for the split plot design when border shifts occur in the sub plots. A main effects plan is presented for situations when the number of whole plot units equals the number of sub plot units per whole plot. The analysis of designs in which the experimental units occur in a sequence and exhibit correlation is considered next. The Williams Type II(a) design is examined in conjunction with the usual ANOVA and with the method of first differencing. Expected mean squares, treatment means, and treatment differences are obtained under randomization theory for each analysis. When only adjacent experimental units have non-negligible correlation, the Type II(a) design provides an unbiased error estimate for the usual ANOVA. However, the expectation of the treatment mean square is biased downwards for a positive correlation. First differencing results in a biased test and a biased error estimate. The test is approximately unbiased if the correlation between units is close to a half. / Ph. D. LD5655.V856 1987.M67 Analysis of variance Linear models (Statistics) Mathematical statistics
244	Hypothesis testing procedures for non-nested regression models Bauer, Laura L. January 1987 Theory often indicates that a given response variable should be a function of certain explanatory variables yet fails to provide meaningful information as to the specific form of this function. To test the validity of a given functional form with sensitivity toward the feasible alternatives, a procedure is needed for comparing non-nested families of hypotheses. Two hypothesized models are said to be non-nested when one model is neither a restricted case nor a limiting approximation of the other. These non-nested hypotheses cannot be tested using conventional likelihood ratio procedures. In recent years, however, several new approaches have been developed for testing non-nested regression models. A comprehensive review of the procedures for the case of two linear regression models was presented. Comparisons between these procedures were made on the basis of asymptotic distributional properties, simulated finite sample performance and computational ease. A modification to the Fisher and McAleer JA-test was proposed and its properties investigated. As a compromise between the JA-test and the Orthodox F-test, it was shown to have an exact non-null distribution. Its properties, both analytically and empirically derived, exhibited the practical worth of such an adjustment. A Monte Carlo study of the testing procedures involving non-nested linear regression models in small sample situations (n ≤ 40) provided information necessary for the formulation of practical guidelines. It was evident that the modified Cox procedure, N̄, was most powerful for providing correct inferences. In addition, there was strong evidence to support the use of the adjusted J-test (AJ) (Davidson and MacKinnon's test with small-sample modifications due to Godfrey and Pesaran), the modified JA-test (NJ) and the Orthodox F-test for supplemental information. 
Under non-normal disturbances, similar results were obtained. An empirical study of spending patterns for household food consumption provided a practical application of the non-nested procedures in a large sample setting. The study provided not only an example of non-nested testing situations but also the opportunity to draw sound inferences from the test results. / Ph. D. LD5655.V856 1987.B383 Linear models (Statistics) Statistical hypothesis testing Econometrics
245	Superscalar Processor Models Using Statistical Learning Joseph, P J 04 1900 Processor architectures are becoming increasingly complex and hence architects have to evaluate a large design space consisting of several parameters, each with a number of potential settings. In order to assist in guiding design decisions we develop simple and accurate models of the superscalar processor design space using a detailed and validated superscalar processor simulator. First, we obtain precise estimates of all significant micro-architectural parameters and their interactions by building linear regression models using simulation-based experiments. We obtain good approximate models at low simulation costs using an iterative process in which Akaike’s Information Criterion is used to extract a good linear model from a small set of simulations, and limited further simulation is guided by the model using D-optimal experimental designs. The iterative process is repeated until desired error bounds are achieved. We use this procedure for model construction and show that it provides a cost-effective scheme to experiment with all relevant parameters. We also obtain accurate predictors of the processor’s performance response across the entire design space by constructing radial basis function networks from sampled simulation experiments. We construct these models by simulating at limited design points selected by Latin hypercube sampling, and then deriving the radial basis function networks from the results. We show that these predictors provide accurate approximations to the simulator’s performance response, and hence provide a cheap alternative to simulation while searching for optimal processor design points. 
Supercomputers Supercomputers - Statistical Methods MATLAB Linear Regression Models Superscalar Processor Architecture Superscalar Processors - Linear Models Radial Basis Function Networks Linear Models RBF Networks Processor Performance Analysis Predictive Performance Model Predictive Modeling Computer Science
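The Latin hypercube sampling step mentioned in entry 245 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name `latin_hypercube`, the unit-cube scaling, and the choice of 8 points in 3 dimensions are all assumptions made for the example.

```python
import random


def latin_hypercube(n_points, n_dims, rng=None):
    """Return n_points samples in [0, 1)^n_dims with one sample per stratum
    in every dimension (hypothetical helper, for illustration only)."""
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_points equal-width strata.
        column = [(i + rng.random()) / n_points for i in range(n_points)]
        # Shuffle so that the strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the columns into one tuple per design point.
    return list(zip(*columns))


# Eight design points over three (hypothetical) processor parameters.
design = latin_hypercube(8, 3, random.Random(0))
```

Each coordinate would then be rescaled to the range of the corresponding design parameter before running the simulator at that point.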
246	Statistical Methods for Dating Collections of Historical Documents Tilahun, Gelila 31 August 2011 The problem in this thesis was originally motivated by problems presented by the Documents of Early England Data Set (DEEDS). The central problem with these medieval documents is the lack of methods to assign accurate dates to those documents which bear no date. With the problems of the DEEDS documents in mind, we present two methods to impute missing features of texts. In the first method, we suggest a new class of metrics for measuring distances between texts. We then show how to combine the distances between the texts using statistical smoothing. This method can be adapted to settings where the features of the texts are ordered or unordered categoricals (as in the case of, for example, authorship assignment problems). In the second method, we estimate the probability of occurrences of words in texts using nonparametric regression techniques of local polynomial fitting with kernel weights applied to generalized linear models. We combine the estimated probabilities of occurrence of the words of a text to estimate the probability of occurrence of the text as a function of its feature, in this case the date on which the text was written. The application and results of our methods to the DEEDS documents are presented. Kernel Dating Documents Shingle Correspondence distance Smoothing Generalized linear models Logistic regression Local polynomial regression 0581 0463 0800
247	Nonlinearity In Exchange Rates : Evidence From African Economies Jobe, Ndey Isatou January 2016 In an effort to assess the predictive ability of exchange rate models when data on African countries are sampled, this paper studies nonlinear modelling and prediction of the nominal exchange rate series of the United States dollar to currencies of thirty-eight African states using the smooth transition autoregressive (STAR) model. A three-step analysis is undertaken. First, it investigates nonlinearity in all nominal exchange rate series examined using a chain of credible statistical in-sample tests. Significantly, evidence of nonlinear exponential STAR (ESTAR) dynamics is detected across all series. Second, linear models are given another chance by switching to data on African countries to investigate their predictive power against the tough random walk without drift model. Linear models again fail significantly. Lastly, the predictive ability of nonlinear models against both the random walk without drift and the corresponding linear models is investigated. Nonlinear models display useful forecasting gains over all contending models. Nominal Exchange Rates Linear Models Random Walk Model Smooth Transition Autoregressive Model Linearity Tests Unit Root Tests Forecast Evaluation.
248	Zobecněné lineární modely v upisovacím riziku / Generalized Linear Models in Reserving Risk Zboňáková, Lenka January 2015 In the presented thesis we deal with the generalized linear models framework in a claims reserving problem. Claims reserving in non-life insurance is first described and the considered class of models is introduced. Subsequently, this branch of stochastic modelling is implemented in the reserving setup. For computation of the risk associated with claims reserving, we need a predictive distribution of future liabilities in order to evaluate risk measures such as Value at Risk and Conditional Value at Risk. Since datasets in non-life insurance commonly consist of a small number of observations and estimation of predictive distributions can be complicated, we adopt a bootstrap method for this purpose. Model fitting, simulations and consequent measuring of the reserving risk are performed using real-life data. Based on this, an analysis of fitted models and their comparison together with graphical outputs is included.
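Once a bootstrap has produced a sample from the predictive distribution of future liabilities, the two risk measures named in entry 248 can be read off empirically. The sketch below assumes a common empirical estimator (order-statistic quantile for Value at Risk, tail mean for Conditional Value at Risk); the function names and the liability figures are hypothetical and not taken from the thesis.

```python
import math


def value_at_risk(losses, level=0.95):
    """Empirical quantile: the smallest sampled loss x such that at least
    a fraction `level` of the sample is <= x."""
    ordered = sorted(losses)
    # Small tolerance guards against floating-point noise in level * n.
    k = max(math.ceil(level * len(ordered) - 1e-9) - 1, 0)
    return ordered[k]


def conditional_value_at_risk(losses, level=0.95):
    """Mean of the sampled losses at or beyond the Value at Risk threshold."""
    threshold = value_at_risk(losses, level)
    tail = [x for x in losses if x >= threshold]
    return sum(tail) / len(tail)


# Hypothetical bootstrap draws of total future liabilities (arbitrary units).
simulated_liabilities = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
var_90 = value_at_risk(simulated_liabilities, 0.9)
cvar_90 = conditional_value_at_risk(simulated_liabilities, 0.9)
```

With so few draws the estimates are coarse; in practice the bootstrap sample would contain thousands of simulated liabilities.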
249	Modelování četností pojistných událostí / Claims count modeling in insurance Škoda, Štěpán January 2013 The present work investigates techniques of insurance ratemaking according to the claims counts of policyholders on the basis of information contained in policies. At the beginning, we provide a closer examination of the theory of generalized linear models, which have a wide range of applications in the field of actuarial modeling. The second chapter presents the basic Poisson regression model as well as some particular verification methods. Specifically, the deviance and Wald tests are covered here, along with important results for residuals. The third chapter contains information on alternative approaches to modeling claim frequencies, and at the end the GEE method, which can be applied in the case of panel data, is described. The numerical study based on real insurance data in the last part of this diploma thesis illustrates the previously described techniques, with results obtained using the statistical software SAS.
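For the Poisson regression model in entry 249, the deviance compares observed claim counts y_i with fitted means mu_i as D = 2 * sum( y_i * log(y_i / mu_i) - (y_i - mu_i) ), with the convention that y_i * log(y_i / mu_i) is 0 when y_i = 0. A minimal sketch follows; the function name and the example counts are assumptions for illustration, and the thesis itself reports using SAS.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance between observed counts and fitted means.

    Every fitted mean must be strictly positive.
    """
    total = 0.0
    for y, mu in zip(observed, fitted):
        # The y * log(y / mu) term vanishes for zero counts.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


# Hypothetical claim counts and fitted means for four policies.
claims = [0, 1, 2, 0]
fitted_means = [0.5, 1.0, 1.5, 0.25]
deviance = poisson_deviance(claims, fitted_means)
```

A perfect fit (fitted means equal to the observed counts) gives a deviance of zero, and larger values indicate a worse fit.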
250	Gaining Insight with Recursive Partitioning of Generalized Linear Models Rusch, Thomas, Zeileis, Achim January 2013 Recursive partitioning algorithms separate a feature space into a set of disjoint rectangles. Then, usually, a constant in every partition is fitted. While this is a simple and intuitive approach, it may still lack interpretability as to how a specific relationship between dependent and independent variables may look. Or it may be that a certain model is assumed or of interest and there are a number of candidate variables that may non-linearly give rise to different model parameter values. We present an approach that combines generalized linear models with recursive partitioning that offers enhanced interpretability of classical trees as well as providing an explorative way to assess a candidate variable's influence on a parametric model. This method conducts recursive partitioning of a generalized linear model by (1) fitting the model to the data set, (2) testing for parameter instability over a set of partitioning variables, (3) splitting the data set with respect to the variable associated with the highest instability. The outcome is a tree where each terminal node is associated with a generalized linear model. We will show the method's versatility and suitability to gain additional insight into the relationship of dependent and independent variables by two examples, modelling voting behaviour and a failure model for debt amortization, and compare it to alternative approaches. AMS 62J99; 62P25, 62H30
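One pass of the three steps in entry 250 can be sketched as below, under heavy simplifying assumptions that are not the authors' method: the model is a straight-line least-squares fit (a Gaussian GLM with identity link), each partitioning variable is cut at its median, and the formal parameter instability test is replaced by a simple stand-in, the drop in residual sum of squares when each side of the cut gets its own fit. All names and data are hypothetical.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope


def rss(xs, ys):
    """Residual sum of squares of the fitted line."""
    a, b = fit_line(xs, ys)
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def best_split(xs, ys, partition_vars):
    """Return (variable name, cut point, gain) for the most useful split."""
    base = rss(xs, ys)  # step 1: one model for the whole node
    best = (None, None, 0.0)
    for name, zs in partition_vars.items():
        cut = median(zs)
        left = [i for i, z in enumerate(zs) if z <= cut]
        right = [i for i, z in enumerate(zs) if z > cut]
        if len(left) < 2 or len(right) < 2:
            continue
        # Step 2 (stand-in for an instability test): improvement in fit
        # when the two sides are allowed different parameters.
        split_rss = sum(
            rss([xs[i] for i in part], [ys[i] for i in part])
            for part in (left, right)
        )
        gain = base - split_rss
        if gain > best[2]:
            best = (name, cut, gain)
    return best  # step 3: split on the variable with the largest gain


# Toy data: the slope of y on x changes with z1 but not with z2.
xs = [0, 1, 2, 3, 0, 1, 2, 3]
ys = [0, 1, 2, 3, 10, 8, 6, 4]
z1 = [0, 0, 0, 0, 1, 1, 1, 1]
z2 = [0, 1, 0, 1, 0, 1, 0, 1]
name, cut, gain = best_split(xs, ys, {"z2": z2, "z1": z1})
```

Applying `best_split` again within each resulting subset would grow the tree, with a fitted model in every terminal node.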
