  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
121

Ideology and interests : a hierarchical Bayesian approach to spatial party preferences

Mohanty, Peter Cushner 04 December 2013 (has links)
This paper presents a spatial utility model of support for multiple political parties. The model includes a "valence" term, which I reparameterize to include both party competence and the voters' key sociodemographic concerns. The paper shows how this spatial utility model can be interpreted as a hierarchical model using data from the 2009 European Elections Study. I estimate this model via Bayesian Markov Chain Monte Carlo (MCMC) using a block Gibbs sampler and show that the model can capture broad European-wide trends while allowing for significant amounts of heterogeneity. This approach, however, which assumes a normal dependent variable, is only able to partially reproduce the data generating process. I show that the data generating process can be reproduced more accurately with an ordered probit model. Finally, I discuss trade-offs between parsimony and descriptive richness and other practical challenges that may be encountered when building models of party support and make recommendations for capturing the best of both approaches.
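As a rough, self-contained illustration of block Gibbs sampling for a hierarchical model (not the author's actual spatial utility model), the sketch below updates all group-level intercepts of a simple normal random-intercept model in one block per sweep; the data, priors and group structure are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- simulate hypothetical grouped data: J groups, n observations each ---
J, n = 20, 50
true_alpha = rng.normal(0.0, 1.0, J)
y = true_alpha[:, None] + rng.normal(0.0, 0.5, (J, n))

# --- block Gibbs for y_ij ~ N(alpha_j, sigma2), alpha_j ~ N(mu, tau2) ---
mu, sigma2, tau2 = 0.0, 1.0, 1.0
draws = []
for it in range(2000):
    # block update of all group means alpha_1..alpha_J (conditionally independent)
    prec = n / sigma2 + 1.0 / tau2
    mean = (y.sum(axis=1) / sigma2 + mu / tau2) / prec
    alpha = rng.normal(mean, np.sqrt(1.0 / prec))

    # update the global mean mu (flat prior)
    mu = rng.normal(alpha.mean(), np.sqrt(tau2 / J))

    # update the two variances from inverse-gamma full conditionals (vague IG(0.01, 0.01) priors)
    resid = y - alpha[:, None]
    sigma2 = 1.0 / rng.gamma(0.01 + y.size / 2, 1.0 / (0.01 + 0.5 * (resid ** 2).sum()))
    tau2 = 1.0 / rng.gamma(0.01 + J / 2, 1.0 / (0.01 + 0.5 * ((alpha - mu) ** 2).sum()))

    if it >= 500:                      # discard burn-in
        draws.append((mu, sigma2, tau2))

print(np.mean(draws, axis=0))          # posterior means of (mu, sigma2, tau2)
```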
122

Data Augmentation and Dynamic Linear Models

Frühwirth-Schnatter, Sylvia January 1992 (has links) (PDF)
We define a subclass of dynamic linear models with unknown hyperparameters called d-inverse-gamma models. We then approximate the marginal p.d.f.s of the hyperparameter and the state vector by the data augmentation algorithm of Tanner/Wong. We prove that the regularity conditions for convergence hold. A sampling based scheme for practical implementation is discussed. Finally, we illustrate how to obtain an iterative importance sampling estimate of the model likelihood. (author's abstract) / Series: Forschungsberichte / Institut für Statistik
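As a loose illustration of the data augmentation idea (alternating draws of the latent state vector and of inverse-gamma variance hyperparameters), the sketch below applies single-move Gibbs updates to a local-level dynamic linear model. It is not the d-inverse-gamma scheme of the paper, and the data, priors and initial state assumption are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- hypothetical data from a local-level model, x_0 = 0 ---
T = 150
state = np.cumsum(rng.normal(0, 0.3, T))          # random-walk state
y = state + rng.normal(0, 1.0, T)                 # noisy observations

# --- data augmentation: alternate between states and inverse-gamma variances ---
x = y.copy()                                      # initialise the latent states
s_obs, s_state = 1.0, 1.0                         # observation / state variances
keep = []
for it in range(2000):
    # single-move update of each state given its neighbours (x_0 = 0 assumed known)
    for t in range(T):
        left = x[t - 1] if t > 0 else 0.0
        if t < T - 1:
            prec = 1.0 / s_obs + 2.0 / s_state
            mean = (y[t] / s_obs + (left + x[t + 1]) / s_state) / prec
        else:
            prec = 1.0 / s_obs + 1.0 / s_state
            mean = (y[t] / s_obs + left / s_state) / prec
        x[t] = rng.normal(mean, np.sqrt(1.0 / prec))

    # inverse-gamma draws for the two variances (vague IG(0.01, 0.01) priors)
    obs_ss = ((y - x) ** 2).sum()
    state_ss = ((x - np.concatenate(([0.0], x[:-1]))) ** 2).sum()
    s_obs = 1.0 / rng.gamma(0.01 + T / 2, 1.0 / (0.01 + 0.5 * obs_ss))
    s_state = 1.0 / rng.gamma(0.01 + T / 2, 1.0 / (0.01 + 0.5 * state_ss))

    if it >= 500:
        keep.append((s_obs, s_state))

print(np.mean(keep, axis=0))                      # posterior means of the two variances
```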
123

  • Linear models of time series and autocorrelation

Γαζή, Σταυρούλα 07 July 2015 (has links)
The purpose of this master's thesis is twofold: to study the simple and generalized multiple regression model when one of the Gauss-Markov conditions is violated, specifically when Cov{ε_i, ε_j} ≠ 0 for i ≠ j, and to analyse time series. First, a brief overview is given of the simple and multiple linear regression models, their properties and the estimation of the regression coefficients. The properties of the error terms (mean, variance, correlation coefficients, etc.) are described when their covariance assumption is violated. Finally, the Durbin-Watson test for autocorrelation of the error terms is described, together with a variety of corrective measures aimed at eliminating it. The second part first introduces basic concepts of time series theory. Various stationary time series are then analysed: starting from white noise, moving average (MA), autoregressive (AR) and ARMA processes are presented, followed by the general case of non-stationary series, the ARIMA processes, and the first stages of the analysis of a time series are briefly outlined for each of these cases. This work was based on two important books by distinguished scholars: Introduction to Econometrics by George K. Christou, and Applied Linear Regression Models by John Neter, Michael H. Kutner, Christopher J. Nachtsheim and William Wasserman.
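A minimal sketch of the two themes above, assuming simulated data rather than the series studied in the thesis: ordinary least squares residuals from a regression with autocorrelated errors are checked with the Durbin-Watson statistic, and an AR(1) model is fitted to the residual series with statsmodels.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)

# --- regression with AR(1) errors, violating Cov(e_i, e_j) = 0 for i != j ---
n = 300
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):                 # AR(1) disturbances with phi = 0.7
    e[t] = 0.7 * e[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 2.0 * x + e

ols = sm.OLS(y, sm.add_constant(x)).fit()
print("Durbin-Watson:", durbin_watson(ols.resid))   # well below 2 => positive autocorrelation

# --- fit an ARIMA(1, 0, 0) model to the residual series ---
arima = ARIMA(ols.resid, order=(1, 0, 0)).fit()
print(arima.params)                                  # estimated AR coefficient near 0.7
```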
124

MULTIVARIATE MEASURE OF AGREEMENT

Towstopiat, Olga Michael January 1981 (has links)
Reliability issues are always salient as behavioral researchers observe human behavior and classify individuals from criterion-referenced test scores. This has created a need for studies to assess agreement between observers recording the occurrence of various behaviors, in order to establish the reliability of their classifications. In addition, there is a need for measuring the consistency of dichotomous and polytomous classifications established from criterion-referenced test scores. The development of several log linear univariate models for measuring agreement has partially met the demand for a probability-based measure of agreement with a directly interpretable meaning. However, multivariate repeated measures agreement procedures are necessary because of the development of complex intrasubject and intersubject research designs. The present investigation developed applications of the log linear, latent class, and weighted least squares procedures for the analysis of multivariate repeated measures designs. These computations tested the model-data fit and calculated the multivariate measure of the magnitude of agreement under the quasi-equiprobability and quasi-independence models. Applications of these computations were illustrated with real and hypothetical observational data. It was demonstrated that employing log linear, latent class, and weighted least squares computations resulted in identical multivariate model-data fits with equivalent chi-square values. Moreover, the application of these three methodologies also produced identical measures of the degree of agreement at each point in time and for the multivariate average. The multivariate methods that were developed also included procedures for measuring the probability of agreement for a single response classification or subset of classifications from a larger set. In addition, procedures were developed to analyze occurrences of systematic observed disagreement within the multivariate tables. The consistency of dichotomous and polytomous classifications over repeated assessments of the identical examinees was also suggested as a means of conceptualizing criterion-referenced reliability. By applying the univariate and multivariate models described, the reliability of these classifications across repeated testings could be calculated. The procedures utilizing the log linear, latent structure, and weighted least squares concepts for the purpose of measuring agreement have the advantages of (1) yielding a coefficient of agreement that varies between zero and one and measures agreement in terms of the probability that the observers' judgements will agree, as estimated under a quasi-equiprobability or quasi-independence model, (2) correcting for the proportion of "chance" agreement, and (3) providing a directly interpretable coefficient of "no agreement." Thus, these multivariate procedures may be regarded as a more refined psychometric technology for measuring inter-observer agreement and criterion-referenced test reliability.
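As a toy illustration of one of the approaches mentioned, a log-linear quasi-independence model fitted as a Poisson GLM, the sketch below uses a hypothetical 3x3 agreement table for two observers; it is not the multivariate repeated measures analysis developed in the dissertation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# hypothetical 3-category agreement table for two observers (rows = observer 1)
table = np.array([[40, 5, 3],
                  [4, 35, 6],
                  [2, 7, 28]])
r, c = np.indices(table.shape)
df = pd.DataFrame({"count": table.ravel(),
                   "obs1": r.ravel().astype(str),
                   "obs2": c.ravel().astype(str)})
# quasi-independence: independence off the diagonal, one extra parameter per diagonal cell
df["diag"] = np.where(df.obs1 == df.obs2, "cat" + df.obs1, "off")

fit = smf.glm("count ~ C(obs1) + C(obs2) + C(diag, Treatment('off'))",
              data=df, family=sm.families.Poisson()).fit()
print(fit.summary().tables[1])   # positive diagonal terms indicate agreement beyond chance
print("deviance:", fit.deviance)  # model-data fit
```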
125

Flexible statistical modeling of deaths by diarrhoea in South Africa.

Mbona, Sizwe Vincent. 17 December 2013 (has links)
The purpose of this study is to investigate and understand data which are grouped into categories. Various statistical methods were studied for categorical binary responses to investigate the causes of death from diarrhoea in South Africa. Data collected included death type, sex, marital status, province of birth, province of death, place of death, province of residence, education status, smoking status and pregnancy status. The objective of this thesis is to investigate which of the above explanatory variables are most strongly associated with deaths from diarrhoea in South Africa. To achieve this objective, different sample survey data analysis techniques are investigated. These include sketching bar graphs and using several statistical methods, namely logistic regression, survey logistic regression, the generalised linear model, the generalised linear mixed model, and the generalised additive model. In the selection of the fixed effects, bar graphs of the response variable and individual profile graphs are examined. A logistic regression model is used to identify which of the explanatory variables are most strongly associated with deaths from diarrhoea. Statistical analyses are conducted in SAS (Statistical Analysis Software). Hosmer and Lemeshow (2000) propose a statistic that they show, through simulation, is distributed as chi-square when there is no replication in any of the subpopulations. By analogy with the Hosmer and Lemeshow test for logistic regression, Parzen and Lipsitz (1999) suggest using 10 risk score groups. Nevertheless, based on simulation results, May and Hosmer (2004) show that, for all samples or samples with a large percentage of censored observations, the test rejects the null hypothesis too often. They suggest that the number of groups be chosen such that G=integer of {maximum of 12 and minimum of 10}. Lemeshow et al. (2004) state that the observations are first sorted in increasing order of their estimated event probability. / Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2013.
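A small sketch of the Hosmer-Lemeshow idea referred to above, with G = 10 risk-score groups. The logistic fit and covariates are simulated for illustration and are not the South African mortality data analysed in the thesis.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(3)

# --- simulated binary outcome with two covariates ---
n = 2000
X = rng.normal(size=(n, 2))
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 0.8 * X[:, 0] - 0.4 * X[:, 1])))
y = rng.binomial(1, p_true)

fit = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
p_hat = fit.predict(sm.add_constant(X))

# --- Hosmer-Lemeshow statistic with G = 10 risk-score groups ---
G = 10
order = np.argsort(p_hat)
groups = np.array_split(order, G)            # deciles of estimated risk
hl = 0.0
for g in groups:
    obs = y[g].sum()                         # observed events in the group
    exp = p_hat[g].sum()                     # expected events in the group
    n_g = len(g)
    hl += (obs - exp) ** 2 / (exp * (1 - exp / n_g))
pval = chi2.sf(hl, df=G - 2)                 # chi-square with G - 2 degrees of freedom
print(f"HL statistic = {hl:.2f}, p-value = {pval:.3f}")
```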
126

Imputing Genotypes Using Regularized Generalized Linear Regression Models

Griesman, Joshua 14 June 2012 (has links)
As genomic sequencing technologies continue to advance, researchers are furthering their understanding of the relationships between genetic variants and expressed traits (Hirschhorn and Daly, 2005). However, missing data can significantly limit the power of a genetic study. Here, the use of a regularized generalized linear model, denoted GLMNET, is proposed to impute missing genotypes. The method aims to address certain limitations of earlier regression approaches to genotype imputation, particularly multicollinearity among predictors. The performance of the GLMNET-based method is compared to that of the phase-based method fastPHASE. Two simulation settings were evaluated: a sparse-missing model and a small-panel expansion model. The sparse-missing model simulated a scenario where SNPs were missing in a random fashion across the genome. In the small-panel expansion model, a set of test individuals was genotyped at only a small subset of the SNPs of the large panel. Each imputation method was tested in the context of two data sets: Canadian Holstein cattle data and human HapMap CEU data. Although the proposed method was able to perform with high accuracy (>90% in all simulations), fastPHASE performed with higher accuracy (>94%). However, the new method, which was coded in R, was able to impute genotypes with better time efficiency than fastPHASE, and this could be further improved by optimizing in a compiled language.
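GLMNET is an R package; as a rough Python analogue of the approach described, the sketch below imputes a masked SNP (coded 0/1/2) with an elastic-net-penalised multinomial logistic regression from scikit-learn. The simulated genotypes and the choices of l1_ratio and C are illustrative assumptions, not the thesis's settings or data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)

# --- simulate correlated SNP genotypes coded 0/1/2 for 500 individuals, 30 SNPs ---
n, m = 500, 30
latent = rng.normal(size=(n, 1)) + 0.8 * rng.normal(size=(n, m))   # shared structure gives LD-like correlation
geno = np.digitize(latent, np.quantile(latent, [0.33, 0.66])).astype(int)

target = 10                                  # index of the SNP treated as missing
mask = rng.random(n) < 0.2                   # 20% of individuals missing this SNP
X, y = np.delete(geno, target, axis=1), geno[:, target]

# elastic-net multinomial logistic regression, in the spirit of GLMNET
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=1.0, max_iter=5000)
clf.fit(X[~mask], y[~mask])                  # train on individuals with the genotype observed
imputed = clf.predict(X[mask])
print("imputation accuracy:", (imputed == y[mask]).mean())
```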
127

Analysis of a binary response : an application to entrepreneurship success in South Sudan.

Lugga, James Lemi John Stephen. January 2012 (has links)
Just over half (50.6%) of the population of South Sudan lives on less than one US Dollar a day. Three quarters of the population live below the poverty line (NBS, Poverty Report, 2010). Generally, effective government policy to reduce unemployment and eradicate poverty focuses on stimulating new businesses. Micro and small enterprises (MSEs) are the major source of employment and income for many in under-developed countries. The objective of this study is to identify factors that determine business success and failure in South Sudan. To achieve this objective, generalized linear models, survey logistic models, generalized linear mixed models and multiple correspondence analysis are used. The data used in this study come from the business survey conducted in 2010. The response variable, defined as business success or failure, was measured by profit or loss in businesses. Fourteen explanatory variables were identified as factors contributing to business success or failure. A main-effects model consisting of the fourteen explanatory variables and three interaction effects was fitted to the data. In order to account for the complexity of the survey design, survey logistic and generalized linear mixed models were refitted to the same variables as in the main-effects model. To confirm the results from the models, multiple correspondence analysis was used. / Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2012.
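The sketch below is a loose stand-in for one piece of such an analysis: a binomial GLM for a binary success indicator, with survey weights treated crudely as frequency weights rather than through a full design-based survey-logistic fit. The covariates, weights and data are hypothetical and do not come from the 2010 South Sudan business survey.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)

# --- hypothetical business-survey extract ---
n = 1000
df = pd.DataFrame({
    "educ": rng.integers(0, 3, n),              # 0 none, 1 primary, 2 secondary or higher
    "urban": rng.integers(0, 2, n),
    "weight": rng.integers(1, 6, n),            # hypothetical design weights
})
logit_p = -0.8 + 0.5 * df.educ + 0.4 * df.urban
df["success"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# binomial GLM with the weights treated as frequency weights
fit = smf.glm("success ~ C(educ) + urban", data=df,
              family=sm.families.Binomial(),
              freq_weights=df["weight"].to_numpy()).fit()
print(fit.summary().tables[1])
```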
128

Analysis of longitudinal binary data : an application to a disease process.

Ramroop, Shaun. January 2008 (has links)
The analysis of longitudinal binary data can be undertaken using any of three families of models, namely marginal, random effects and conditional models. Each family of models has its own respective merits and demerits. The models are applied in the analysis of binary longitudinal data for a childhood disease, namely the Respiratory Syncytial Virus (RSV) data collected from a study in Kilifi, coastal Kenya. The marginal model was fitted using generalized estimating equations (GEE). The random effects models were fitted using 'Proc GLIMMIX' and 'NLMIXED' in SAS and then again in Genstat. Because the data are of a state-transition type with the Markovian property, the conditional model was used to capture the dependence of the current response on the previous response, which is known as the history. The data set has two main complicating issues. Firstly, there is the question of developing a stochastically based probability model for the disease process. In the current work we use direct likelihood and generalized linear modelling (GLM) approaches to estimate important disease parameters. The force of infection and the recovery rate are the key parameters of interest. The findings of the current work are consistent and in agreement with those in White et al. (2003). The time dependence of the RSV disease is also highlighted in the thesis by fitting monthly piecewise models for both parameters. Secondly, there is the issue of incomplete data in the analysis of longitudinal data. Commonly used methods to analyze incomplete longitudinal data include the well-known available case analysis (AC) and last observation carried forward (LOCF). However, these methods rely on strong assumptions, such as missing completely at random (MCAR) for AC analysis and an unchanging profile after dropout for LOCF analysis. Such assumptions are too strong to hold in general. In recent years, methods of analyzing incomplete longitudinal data under weaker assumptions, such as missing at random (MAR), have become available. Thus we make use of multiple imputation via chained equations, which requires the MAR assumption, and maximum likelihood methods, under which the missing data mechanism becomes ignorable as soon as it is MAR. We are therefore faced with the problem of incomplete repeated non-normal data, suggesting the use of at least the generalized linear mixed model (GLMM) to account for natural individual heterogeneity. The comparison of the parameter estimates obtained using the different methods of handling dropout is strongly emphasized in order to evaluate the advantages of the different methods and approaches. A survival analysis approach was also utilized to model the data due to the presence of multiple events per subject and the time between these events. / Thesis (Ph.D.)-University of KwaZulu-Natal, Pietermaritzburg, 2008.
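As an illustration of the marginal-model component only, the sketch below fits a GEE with an exchangeable working correlation to simulated repeated binary responses; the data are invented and are not the Kilifi RSV data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)

# --- simulate repeated binary infection indicators for 200 children, 6 visits each ---
n_id, n_visit = 200, 6
child = np.repeat(np.arange(n_id), n_visit)
visit = np.tile(np.arange(n_visit), n_id)
re = np.repeat(rng.normal(0, 0.8, n_id), n_visit)       # child-level heterogeneity
age = visit + rng.uniform(0, 1, n_id * n_visit)
p = 1 / (1 + np.exp(-(-1.0 - 0.2 * age + re)))
df = pd.DataFrame({"child": child, "age": age,
                   "infected": rng.binomial(1, p)})

# marginal model: GEE with an exchangeable working correlation
gee = smf.gee("infected ~ age", groups="child", data=df,
              family=sm.families.Binomial(),
              cov_struct=sm.cov_struct.Exchangeable()).fit()
print(gee.summary())
```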
129

MCMC Estimation of Classical and Dynamic Switching and Mixture Models

Frühwirth-Schnatter, Sylvia January 1998 (has links) (PDF)
In the present paper we discuss Bayesian estimation of a very general model class where the distribution of the observations is assumed to depend on a latent mixture or switching variable taking values in a discrete state space. This model class covers e.g. finite mixture modelling, Markov switching autoregressive modelling and dynamic linear models with switching. Joint Bayesian estimation of all latent variables, model parameters and parameters determining the probability law of the switching variable is carried out by a new Markov Chain Monte Carlo method called permutation sampling. Estimation of switching and mixture models is known to be faced with identifiability problems, as switching and mixture models are identifiable only up to permutations of the indices of the states. For a Bayesian analysis the posterior has to be constrained in such a way that the identifiability constraints are fulfilled. The permutation sampler is designed to sample efficiently from the constrained posterior, by first sampling from the unconstrained posterior - which often can be done in a convenient multimove manner - and then applying a suitable permutation if the identifiability constraint is violated. We present simple conditions on the prior which ensure that this method is a valid Markov Chain Monte Carlo method (that is, invariance, irreducibility and aperiodicity hold). Three case studies are presented, including finite mixture modelling of fetal lamb data, Markov switching autoregressive modelling of the U.S. quarterly real GDP data, and modelling the U.S./U.K. real exchange rate by a dynamic linear model with Markov switching heteroscedasticity. (author's abstract) / Series: Forschungsberichte / Institut für Statistik
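A toy version of the identifiability problem and the permutation step: a Gibbs sampler for a two-component normal mixture with unit variances, where each sweep ends by permuting the labels whenever the constraint mu_1 < mu_2 is violated. This is only a sketch of the idea, not the paper's permutation sampler, and all data and priors are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

# --- simulated data from a two-component normal mixture (sd = 1 in both components) ---
n = 400
truth = rng.random(n) < 0.35
y = np.where(truth, rng.normal(-2.0, 1.0, n), rng.normal(1.5, 1.0, n))

mu = np.array([-1.0, 1.0])       # component means
eta = 0.5                        # weight of component 0
draws = []
for it in range(4000):
    # 1. sample allocations given the current parameters
    w0 = eta * np.exp(-0.5 * (y - mu[0]) ** 2)
    w1 = (1 - eta) * np.exp(-0.5 * (y - mu[1]) ** 2)
    z = (rng.random(n) * (w0 + w1) > w0).astype(int)   # 0 or 1

    # 2. sample the means (N(0, 10^2) priors) and the weight (Beta(1, 1) prior)
    for k in (0, 1):
        nk = (z == k).sum()
        prec = nk + 1.0 / 100.0
        mu[k] = rng.normal(y[z == k].sum() / prec, np.sqrt(1.0 / prec))
    n0 = (z == 0).sum()
    eta = rng.beta(1 + n0, 1 + n - n0)

    # 3. permutation step: enforce the identifiability constraint mu[0] < mu[1]
    if mu[0] > mu[1]:
        mu = mu[::-1].copy()
        eta = 1 - eta

    if it >= 1000:
        draws.append((mu[0], mu[1], eta))

print(np.mean(draws, axis=0))    # roughly (-2.0, 1.5, 0.35)
```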
130

The effects of environment on catch and effort for the commercial fishery of Lake Winnipeg, Canada

Speers, Jeffery Duncan 12 July 2007 (has links)
Environmental factors affect fish distribution and fisher behavior. These factors are seldom included in stock assessment models, resulting in potentially inaccurate fish abundance estimates. This study assessed the impact of these factors on the commercial catch rates of sauger (Sander canadensis) and walleye (Sander vitreus) in Lake Winnipeg by: (1) using satellite data to monitor turbidity and assess its impact on catch via simple linear regression, and (2) modelling the effect of environment on catch and effort using generalized linear models. No statistically significant relationship was found between catch and turbidity, a result which may be due to small sample sizes, the fish species examined, and variable turbidity at depth. Decreased effort was correlated with harsh weather and decreased walleye catch. Increased walleye catch was correlated with low temperature and low Red River discharge. Increased sauger catch was correlated with high temperature, high cloud opacity, and average Red River discharge.
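A minimal sketch of the kind of generalized linear model described, assuming hypothetical catch and environmental records rather than the Lake Winnipeg data: a Poisson GLM with a log link and log(effort) as an offset, so that the coefficients act on catch per unit effort.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)

# --- hypothetical daily records: catch, effort (net-hours), environment ---
n = 500
df = pd.DataFrame({
    "effort": rng.uniform(10, 100, n),
    "temp": rng.uniform(5, 25, n),
    "discharge": rng.uniform(0.5, 2.0, n),      # river discharge index
})
rate = np.exp(0.5 - 0.03 * df.temp - 0.4 * df.discharge)    # catch per unit effort
df["catch"] = rng.poisson((rate * df.effort).to_numpy())

# Poisson GLM with log link and log(effort) as an offset
fit = smf.glm("catch ~ temp + discharge", data=df,
              family=sm.families.Poisson(),
              offset=np.log(df["effort"])).fit()
print(fit.summary().tables[1])
```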
