Global ETD Search

11	Test of Treatment Effect with Zero-Inflated Over-Dispersed Count Data from Randomized Single Factor Experiments Fan, Huihao 12 September 2014 (has links) No description available. Biostatistics over-dispersion count data Poisson distribution zero-inflation
12	Equações de estimação generalizadas com resposta binomial negativa: modelando dados correlacionados de contagem com sobredispersão / Generalized estimating equations with negative binomial responses: modeling correlated count data with overdispersion Oesselmann, Clarissa Cardoso 12 December 2016 (has links) A common assumption in the analysis of regression models is that the responses are independent. However, when working with longitudinal or grouped data this assumption may not make sense. 
To solve this problem there are several methods, but perhaps the best known, in the non-Gaussian context, is the one based on Generalized Estimating Equations (GEE), which has similarities with Generalized Linear Models (GLM). These similarities involve the classification of the model around the exponential family and the specification of a variance function. The only difference is that a working correlation matrix, which parameterizes the correlation structure within the experimental units, is also inserted into this function. The main objective of this dissertation is to study how these models behave in a specific situation: count data with overdispersion. With GLMs this problem is handled by fitting a model with a negative binomial (NB) response, and the idea is the same for the GEE methodology. This dissertation aims to review the GEE methodology in general and for the specific case in which the marginal response is negative binomial. In addition, we show how this methodology is applied in practice, with three examples of correlated data with count responses. Binomial negativa Count Data Dados de contagem Equações de estimação generalizadas Generalized Estimating Equations Negative Binomial Overdispersion Sobredispersão
13	Novel regression models for discrete response Peluso, Alina January 2017 (has links) In a regression context, the aim is to analyse a response variable of interest conditional on a set of covariates. In many applications the response variable is discrete. Examples include the event of surviving a heart attack, the number of hospitalisation days, the number of times that individuals benefit from a health service, and so on. This thesis advances the methodology and the application of regression models with discrete response. First, we present a difference-in-differences approach to model a binary response in a health policy evaluation framework. In particular, generalized linear mixed methods are employed to model multiple dependent outcomes in order to quantify the effect of an adopted pay-for-performance program while accounting for the heterogeneity of the data at the multiple nested levels. The results show how the policy had a positive effect on the hospitals' quality in terms of those outcomes that can be more influenced by a managerial activity. Next, we focus on regression models for count response variables. In a parametric framework, Poisson regression is the simplest model for count data, though it is often found inadequate in real applications, particularly in the presence of excessive zeros and in the case of dispersion, i.e. when the conditional mean differs from the conditional variance. Negative Binomial regression is the standard model for over-dispersed data, but it fails in the presence of under-dispersion. Poisson-Inverse Gaussian regression can be used in the case of over-dispersed data, Generalised-Poisson regression can be employed in the case of under-dispersed data, and Conway-Maxwell Poisson regression can be employed in both cases of over- or under-dispersed data, though the interpretability of these models is not straightforward and they are often found computationally demanding. 
While Jittering is the default non-parametric approach for count data, inference has to be made for each individual quantile, separate quantiles may cross, and the underlying uniform random sampling can generate instability in the estimation. These features motivate the development of a novel parametric regression model for counts via a Discrete Weibull distribution. This distribution is able to adapt to different types of dispersion relative to Poisson, and it also has the advantage of having a closed-form expression for the quantiles. As well as the standard regression model, generalized linear mixed models and generalized additive models are presented via this distribution. Simulated and real data applications with different types of dispersion show a good performance of Discrete Weibull-based regression models compared with existing regression approaches for count data.
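The closed-form quantile mentioned in this abstract can be illustrated with a minimal sketch. It assumes the type I discrete Weibull parameterisation, P(Y >= y) = q^(y^beta) for y = 0, 1, 2, ...; the abstract does not state which parameterisation the thesis uses, and the function names below are hypothetical.

```python
import math

def dw_cdf(y, q, beta):
    """CDF of the (assumed) type I discrete Weibull: F(y) = 1 - q**((y + 1)**beta)."""
    return 1.0 - q ** ((y + 1) ** beta)

def dw_quantile(tau, q, beta):
    """Smallest integer y >= 0 with F(y) >= tau, obtained by inverting the CDF in closed form."""
    if not (0.0 < tau < 1.0 and 0.0 < q < 1.0 and beta > 0.0):
        raise ValueError("need 0 < tau < 1, 0 < q < 1 and beta > 0")
    # Solve q**((y + 1)**beta) <= 1 - tau for y; log(q) is negative, so the inequality flips.
    y = (math.log(1.0 - tau) / math.log(q)) ** (1.0 / beta) - 1.0
    return max(0, math.ceil(y))
```

Because the quantile is an explicit function of q and beta, every quantile follows from one fitted model rather than from a separate fit per quantile.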
14	A Flexible Zero-Inflated Poisson Regression Model Roemmele, Eric S. 01 January 2019 (has links) A practical problem often encountered with observed count data is the presence of excess zeros. Zero-inflation in count data can easily be handled by zero-inflated models, which are two-component mixtures of a point mass at zero and a discrete distribution for the count data. In the presence of predictors, zero-inflated Poisson (ZIP) regression models are, perhaps, the most commonly used. However, the fully parametric ZIP regression model could sometimes be restrictive, especially with respect to the mixing proportions. Taking inspiration from some of the recent literature on semiparametric mixtures of regressions models for flexible mixture modeling, we propose a semiparametric ZIP regression model. We present an "EM-like" algorithm for estimation and a summary of asymptotic properties of the estimators. The proposed semiparametric models are then applied to a data set involving clandestine methamphetamine laboratories and Alzheimer's disease. Bootstrap Count data EM Algorithm zero-inflation semiparametric model Statistical Models Statistical Theory
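The two-component mixture described in this abstract can be written down directly. The sketch below covers only the standard fully parametric ZIP probability mass function with a fixed mixing proportion, not the semiparametric model the thesis proposes; the function and parameter names are hypothetical.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson.

    pi  : assumed weight of the point mass at zero (0 <= pi < 1)
    lam : mean of the Poisson component (lam > 0)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    # The point mass contributes only at k = 0; the Poisson part contributes everywhere.
    return pi * (k == 0) + (1.0 - pi) * poisson

def zip_mean(lam, pi):
    """Mean of the mixture: only the Poisson component contributes."""
    return (1.0 - pi) * lam
```

With pi = 0 the expression reduces to the ordinary Poisson pmf, so the zero-inflation weight can be read as the extra probability of a zero beyond what the Poisson component already produces.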
15	Time series modelling of high frequency stock transaction data Quoreshi, Shahiduzzaman January 2006 (has links) No description available. Economics Count data Intra-day High frequency Time series Estimation Long memory Finance Nationalekonomi
16	Time series modelling of high frequency stock transaction data Quoreshi, Shahiduzzaman January 2006 (has links) No description available. Economics Count data Intra-day High frequency Time series Estimation Long memory Finance Nationalekonomi
17	On modeling telecommuting behavior : option, choice and frequency Singh, Palvinder 18 June 2012 (has links) The current study contributes to the already substantial scholarly literature on telecommuting by estimating a joint model of three dimensions: option, choice and frequency of telecommuting. In doing so, we focus on workers who are not self-employed and who have a primary work place that is outside their homes. The unique methodological features of this study include the use of a general and flexible generalized hurdle count model to analyze the precise count of telecommuting days per month, and the formulation and estimation of a model system that embeds the count model within a larger multivariate choice framework. The unique substantive aspects of this study include the consideration of the "option to telecommute" dimension and the consideration of a host of residential neighborhood built environment variables. The 2009 NHTS data is used for the analysis, and allows us to develop a current perspective of the process driving telecommuting decisions. This data set is supplemented with a built environment database to capture the effects of demographic, work-related, and built environment measures on the telecommuting-related dimensions. In addition to providing important insights for policy analysis, the results in this study indicate that ignoring the "option" dimension of telecommuting can, and generally will, lead to incorrect conclusions regarding the behavioral processes governing telecommuting decisions. The empirical results have implications for transportation planning analysis as well as for the worker recruitment/retention and productivity literature. / text Telecommuting Work-life balance Count data Generalized ordered response model Multivariate modeling
18	A count data model with endogenous covariates : formulation and application to roadway crash frequency at intersections Born, Kathryn Mary 24 March 2014 (has links) This thesis proposes an estimation approach for count data models with endogenous covariates. The maximum approximate composite marginal likelihood inference approach is used to estimate model parameters. The modeling framework is applied to predict crash frequency at urban intersections in Irving, Texas. The sample is drawn from the Texas Department of Transportation crash incident files for the year 2008. The results highlight the importance of accommodating endogeneity effects in count models. In addition, the results reveal the increased propensity for crashes at intersections with flashing lights, intersections with crest approaches, and intersections that are on frontage roads. / text Count data Treatment-outcome models Accident analysis Flashing light control Generalized ordered response
19	A framework for developing road risk indices using quantile regression based crash prediction model Wu, Hui, doctor of civil engineering 13 October 2011 (has links) Safety reviews of existing roads are becoming a popular practice of many agencies nationally and internationally. Knowing road safety information is of great importance to both policymakers in addressing safety concerns and travelers in managing their trips. There have been various efforts in developing methodologies to measure and assess road safety in an effective manner. However, the existing research and practices are still constrained by their subjective and reactive nature. The goal of this research is to develop a framework of Road Risk Indices (RRIs) to assess road risks of existing highway infrastructure for both road users and agencies based on road geometrics, traffic conditions, and historical crash data. The proposed RRIs are intended to give a comprehensive and objective view of road safety, so that safety problems can be identified at an early stage before they rise in the form of accidents. A methodological framework of formulating RRIs that integrates results from crash prediction models and historical crash data is proposed, and Linear Referencing tools in the ArcGIS software are used to develop digital maps to publish estimated RRIs. These maps provide basic Geographic Information System (GIS) functions, including viewing and querying RRIs, and performing spatial analysis tasks. A semi-parameter count model and quantile regression based estimation are proposed to capture the specific characteristics of crash data and provide more robust and accurate predictions on crash counts. Crash data collected on Interstate Highways in Washington State for the year 2002 was extracted from the Highway Safety Information System (HSIS) and used for the case study. 
The results from the case study show that the proposed framework is capable of capturing statistical correlations between traffic crashes and influencing factors, leading to the effective integration of safety information in composite indices. / text Highway safety Road Risk Indices (RRI) Quantile regression on count data Crash prediction model
20	Μελέτες στις ταχέως εξαπλωνόμενες χρηματοοικονομικές κρίσεις με την χρήση υποδειγμάτων απαριθμητών τιμών και δεδομένων διάρκειας / Studies of rapidly spreading financial crises using count data and duration data models Σιάκουλης, Βασίλειος 18 June 2014 (has links) In this thesis we focus on modeling financial contagion in the domain of bank failures and in the stock and bond markets, using count and duration data models. Απαριθμητά δεδομένα Δεδομένα διάρκειας Κρίση 338.542 Financial contagion Count data Duration data Crisis

Search results