  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Um aplicativo shiny para modelos lineares generalizados / A shiny app to perform generalized linear models

Saavedra, Cayan Atreio Portela Bárcena, 01 October 2018
Recent technological and computational advances have changed the way data analyses and visualizations are done, notably through the use of interactive platforms and dynamic graphics. Data analysis and visualization are no longer limited to a static environment, and exploiting interactivity enables a wider range of data exploration and presentation. This work proposes an easy-to-use interactive application, with a friendly interface, that supports exploratory studies, descriptive analyses, and the fitting of generalized linear models. The application is built with the shiny package in the R statistical computing environment and is intended as a support tool for statistical research and teaching. Users with no programming background can explore data and fit generalized linear models without typing a single line of code. For teaching, the application's dynamics and interactivity give students an uncomplicated way to investigate the methods involved, making it easier to assimilate the related concepts.
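As an illustration of the computation such an app wraps behind its interface, here is a minimal sketch (simulated data, not the thesis code) of the Fisher-scoring / IRLS fit underlying a generalized linear model, shown for a Poisson model with log link:

```python
import numpy as np

def fit_poisson(X, y, n_iter=30):
    """Fisher scoring (IRLS) for a Poisson GLM with log link."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)                  # inverse of the log link
        z = eta + (y - mu) / mu           # working response
        # weighted least squares step with working weights W = mu
        beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
    return beta

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(1500), rng.normal(size=1500)])
true_beta = np.array([0.5, 0.8])
y = rng.poisson(np.exp(X @ true_beta))
beta_hat = fit_poisson(X, y)              # close to (0.5, 0.8)
```

The same loop, with family-specific mean function, working weights, and working response, covers the other GLM families such an app would expose.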
12

Diretrizes para aplicação de inferência Bayesiana aproximada para modelos lineares generalizados e dados georreferenciados / Approximate Bayesian inference guidelines for generalized linear models and georeferenced data

Frade, Djair Durand Ramalho, 15 August 2018
In this work, we explore and propose guidelines for data analysis with the Integrated Nested Laplace Approximation (INLA) method, for generalized linear models (GLMs) and for models based on georeferenced data. For GLMs, we assess the impact of the approximation strategy used for the joint posterior distribution. For georeferenced data, we evaluate and propose guidelines for mesh construction, an essential step for obtaining accurate results. In both cases, simulation studies were carried out. To select the best models, agreement measures between the observations and the fitted values were computed, such as mean squared error and coverage rate.
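To make the core idea concrete, the following is a minimal sketch (with made-up data, not the thesis code) of the Laplace approximation that INLA builds on: approximate a posterior by a Gaussian centred at its mode, with variance taken from the curvature of the log-posterior, here for a binomial proportion under a flat prior:

```python
import numpy as np

# data: k successes in n trials (illustrative figures only)
k, n = 30, 100

def grad_hess(p):
    g = k / p - (n - k) / (1.0 - p)            # d/dp log posterior (flat prior)
    h = -k / p**2 - (n - k) / (1.0 - p)**2     # d2/dp2 log posterior
    return g, h

p = 0.5                                        # Newton's method for the mode
for _ in range(50):
    g, h = grad_hess(p)
    p -= g / h
mode, sd = p, 1.0 / np.sqrt(-h)
# Gaussian approximation N(mode, sd^2): mode = k/n = 0.3, and sd is close
# to the exact Beta(31, 71) posterior standard deviation (about 0.045)
```

INLA nests this Gaussian-at-the-mode step inside numerical integration over hyperparameters; the sketch shows only the innermost ingredient.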
13

Implementação em R de modelos de regressão binária com ligação paramétrica / R implementation of binary regression models with parametric link

Santos, Bernardo Pereira dos, 27 February 2013
Binary data analysis is usually conducted with logistic regression, but that model has limitations. Modifying the link function allows greater flexibility in modelling, and several proposals have been made in this area. However, no statistical package is known to estimate these models, which hinders their use. This work develops an R implementation of four binary regression models with parametric link functions, under both the frequentist and the Bayesian approach.
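As a sketch of what such an implementation involves, the code below fits a binary regression with the Aranda-Ordaz asymmetric link, one well-known parametric link family, by IRLS for a fixed link parameter; the thesis' four models and its R code may differ, and the data here are simulated:

```python
import numpy as np

def fit_binary(X, y, lam, n_iter=40):
    """IRLS fit of a binary regression with the Aranda-Ordaz link,
    mu = 1 - (1 + lam*exp(eta))**(-1/lam), for fixed lam (lam = 1 is logit)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        e = np.exp(eta)
        mu = np.clip(1.0 - (1.0 + lam * e) ** (-1.0 / lam), 1e-10, 1 - 1e-10)
        dmu = e * (1.0 + lam * e) ** (-(1.0 + lam) / lam)   # dmu/deta
        w = dmu**2 / (mu * (1.0 - mu))                      # working weights
        z = eta + (y - mu) / dmu                            # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    loglik = np.sum(y * np.log(mu) + (1 - y) * np.log(1 - mu))
    return beta, loglik

rng = np.random.default_rng(1)
n = 4000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.3, 1.0])
p = 1.0 - (1.0 + np.exp(X @ true_beta)) ** (-1.0)   # data generated at lam = 1
y = rng.binomial(1, p)

# profile the log-likelihood over a small grid of link parameters
fits = {lam: fit_binary(X, y, lam) for lam in (0.5, 1.0, 2.0)}
beta_hat, loglik1 = fits[1.0]
```

A full maximum-likelihood fit would estimate the link parameter jointly with the coefficients; the grid profile above is the simplest stand-in.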
16

Modeling time series data with semi-reflective boundaries

Johnson, Amy May, 01 December 2013
High-frequency time series data have become increasingly common. In many settings, such as the medical sciences or economics, these series may additionally display semi-reflective boundaries: boundaries, whether physically existing, arbitrarily set, or determined by inherent qualities of the series, that may be exceeded and yet, given the probable consequences, offer incentives to return to mid-range levels. In a lane-control setting, Dawson, Cavanaugh, Zamba, and Rizzo (2010) previously developed a weighted third-order autoregressive model utilizing flat, linear, and quadratic projections with a signed error term to depict key features of driving behavior, where the probability of a negative residual is predicted via logistic regression. In this driving application, the intercept (Λ0) of the logistic regression model describes the central tendency of a particular driver, while the slope parameter (Λ1) can be intuitively interpreted as the propensity of the series to return to mid-range levels. We therefore call this the "re-centering" parameter, though this is a slight misnomer, since the logistic model describes not the position of the series but the probability of a negative residual. In this framework a multi-step estimation algorithm, which we label the Single-Pass method, was provided. In addition to investigating the statistical properties of the Single-Pass method, several other estimation techniques are investigated, including an Iterated Grid Search, which utilizes the underlying likelihood model, and four modified versions of the Single-Pass method.
The Modified Single-Pass (MSP) techniques, respectively: use unconstrained least squares estimation for the vector of projection coefficients (Β); use unconstrained linear regression with a post-hoc application of the summation constraint; reduce the regression model to include only the flat and linear projections; or implement the Least Absolute Shrinkage and Selection Operator (LASSO). For each technique, mean bias, confidence intervals, and coverage probabilities were calculated, indicating that only the first two modifications were promising alternatives. In a driving application, we therefore considered these two modified techniques along with the Single-Pass method and the Iterated Grid Search. Although each of these methods remains biased, with generally lower-than-ideal coverage probabilities, in a lane-control setting each is able to distinguish between two populations based on disease status. The re-centering parameter, estimated from data collected in a driving simulator among a control population, is also significantly correlated with neuropsychological outcomes as well as with driving errors performed on-road. Several of these correlations were apparent regardless of the estimation technique, indicating real-world validity of the model across related assessments. Additionally, the Iterated Grid Search produces estimates that are most distinct, with generally lower bias and improved coverage, except for the estimate of Λ1; however, this method may also require large time and memory commitments compared with the other techniques considered. The optimal estimation scheme is therefore situation-dependent: when feasible, the Iterated Grid Search appears to be the best overall method currently available, but if time or memory is a limiting factor, or if a reliable estimate of the re-centering parameter with reasonably accurate estimation of the Β vector is desired, the Modified Single-Pass technique using unconstrained linear regression followed by the summation constraint is a sensible alternative.
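To illustrate the role of the re-centering parameter, here is a deliberately simplified first-order simulation (hypothetical parameter values, not the thesis' third-order model): the probability of a negative step follows a logistic model in the lagged level, and (Λ0, Λ1) are recovered by logistic regression of the step sign on that level:

```python
import numpy as np

rng = np.random.default_rng(2)
lam0_true, lam1_true = 0.0, 0.8   # hypothetical re-centering parameters
T = 5000
y = np.zeros(T)
neg = np.zeros(T, dtype=bool)
for t in range(1, T):
    # a positive Λ1 pulls the series back toward mid-range levels
    p_neg = 1.0 / (1.0 + np.exp(-(lam0_true + lam1_true * y[t - 1])))
    neg[t] = rng.random() < p_neg
    step = abs(rng.normal())
    y[t] = y[t - 1] - step if neg[t] else y[t - 1] + step

# recover the parameters: logistic regression of step sign on lagged level
X = np.column_stack([np.ones(T - 1), y[:-1]])
z = neg[1:].astype(float)
beta = np.zeros(2)
for _ in range(30):               # Fisher scoring iterations
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = mu * (1.0 - mu)
    beta += np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (z - mu))
lam0_hat, lam1_hat = beta
```

In the full model the residual magnitude comes from the weighted projection fit rather than a plain half-normal step, but the logistic sign model is the same.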
17

The Turkish Catastrophe Insurance Pool Claims Modeling 2000-2008 Data

Saribekir, Gozde, 01 March 2013
After the 1999 Marmara Earthquake, social, economic, and engineering studies on earthquakes intensified. The Turkish Catastrophe Insurance Pool (TCIP) was established after the Marmara Earthquake to share the resulting deficit in the government budget. The TCIP has become a data source for researchers, providing variables such as the number of claims, claim amounts, and earthquake magnitude. In this thesis, the TCIP earthquake claims collected between 2000 and 2008 are studied. The number of claims and the claim payments (aggregate claim amount) are modeled using generalized linear models (GLMs). Sudden jumps observed in the claim data are represented with an exponential kernel function. Model parameters are estimated by maximum likelihood estimation (MLE). The results can serve as a recommendation for computing the expected value of aggregate claim amounts and TCIP premiums.
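Schematically, once a frequency model for claim counts and a severity model for claim amounts are in hand, the expected aggregate claim per policy decomposes into expected count times expected severity. The sketch below uses entirely made-up figures (not TCIP values) and intercept-only models, whose maximum-likelihood fits reduce to sample means:

```python
import numpy as np

rng = np.random.default_rng(3)
n_policies = 10000
true_freq = 0.02        # hypothetical expected claims per policy-year
true_sev = 5000.0       # hypothetical mean claim amount

counts = rng.poisson(true_freq, size=n_policies)                    # claim counts
sev = rng.gamma(shape=2.0, scale=true_sev / 2.0, size=counts.sum()) # claim amounts

freq_hat = counts.mean()            # MLE of an intercept-only Poisson GLM
sev_hat = sev.mean()                # MLE of an intercept-only Gamma GLM
pure_premium = freq_hat * sev_hat   # expected aggregate claim per policy
```

With covariates (region, building type, magnitude) the two sample means become fitted GLM predictions, and the same product gives a per-cell pure premium.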
18

Bayesian Semiparametric Models for Heterogeneous Cross-platform Differential Gene Expression

Dhavala, Soma Sekhar, December 2010
We are concerned with testing for differential expression and consider three aspects of such testing procedures. First, we develop an exact ANOVA-type model for discrete gene expression data produced by technologies such as Massively Parallel Signature Sequencing (MPSS), Serial Analysis of Gene Expression (SAGE), or other next-generation sequencing technologies. We adopt two Bayesian hierarchical models: one parametric and the other semiparametric, with a Dirichlet process prior that can borrow strength across related signatures, where a signature is a specific arrangement of the nucleotides. We utilize the discreteness of the Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using nonparametric approaches while controlling the false discovery rate. Next, we consider ways to combine expression data from different studies, possibly produced by different technologies and resulting in mixed-type responses, such as microarrays and MPSS. Depending on the technology, the expression data can be continuous or discrete and can have different technology-dependent noise characteristics. Adding to the difficulty, genes can have an arbitrary correlation structure both within and across studies, and performing several hypothesis tests for differential expression can lead to false discoveries. We propose to address all of these challenges using a hierarchical Dirichlet process with a spike-and-slab base prior on the random effects, while smoothing splines model the unknown link functions that map the different technology-dependent manifestations to latent processes upon which inference is based. Finally, we propose an algorithm for controlling different error measures in Bayesian multiple testing under generic loss functions, including the widely used uniform loss function.
We make no specific assumptions about the underlying probability model but require that indicator variables for the individual hypotheses be available as a component of the inference. Given this information, we recast multiple hypothesis testing as a combinatorial optimization problem, in particular the 0-1 knapsack problem, which can be solved efficiently by a variety of algorithms, both approximate and exact.
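A toy version of the knapsack recasting (with made-up posterior probabilities): given posterior probabilities p_i that hypothesis i is non-null, obtained from the MCMC indicator variables, reject a set of hypotheses maximizing the expected number of true positives (sum of p_i) subject to a cap on the expected number of false discoveries (sum of 1 - p_i). Scaling the weights to integers lets the classic 0-1 knapsack dynamic program apply:

```python
import numpy as np

p = np.array([0.99, 0.95, 0.90, 0.60, 0.40, 0.10])  # made-up posteriors
budget, scale = 0.5, 100
w = np.round((1 - p) * scale).astype(int)   # expected false discoveries (scaled)
cap = int(budget * scale)

# standard 0-1 knapsack dynamic program over capacity
best = np.zeros(cap + 1)                    # best expected true positives
keep = np.zeros((len(p), cap + 1), dtype=bool)
for i, (wi, vi) in enumerate(zip(w, p)):
    for c in range(cap, wi - 1, -1):        # reverse order: each item used once
        if best[c - wi] + vi > best[c]:
            best[c] = best[c - wi] + vi
            keep[i, c] = True

# backtrack to recover the rejected set
c, rejected = cap, []
for i in range(len(p) - 1, -1, -1):
    if keep[i, c]:
        rejected.append(i)
        c -= w[i]
rejected.reverse()
# here rejected == [0, 1, 2]: expected false discoveries 0.16 <= 0.5
```

Other error measures amount to different weights and values in the same knapsack, which is the point of the reduction.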
19

Testing Lack-of-Fit of Generalized Linear Models via Laplace Approximation

Glab, Daniel Laurence, May 2011
In this study we develop a new method for testing the null hypothesis that the predictor function in a canonical link regression model has a prescribed linear form. This class of models, which we refer to as canonical link regression models, constitutes arguably the most important subclass of generalized linear models and includes several of the most popular ones. In addition to this primary contribution, we revisit several other tests in the existing literature. The common feature of the proposed test and the existing tests is that all are based on orthogonal series estimators used to detect departures from a null model. Our proposed lack-of-fit test is inspired by a recent contribution of Hart and is based on a Laplace approximation to the posterior probability of the null hypothesis. Despite its Bayesian construction, the resulting statistic is implemented in a frequentist fashion. The statistic is formulated by characterizing departures from the predictor function in terms of Fourier coefficients and then testing that all of these coefficients are 0. The resulting test statistic can be characterized as a weighted sum of exponentiated squared Fourier coefficient estimators, where the weights depend on user-specified prior probabilities. The prior probabilities give the investigator the flexibility to examine specific departures from the prescribed model; alternatively, noninformative priors produce a new omnibus lack-of-fit statistic. We present a thorough numerical study of the proposed test and the various existing orthogonal-series-based tests in the context of the logistic regression model. Simulation studies demonstrate that the test statistics under consideration possess desirable power properties against alternatives identified in the existing literature as important.
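A schematic version of the idea (not the thesis' exact statistic): fit the null logistic model, project the standardized residuals onto a cosine basis to estimate Fourier coefficients, and combine the exponentiated squared coefficients with user-chosen prior weights. Under the null the scaled coefficients are approximately standard normal, so a large statistic signals departure from the prescribed linear form; the data below are simulated with a deliberately nonlinear truth:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
x = rng.uniform(size=n)
eta = -0.5 + 1.0 * x + 1.5 * np.sin(2 * np.pi * x)   # truth departs from linearity
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

# null fit: logistic regression linear in x (Fisher scoring)
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(30):
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = mu * (1.0 - mu)
    beta += np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - mu))
mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
r = (y - mu) / np.sqrt(mu * (1.0 - mu))              # standardized residuals

J = 5
weights = np.ones(J) / J                             # noninformative prior weights
phi = np.array([np.mean(np.sqrt(2.0) * np.cos(j * np.pi * x) * r)
                for j in range(1, J + 1)])           # Fourier coefficient estimates
stat = float(np.sum(weights * np.exp(n * phi**2 / 2.0)))
# stat is large here because the sine departure loads on the low-order basis
```

The thesis' statistic carries Laplace-approximation constants and a calibrated reference distribution that this sketch omits.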
20

Ideology and interests : a hierarchical Bayesian approach to spatial party preferences

Mohanty, Peter Cushner, 04 December 2013
This paper presents a spatial utility model of support for multiple political parties. The model includes a "valence" term, which I reparameterize to include both party competence and voters' key sociodemographic concerns. The paper shows how this spatial utility model can be interpreted as a hierarchical model, using data from the 2009 European Elections Study. I estimate the model via Bayesian Markov chain Monte Carlo (MCMC) using a block Gibbs sampler and show that it can capture broad Europe-wide trends while allowing for significant heterogeneity. This approach, which assumes a normally distributed dependent variable, is only able to partially reproduce the data-generating process; I show that the process can be reproduced more accurately with an ordered probit model. Finally, I discuss trade-offs between parsimony and descriptive richness, along with other practical challenges that may be encountered when building models of party support, and make recommendations for capturing the best of both approaches.
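As a stripped-down stand-in for the block Gibbs scheme (structure and values are illustrative only, not the paper's model), here is a two-block Gibbs sampler for a hierarchical normal-means model, alternating between the group-level effects and the shared mean:

```python
import numpy as np

rng = np.random.default_rng(5)
J, n_per = 8, 50
true_mu, tau = 2.0, 1.0                      # illustrative hyperparameters
theta_true = rng.normal(true_mu, tau, size=J)
y = rng.normal(theta_true[:, None], 1.0, size=(J, n_per))  # y_ij ~ N(theta_j, 1)

n_draws = 2000
mu = 0.0
mu_draws = np.empty(n_draws)
for s in range(n_draws):
    # block 1: theta_j | mu, y -- conjugate normal update per group
    prec = n_per + 1.0 / tau**2
    mean = (y.sum(axis=1) + mu / tau**2) / prec
    theta = rng.normal(mean, np.sqrt(1.0 / prec))
    # block 2: mu | theta -- flat prior, normal around the theta average
    mu = rng.normal(theta.mean(), tau / np.sqrt(J))
    mu_draws[s] = mu

mu_hat = mu_draws[500:].mean()               # posterior mean after burn-in
```

The paper's sampler has regression blocks and many more parameters, but the alternation between conditionally conjugate blocks is the same mechanism.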
