Global ETD Search

31	GAMLSSs with applications to zero inflated and hierarquical data / GAMLSSs com aplicações a dados inflacionados de zeros e hierárquicos Thomas, Gustavo 20 December 2017 The generalized additive models for location, scale and shape (GAMLSS), developed by Rigby and Stasinopoulos (2005), are a general class of univariate regression models that, unlike generalized linear and generalized additive models, do not restrict the response distribution to the exponential family. In addition, they allow all the parameters of the response distribution to be modeled explicitly through different sets of explanatory variables. The semiparametric subclass of GAMLSS, in particular, accepts a wide range of parametric and nonparametric terms in the predictors of the parameters. As in generalized linear models, GAMLSSs link predictors to parameters through monotonic link functions, which can differ for each parameter. This dissertation describes the GAMLSS methodology and presents two applications to data sets from agronomy experiments, exploring methods for estimating, diagnosing and comparing these models.
Corn growth; Count of roots; GAMLSS; Mixed GAMLSSs; Mixed models; R software; Zero inflated GAMLSSs
32	Structural time series clustering, modeling, and forecasting in the state-space framework Tang, Fan 15 December 2015 This manuscript consists of two papers that formulate novel methodologies pertaining to time series analysis in the state-space framework. In Chapter 1, we introduce an innovative time series forecasting procedure that relies on model-based clustering and model averaging. The clustering algorithm employs a state-space model comprised of three latent structures: a long-term trend component; a seasonal component, to capture recurring global patterns; and an anomaly component, to reflect local perturbations. A two-step clustering algorithm is applied to identify series that are both globally and locally correlated, based on the corresponding smoothed latent structures. For each series in a particular cluster, a set of forecasting models is fit, using covariate series from the same cluster. To fully utilize the cluster information and to improve forecasting for a series of interest, multi-model averaging is employed. We illustrate the proposed technique in an application that involves a collection of monthly disease incidence series. In Chapter 2, to effectively characterize a count time series that arises from a zero-inflated binomial (ZIB) distribution, we propose two classes of statistical models: a class of observation-driven ZIB (ODZIB) models, and a class of parameter-driven ZIB (PDZIB) models. The ODZIB model is formulated in the partial likelihood framework. Common iterative algorithms (Newton-Raphson, Fisher Scoring, and Expectation Maximization) can be used to obtain the maximum partial likelihood estimators (MPLEs). The PDZIB model is formulated in the state-space framework. For parameter estimation, we devise a Monte Carlo Expectation Maximization (MCEM) algorithm, using particle methods to approximate the intractable conditional expectations in the E-step of the algorithm. We investigate the efficacy of the proposed methodology in a simulation study, and illustrate its utility in a practical application pertaining to disease coding.
Forecasting; Model averaging; Particle methods; State-space modeling; Time Series Clustering; Zero-inflated binomial time series; Biostatistics
33	Should large urban centres decide how best to use health care services? Clarke, Suzanne Kathleen 17 February 2014 We assessed how estimates of need-expected inpatient hospital use differ depending on whether need-expected use was estimated for a population of all Canadians, Canadian health regions, or a subpopulation of higher income Canadians, who likely had minimal healthcare access problems. Data came from the 2009/2010 Canadian Community Health Survey, a national cross-sectional survey. Using zero-inflated negative binomial regression, we modeled inpatient hospital use separately for each of the three aforementioned choices of population. We adjusted for demographic, health behaviour, health status, socioeconomic, and health care supply factors. We then estimated need-expected inpatient hospital use and compared the estimates across individuals and by income and province. The three choices of population yielded similar results: our estimates of the average need-expected use by province or income group were not sensitive to the choice of population used to estimate need-expected use.
health services/utilization; population based planning; health services research/methods; needs assessment
34	Variants of compound models and their application to citation analysis Low, Wan Jing January 2017 This thesis develops two variant statistical models for count data based upon compound models for contexts when the counts may be viewed as derived from two generations, which may or may not be independent. Unlike standard compound models, the variants model the sum of both generations. We consider cases where both generations are negative binomial or one is Poisson and the other is negative binomial. The first variant, denoted SVA, follows a zero restriction, where a zero in the first generation will automatically be followed by a zero in the second generation. The second variant, denoted SVB, is a convolution model that does not possess this zero restriction. The main properties of the SVA and SVB models are outlined and compared with standard compound models. The results show that the SVA distributions are similar to standard compound distributions for some fixed parameters. Comparisons of SVA, Poisson hurdle, negative binomial hurdle and their zero-inflated counterparts using simulated SVA data indicate that different models can give similar results, as the generating models are not always selected as the best fitting. This thesis focuses on the use of the variant models to model citation counts. We show that the SVA models are more suitable for modelling citation data than other previously used models such as the negative binomial model. Moreover, the SVA and SVB models may be used to describe the citation process. This thesis also explores model selection techniques based on log-likelihood methods, Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). The suitability of the models is also assessed using two diagrammatic methods, randomised quantile residual plots and Christmas tree plots. The Christmas tree plots clearly illustrate whether the observed data are within fluctuation bounds under the fitted model, but the randomised quantile residual plots utilise the cumulative distribution, and hence are insensitive to individual data values. Both plots show the presence of citation counts that are larger than expected under the fitted model in the data sets.
519.2
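The abstract above mentions model selection by AIC and BIC. As a minimal sketch of how those two criteria compare fitted count models, the Python below uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The model names, log-likelihoods, parameter counts and sample size are made-up placeholders for illustration only; none of them come from the thesis.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: name -> (maximised log-likelihood, number of parameters).
# These numbers are invented for the example, not results from any study.
candidates = {
    "model_a": (-1520.4, 2),
    "model_b": (-1498.7, 4),
}
n_obs = 1000  # assumed sample size

for name, (ll, k) in candidates.items():
    print(f"{name}: AIC={aic(ll, k):.1f}  BIC={bic(ll, k, n_obs):.1f}")

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
```

BIC penalises extra parameters more heavily than AIC once n exceeds about 8 observations (ln n > 2), so the two criteria can disagree on larger data sets.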
36	Gender Differences in HIV Sexual Risk Behaviors Among Clients of Substance Use Disorder Treatment Programs in the U.S. Pan, Yue, Metsch, Lisa R., Wang, Weize, Wang, Ke Sheng, Duan, Rui, Kyle, Tiffany L., Gooden, Lauren K., Feaster, Daniel 01 May 2017 This study examined differences in sexual risk behaviors by gender and over time among 1281 patients (777 males and 504 females) from 12 community-based substance use disorder treatment programs throughout the United States participating in CTN-0032, a randomized controlled trial conducted within the National Drug Abuse Treatment Clinical Trials Network. Zero-inflated negative binomial and negative binomial models were used in the statistical analysis. Results indicated significant reductions in most types of sexual risk behaviors among substance users regardless of intervention arm. There were also significant gender differences in sexual risk behaviors. Men (compared with women) reported more condomless sex acts with their non-primary partners (IRR = 1.80, 95% CI 1.21–2.69) and condomless anal sex acts (IRR = 1.74, 95% CI 1.11–2.72), but fewer condomless sex partners (IRR = 0.87, 95% CI 0.77–0.99), condomless vaginal sex acts (IRR = 0.83, 95% CI 0.69–1.00), and condomless sex acts within 2 h of using drugs or alcohol (IRR = 0.70, 95% CI 0.53–0.90). Gender-specific intervention approaches are called for in substance use disorder treatment.
condomless sex; gender difference; sexual risk behavior; substance use disorders treatment; zero-inflated negative binomial models; Biostatistics and Epidemiology
37	Analysis of Zero-Heavy Data Using a Mixture Model Approach Wang, Shin Cheng 30 March 1998 The problem of a high proportion of zeroes has long been of interest in data analysis and modeling; however, there is no single solution to this problem. The solution to an individual problem depends on its particular situation and the design of the experiment. For example, different biological, chemical, or physical processes may follow different distributions and behave differently. Different mechanisms may generate the zeroes and require different modeling approaches. It would therefore be impractical to propose a single general solution. In this dissertation, I focus on cases where zeroes are produced by mechanisms that create distinct sub-populations of zeroes. The dissertation is motivated by problems in chronic toxicity testing, where the data sets contain a high proportion of zeroes. The analysis of chronic test data is complicated because there are two different sources of zeroes in the data: mortality and non-reproduction. Researchers therefore have to separate zeroes due to mortality from zeroes due to fecundity. A mixture model approach that combines the two mechanisms is appropriate here because it can incorporate the extra zeroes due to mortality. A zero inflated Poisson (ZIP) model is used for modeling fecundity in the Ceriodaphnia dubia toxicity test. A generalized estimating equation (GEE) based ZIP model is developed to handle longitudinal data with zeroes due to mortality. A joint estimate of inhibition concentration (ICx) is also developed for potency estimation based on the mixture model approach. It is found that the ZIP model performs better than the regular Poisson model if mortality is high. This kind of toxicity testing also involves longitudinal data, where the same subject is measured over a period of seven days. The GEE model allows the flexibility to incorporate the extra zeroes and a correlation structure among the repeated measures. The problem of zero-heavy data also exists in environmental studies in which the growth or reproduction rates of multiple species are measured. This gives rise to multivariate data. Since the inter-relationships between different species are embedded in the correlation structure, the study of the information in the correlation of the variables, which is often accessed through principal component analysis, is one of the major interests in multivariate data. In the case where mortality influences the variables of interest, but mortality is not itself the subject of interest, the mixture approach can be applied to recover the information in the correlation structure. In order to investigate the effect of zeroes on multivariate data, simulation studies on principal component analysis are performed. A method that recovers the information in the correlation structure is also presented. / Ph. D.
Principal Component Analysis; Longitudinal Data; Inhibition Concentration; Generalized Estimating Equations; Chronic toxicity testing; Ceriodaphnia Dubia; Zero-inflated Poisson
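The mixture idea behind the zero inflated Poisson (ZIP) model in the abstract above can be sketched in a few lines of Python. This is only the textbook ZIP probability mass function, not the dissertation's GEE-based model: with probability `pi` an observation is a structural zero (for example, mortality), and otherwise the count follows a Poisson distribution with mean `lam`. The parameter names `pi` and `lam` and the example values are assumptions chosen for illustration.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """P(Y = k) under a zero-inflated Poisson mixture.

    pi  : probability of a structural (extra) zero, 0 <= pi <= 1
    lam : mean of the Poisson component, lam > 0
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Illustrative values only: 30% structural zeroes, Poisson mean of 2.
p_zero_zip = zip_pmf(0, lam=2.0, pi=0.3)
p_zero_poisson = zip_pmf(0, lam=2.0, pi=0.0)  # pi = 0 reduces to plain Poisson
print(p_zero_zip, p_zero_poisson)
```

Setting `pi = 0` recovers the ordinary Poisson model, which shows how the mixture adds probability mass at zero and nowhere else.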
38	Statistical developments for understanding anthropogenic impacts on marine ecosystems Marshall, Laura January 2012 Over the past decades technological developments have both changed and increased human influence on the marine environment. We now have greater potential than ever before to introduce disturbance and deplete marine resources. Two of the issues currently under public scrutiny are the exploitation of fish stocks worldwide and levels of anthropogenic noise in the marine environment. The aim of this thesis is to investigate and develop novel analyses and simulations to provide additional insight into some of the challenges facing the marine ecosystem today. These methodologies will improve the management of these risks to marine ecosystems. This thesis first addresses the issue of competition between humans and grey seals (Halichoerus grypus) for marine resources, providing compelling evidence that a substantial proportion of the sandeels consumed by grey seals in the North Sea are in fact H. lanceolatus, which is not commercially exploited, rather than the commercially important A. marinus. In addition, we present quantitative results regarding sources of bias when estimating the total biomass of sandeels consumed by grey seals. Secondly, we investigate spatially adaptive 2-dimensional smoothing to improve the prediction of both the presence and density of marine species, information that is often key in the management of marine ecosystems. In particular, we demonstrate the benefits of such methods in the prediction of sandeel occurrence. Lastly, this thesis provides a quantitative assessment of the protocols for real-time monitoring of marine mammal presence, which require that acoustic operations cease when an animal is detected within a certain distance (i.e. the "monitoring zone") of the sound source. We assess monitoring zones of different sizes with regard to their effectiveness in reducing the risks of temporary and permanent damage to the animals' hearing, and demonstrate that a monitoring zone of 2 km is generally recommendable.
577.7
39	Engraulis anchoita (Clupeiformes: Engraulidae) eggs and larvae in the Southeastern Brazilian Bight: new perspectives from a historical data set (1974 - 2010) / Engraulis anchoita (Clupeiformes: Engraulidae) ovos e larvas na Plataforma Continental Sudeste do Brasil: novas perspectivas a partir de um conjunto de dados históricos (1974 - 2010) Favero, Jana Menegassi Del 23 August 2016 The main objective of this dissertation was to evaluate long-term fluctuations in the distribution and abundance of Engraulis anchoita eggs and larvae in the Southeastern Brazilian Bight (SBB). Engraulis anchoita is a fish species that is ecologically and economically important. We analyzed samples and abiotic data from eighteen oceanographic cruises conducted during austral late spring and early summer from 1974 to 2010. Two different stocks were detected in the SBB based on egg size, with the predominant stock in the area having smaller eggs than the stock in the region further south. Using indicative kriging, we identified occasional (e.g. Florianópolis - 27°S and off Santos Bay) and avoided (e.g. off São Sebastião Island and off Cananéia-Iguape Coastal System) spawning sites. Through zero-inflated models, spatial factors (different areas and the local depth) were related to the probability of sampling false zeros, and temporal and oceanographic conditions (different years and temperature) were related to egg and larval abundance. We also described a faster and more accurate methodology to identify E. anchoita eggs, compared the efficiency of different mesh sizes for sampling eggs, and analyzed how egg size varied seasonally. Our results may support future studies and may assist future fishery management of E. anchoita, a species not yet exploited in the SBB.
fish stocks; ichthyoplankton; long-term fluctuations; spawning sites; zero-inflated models (ZI)
