
Multivariate Multiscale Analysis of Neural Spike Trains

Ramezan, Reza 10 December 2013
This dissertation introduces new methodologies for the analysis of neural spike trains. Biological properties of the nervous system, and how they are reflected in neural data, can motivate specific analytic tools. Some of these biological aspects motivate multiscale frameworks, which allow for simultaneous modelling of the local and global behaviour of neurons. Chapter 1 provides the preliminary background on the biology of the nervous system, details the concepts of information and randomness in the analysis of neural spike trains, and gives a thorough literature review of the current statistical models for neural spike trains. The material presented in the next six chapters (2-7) has been the focus of three papers, which have either already been published or are being prepared for publication. Chapters 2 and 3 demonstrate that the multiscale complexity-penalized likelihood method, introduced in Kolaczyk and Nowak (2004), is a powerful model for the simultaneous modelling of spike trains with biological properties at different time scales. To detect the periodic spiking activities of neurons, two periodic models from the literature, Bickel et al. (2007, 2008) and Shao and Li (2011), were combined and modified in a multiscale penalized likelihood model. The contributions of these chapters are (1) employing a powerful visualization tool, the inter-spike interval (ISI) plot; (2) combining the multiscale method of Kolaczyk and Nowak (2004) with the periodic models of Bickel et al. (2007, 2008) and Shao and Li (2011) to introduce the so-called additive and multiplicative models for the intensity function of neural spike trains, together with a cross-validation scheme to estimate their tuning parameters; (3) providing numerical bootstrap confidence bands for the multiscale estimate of the intensity function; and (4) studying the effect of time scale on the statistical properties of spike counts.
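As a toy illustration of point (4), the sketch below (not the dissertation's code; all names are illustrative) bins a simulated homogeneous Poisson spike train at two different time scales and computes the Fano factor of the resulting counts. For a Poisson process the Fano factor stays near 1 at every scale, which is exactly the baseline against which time-scale effects in real spike trains are judged.

```python
import random

def bin_counts(spike_times, t_max, width):
    # Partition [0, t_max) into equal-width bins and count spikes per bin.
    n_bins = int(t_max / width)
    counts = [0] * n_bins
    for t in spike_times:
        i = int(t / width)
        if i < n_bins:
            counts[i] += 1
    return counts

def fano(counts):
    # Fano factor: variance-to-mean ratio of the bin counts.
    m = sum(counts) / len(counts)
    v = sum((c - m) ** 2 for c in counts) / len(counts)
    return v / m

# Homogeneous Poisson spike train at 20 Hz over 200 s
# (exponential inter-spike intervals).
rng = random.Random(1)
t, t_max, rate, spikes = 0.0, 200.0, 20.0, []
while True:
    t += rng.expovariate(rate)
    if t >= t_max:
        break
    spikes.append(t)

f_fine = fano(bin_counts(spikes, t_max, 0.05))  # 50 ms bins, ~1
f_coarse = fano(bin_counts(spikes, t_max, 0.5))  # 500 ms bins, ~1
```

For real neurons the Fano factor typically departs from 1 and varies with the bin width, which is what makes studying spike-count statistics across time scales informative.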
Motivated by neural integration phenomena, as well as adjustments for the neural refractory period, Chapters 4 and 5 study the Skellam process and introduce the Skellam Process with Resetting (SPR). Introducing SPR and its application to the analysis of neural spike trains is one of the major contributions of this dissertation. This stochastic process is biologically plausible and, unlike the Poisson process, does not suffer from a limited dependency structure. It also has multivariate generalizations for the simultaneous analysis of multiple spike trains. A computationally efficient recursive algorithm for estimating the parameters of SPR is introduced in Chapter 5. Except for the literature review at the beginning of Chapter 4, the material in these two chapters is original. The specific contributions of Chapters 4 and 5 are (1) introducing the Skellam Process with Resetting as a statistical tool for analyzing neural spike trains and studying its properties, including all theorems and lemmas provided in Chapter 4; (2) giving two fairly standard definitions of the Skellam process (homogeneous and inhomogeneous) and proving their equivalence; (3) deriving the likelihood function based on the observable data (spike trains) and developing a computationally efficient recursive algorithm for parameter estimation; and (4) studying the effect of time scales on the SPR model. The challenging problem of multivariate analysis of neural spike trains is addressed in Chapter 6. As far as we know, the multivariate models available in the literature suffer from limited dependency structures; in particular, modelling negative correlation among spike trains is a challenging problem. To address this issue, the multivariate Skellam distribution, as well as the multivariate Skellam process, both of which have flexible dependency structures, are developed.
Chapter 6 also introduces a multivariate version of the Skellam Process with Resetting (MSPR) and a so-called profile-moment likelihood estimation of its parameters. This chapter generalizes the results of Chapters 4 and 5; therefore, except for the brief literature review provided at the beginning of the chapter, the remainder of the material is original work. In particular, the contributions of this chapter are (1) introducing the multivariate Skellam distribution, (2) introducing two definitions of the multivariate Skellam process, in the homogeneous and inhomogeneous cases, and proving their equivalence, (3) introducing the Multivariate Skellam Process with Resetting (MSPR) to simultaneously model spike trains from an ensemble of neurons, and (4) utilizing the so-called profile-moment likelihood method to compute estimates of the parameters of MSPR. A discussion of the developed methodologies, as well as the "next steps", is outlined in Chapter 7.
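The Skellam distribution underlying SPR is the law of the difference of two independent Poisson counts (e.g. excitatory minus inhibitory inputs in a time bin). A minimal stdlib-only sketch, with illustrative names and nothing taken from the dissertation's estimation code, samples Skellam variates and checks the two moment identities mean = λ₁ − λ₂ and variance = λ₁ + λ₂:

```python
import math
import random

def rpois(lam, rng):
    # Knuth's method: multiply uniforms until the product drops below e^-lam.
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def rskellam(lam1, lam2, rng):
    # A Skellam variate is the difference of two independent Poisson counts.
    return rpois(lam1, rng) - rpois(lam2, rng)

rng = random.Random(42)
draws = [rskellam(5.0, 2.0, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)                          # ~ 5 - 2 = 3
var = sum((d - mean) ** 2 for d in draws) / len(draws)  # ~ 5 + 2 = 7
```

SPR itself adds a resetting mechanism tied to spike generation, which this sketch does not attempt to reproduce; it only shows the building block the process is named after.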

Analysis of Financial Data using a Difference-Poisson Autoregressive Model

Baroud, Hiba January 2011
The Box–Jenkins methodology has contributed greatly to the analysis of time series data. However, the assumptions underlying these methods impose constraints on the type of data: difficulties arise when the tools are applied to more general data (e.g. count, categorical, or integer-valued data) rather than the classical continuous, or more specifically Gaussian, type. Several alternative methods for modelling discrete-valued time series have been proposed in the literature; among them is Pegram's operator (1980). We use this operator to build an AR(p) model for integer-valued time series (including both positive and negative integers). The innovations follow the differenced Poisson, or Skellam, distribution. While the model includes the usual AR(p) correlation structure, it can be made more general: the operator can be extended so that some components contribute positive correlation while others, at the same time, contribute negative correlation. As an illustration, the process is used to model the change in a stock's price, with three variations presented: Variation I, Variation II, and Variation III. The first model disregards outliers; the second and third include large price changes associated with the effect of large-volume trades and market openings. Parameters of the model are estimated by maximum likelihood. We use several model selection criteria to select the best order for each variation of the model, as well as to determine the best variation overall. The most adequate order for all variations of the model is AR(3). While Variation II gives the best fit to the data, residual diagnostic plots suggest that Variation III represents a better correlation structure for the model.
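Under the mixture reading of Pegram's operator, an AR(1) step keeps the previous value with probability φ and otherwise draws a fresh innovation, so the stationary marginal equals the innovation distribution and the lag-1 autocorrelation equals φ. A hedged AR(1) sketch with Skellam innovations follows (the thesis builds a full AR(p) model with further extensions; the function names here are illustrative):

```python
import math
import random

def rpois(lam, rng):
    # Knuth's method: multiply uniforms until the product drops below e^-lam.
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def rskellam(lam1, lam2, rng):
    # Skellam innovation: difference of two independent Poisson draws.
    return rpois(lam1, rng) - rpois(lam2, rng)

def pegram_ar1(n, phi, lam1, lam2, rng):
    # Pegram's operator, AR(1) case: keep the previous value with
    # probability phi, otherwise draw a fresh Skellam innovation.
    # The stationary marginal stays Skellam(lam1, lam2).
    x = [rskellam(lam1, lam2, rng)]
    for _ in range(n - 1):
        x.append(x[-1] if rng.random() < phi else rskellam(lam1, lam2, rng))
    return x

rng = random.Random(0)
n = 6000
path = pegram_ar1(n, 0.4, 1.5, 1.5, rng)  # symmetric integer "price ticks"
mu = sum(path) / n
acf1 = (sum((path[i] - mu) * (path[i + 1] - mu) for i in range(n - 1))
        / sum((x - mu) ** 2 for x in path))  # should sit near phi = 0.4
```

Because the operator mixes distributions rather than adding thinned copies, the integer support (positive and negative) is preserved automatically, which is what makes it suitable for differenced prices.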

Modélisation bayésienne des changements aux niches écologiques causés par le réchauffement climatique / Bayesian modelling of changes to ecological niches caused by climate warming

Akpoué, Blache Paul 05 1900
This thesis presents estimation methods and algorithms for the analysis of count data in particular, and discrete data in general. It is part of an NSERC strategic project, named CC-Bio, whose objective is to assess the impact of climate change on the distribution of plant and animal species in Québec. After a brief introduction to the concepts of biogeography and to generalized linear mixed models in Chapters 1 and 2 respectively, the thesis is organized around three major ideas. First, Chapter 3 introduces a new form of distribution whose components have Poisson or Skellam marginal distributions. This new specification makes it possible to incorporate relevant information about the nature of the correlations among all the components, and several properties of the distribution are presented. Unlike the multivariate Poisson distribution that it generalizes, this distribution can handle variables with positive and/or negative correlations. A simulation study illustrates the estimation methods in the bivariate case. The results obtained by Bayesian methods via Markov chain Monte Carlo (MCMC) indicate a fairly low relative bias, of less than 5%, for the regression coefficients of the means; the estimates of the covariance term appear somewhat more volatile. 
Second, Chapter 4 presents an extension of multivariate Poisson regression with gamma-distributed random effects. Since species abundance data exhibit strong overdispersion, which would make the resulting estimators and standard errors misleading, we favour an approach based on Monte Carlo integration with importance sampling. The approach remains the same as in the previous chapter: the idea is to simulate independent latent variables so as to recover a conventional generalized linear mixed model (GLMM) with gamma random effects. Although the assumption of a priori known dispersion parameters may seem too strong, a sensitivity analysis based on goodness of fit demonstrates the robustness of our method. 
Third, the last chapter addresses the definition and construction of a measure of concordance, hence of correlation, for zero-inflated count data through Gaussian copula models. Unlike Kendall's tau, whose values lie in an interval whose bounds vary with the frequency of ties, this measure has the advantage of taking its values on (-1, 1). Originally introduced to model correlations between continuous variables, its extension to the discrete case entails certain restrictions, and its values are then confined to a subset of (-1, 1). The new measure can be interpreted as the correlation between the continuous latent random variables whose discretization yields the observed non-negative discrete data. Two estimation methods for the zero-inflated models are presented, in the frequentist and Bayesian frameworks, based respectively on maximum likelihood and Gauss-Hermite integration. Finally, a simulation study shows the robustness and the limits of our approach.
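The key structural point of the abstract, that a Gaussian copula can give count margins a negative dependence that a shared-shock multivariate Poisson cannot, can be sketched with the standard library alone. The function names are illustrative, and plain Poisson margins stand in for the Poisson/Skellam margins of Chapter 3:

```python
import math
import random

def norm_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pois_quantile(u, lam):
    # Invert the Poisson CDF by accumulating pmf terms; cap u so the
    # loop stays finite under floating-point rounding.
    u = min(u, 1.0 - 1e-12)
    k = 0
    pmf = math.exp(-lam)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

def bivariate_counts(n, lam1, lam2, rho, rng):
    # Gaussian copula: correlated latent normals pushed through Poisson
    # quantile functions. rho < 0 yields negatively correlated counts,
    # which a shared-shock multivariate Poisson cannot produce.
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((pois_quantile(norm_cdf(z1), lam1),
                      pois_quantile(norm_cdf(z2), lam2)))
    return pairs

rng = random.Random(3)
pairs = bivariate_counts(8000, 4.0, 4.0, -0.7, rng)
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
cov = sum((x - mx) * (y - my) for x, y in pairs) / len(pairs)
sx = (sum((x - mx) ** 2 for x in xs) / len(xs)) ** 0.5
sy = (sum((y - my) ** 2 for y in ys) / len(ys)) ** 0.5
corr = cov / (sx * sy)  # clearly negative, somewhat attenuated from rho
```

The attenuation of `corr` relative to the latent ρ is exactly the discretization effect the abstract alludes to: the achievable correlation between the counts covers only a subset of (-1, 1).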

Métodos estatísticos aplicados ao teste de Salmonella/microssoma: modelos, seleção e suas implicações / Statistical methods applied for Salmonella/microsome test data: models, selection and their entailments

Butturi-Gomes, Davi 03 December 2015
The Salmonella/microsome test is a widely used biological assay for evaluating the mutagenic potential of substances that may endanger human health and environmental quality. The response variable is the count of revertant colonies per plate; however, two effects are generally confounded: toxicity and mutagenicity. Several models have been proposed for analysing data from these experiments, but they do not always fit well and do not explicitly consider interactions. Moreover, few computational platforms integrate all of these proposals or provide criteria for adequate model selection, and it is often difficult to compare the effects of different substances across the several available bacterial strains, so measures with a direct biological interpretation are needed. 
In this work, the properties of the predictors of the traditional models were investigated, as well as the behaviour of the sampling distributions of their parameter estimators, under various levels of overdispersion. Experiments were also carried out with the TA98 and TA100 strains of the bacterium, exposed to the insecticides Fipronil and Thiamethoxam, two agrochemicals widely used in Brazil, each with and without metabolic activation. Several models were fitted to the data from these experiments: those traditionally used, as well as new models, some based on Skellam regression and others with explicit interaction terms. To this end, a new class of models, called vector generalized nonlinear models, was obtained, and an R package, entitled "ames", was developed for model fitting, diagnostics, and selection. Finally, measures of biological interest, based on the selected models, were proposed for assessing risk and damage to genetic material, and parametric bootstrap confidence intervals for them were obtained. 
Among the traditional models, those whose estimator sampling distributions had the best normal approximation were those of Bernstein, Breslow, and Myers. These results provide a practical criterion for model selection, particularly in situations where AIC and goodness-of-fit measures, likelihood ratio tests, and residual analysis are either uninformative or simply cannot be applied. From the selected models, it can be concluded that the interaction with the metabolization factor is significant for the TA98 strain exposed to Fipronil, with respect to both toxic and mutagenic effects; that the mechanism of action of Thiamethoxam on the TA98 strain is completely different when the product is metabolized; and that, for the TA100 strain, there was no metabolization effect for either agrochemical. 
Based on the proposed measures, it can be concluded that Thiamethoxam poses the greater risk of residual contamination, even though Fipronil presents the higher mutagenicity indices.
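The parametric bootstrap intervals mentioned above can be sketched for the simplest conceivable index, a Poisson fold-increase in mean revertants over control. The thesis derives its indices from the selected vector generalized nonlinear models; the plate counts and function names below are invented purely for illustration:

```python
import math
import random

def rpois(lam, rng):
    # Knuth's method: multiply uniforms until the product drops below e^-lam.
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def boot_ci_ratio(treated, control, n_boot, rng, alpha=0.05):
    # Parametric bootstrap percentile CI for the fold increase in mean
    # revertant counts: refit the (here: plain Poisson-mean) model on
    # data resimulated from the fitted model, then take percentiles.
    lam_t = sum(treated) / len(treated)
    lam_c = sum(control) / len(control)
    reps = []
    for _ in range(n_boot):
        t = sum(rpois(lam_t, rng) for _ in treated) / len(treated)
        c = sum(rpois(lam_c, rng) for _ in control) / len(control)
        reps.append(t / max(c, 1e-9))
    reps.sort()
    lo = reps[int(n_boot * alpha / 2)]
    hi = reps[int(n_boot * (1 - alpha / 2)) - 1]
    return lam_t / lam_c, lo, hi

# Invented revertant counts, three plates per condition.
rng = random.Random(7)
est, lo, hi = boot_ci_ratio([95, 102, 88], [30, 28, 33], 2000, rng)
```

In the actual analysis the resimulation step would draw from the selected overdispersed model rather than a plain Poisson, but the percentile mechanics are the same.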
7

Métodos estatísticos aplicados ao teste de Salmonella/microssoma: modelos, seleção e suas implicações / Statistical methods applied for Salmonella/microsome test data: models, selection and their entailments

Davi Butturi-Gomes 03 December 2015 (has links)
O teste de Salmonella/microssoma é um ensaio biológico amplamente utilizado para avaliar o potencial mutagênico de substâncias que podem colocar em risco a saúde humana e a qualidade ambiental. A variável resposta é constituída pela contagem do número de colônias revertentes em cada placa, entretanto geralmente há dois efeitos confundidos, o de toxicidade e o de mutagenicidade. Alguns modelos foram propostos para a análise dos dados desses experimentos, que nem sempre apresentam bons ajustes e não consideram explicitamente interações. Há, ainda, poucas plataformas computacionais disponíveis que integram todas essas propostas e forneçam critérios para a seleção adequada de um modelo. Além disso, geralmente é difícil comparar os efeitos de diferentes substâncias sobre as várias linhagens da bactéria, então medidas com interpretação biológica direta são necessárias. Neste trabalho, foram investigadas as propriedades dos preditores dos modelos tradicionais, bem como o comportamento das distribuições amostrais dos estimadores dos parâmetros desses modelos, na presença de diversos níveis de superdispersão. Também, foram realizados experimentos com as linhagens TA98 e TA100 da bactéria, expostas aos inseticidas, metabolizados e não-metabolizados, Fipronil e Tiametoxam, dois agroquímicos bastante utilizados no Brasil. Aos dados desses experimentos foram ajustados diversos modelos, tanto aqueles tradicionalmente utilizados, quanto novos modelos, alguns baseados na regressão de Skellam e outros com interações explícitas. Para tal, foi obtida uma nova classe de modelos chamada de modelos não-lineares vetoriais generalizados e foi desenvolvido um pacote computacional em linguagem R, intitulado \"ames\", para o ajuste, diagnóstico e seleção de modelos. Por fim, foram propostas medidas de interesse biológico, baseadas nos modelos selecionados, para avaliação de risco e do comprometimento do material genético e intervalos de confiança bootstrap paramétrico foram obtidos. 
Dentre os modelos tradicionais, aqueles cujas distribuições amostrais dos estimadores possuem melhor aproximação normal foram os de Bernstein, Breslow e Myers. Estes resultados forneceram um critério prático para a seleção de modelos, particularmente nas situações em que as medidas de AIC e de bondade de ajuste, os testes de razão de verossimilhanças e a análise de resíduos ou são pouco informativos ou simplesmente não podem ser aplicados. A partir dos modelos selecionados, pode-se concluir que a interação do fator de metabolização é significativa para a linhagem TA98 exposta ao Fipronil, tanto com relação aos efeitos tóxicos quanto aos efeitos mutagênicos; que o mecanismo de ação do Tiametoxam sobre a linhagem TA98 é completamente diferente quando o produto está metabolizado; e que, para a linhagem TA100, não houve efeito de metabolização considerando ambos os agroquímicos. Baseando-se nas medidas propostas, pode-se concluir que o Tiametoxam oferece os maiores riscos de contaminação residual, ainda que o Fipronil apresente os maiores índices de mutagenicidade. / The Salmonella/microsome test is a widely accepted biological assay used to evaluate the mutagenic potential of substances, which can compromise human health and environment quality. The response variable in such experiments is typically the total number of reverts per plate, which, in turn, is the result of the confounded effects of mutagenicity and toxicity. Despite of some statistical models have already been established in the literature, they do not always fit well and neither explicitly consider interaction terms. Besides, there is just a number of available software able to handle these different approaches, usually lacking of global performance and model selection criteria. Also, it is often a hard task to compare the effects of different chemicals over the several available strains to perform the assay, and, thus, direct measures of biological implications are required. 
In this work, the properties of the predictors in each traditional model were investigated, as well as the behavior of the sampling distributions of the parameter estimators of these models, under different levels of overdispersion. Experiments using the TA98 and TA100 strains were also performed, exposing the bacteria to two insecticides widely used in Brazil, Fipronil and Thiamethoxam, each before and after metabolization. The traditional models, empirical regression models based on the Skellam distribution, and compound mechanistic-empirical models with explicit interaction terms were then fitted to the data. To provide a single fitting framework, a new class of models was introduced, the vector generalized nonlinear models, and an R package, "ames", was developed for model fitting, diagnostics, and selection. Finally, measures of biological interest were proposed based on the selected models, in the contexts of risk evaluation and assessment of genetic damage, and parametric bootstrap percentile confidence intervals were obtained for these measures. Among the traditional models, those of Bernstein, Breslow, and Myers were the ones whose sampling distributions presented the best normal approximations. These results provided a practical criterion for model selection, particularly in situations where measures such as AIC and goodness of fit, likelihood ratio tests, and residual analysis are either uninformative or simply cannot be applied. From the final selected models, it was inferred that the interaction with the metabolization factor is significant for the TA98 strain exposed to Fipronil, with respect to both mutagenic and toxic effects; that the dynamics between mutagenicity and toxicity differ when Thiamethoxam is metabolized compared to when it is not; and that there was no evidence of metabolization-factor interactions for the TA100 strain exposed to either insecticide.
By applying the proposed measures of biological interest, it was concluded that Thiamethoxam poses the greater risk of residual contamination, even though Fipronil presents the higher mutagenicity indices.
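The two ingredients of this approach can be illustrated in a minimal sketch: the Skellam distribution (the difference of two independent Poisson counts, standing in for the opposing mutagenic and toxic intensities) and a parametric bootstrap percentile interval. The plate counts and intensity values below are made up for illustration, and the code does not reproduce the thesis's actual models or the "ames" package API.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical counts of revertant colonies per plate at one dose level.
counts = np.array([31, 28, 35, 30, 26, 33])

# Skellam(mu1, mu2) is the distribution of the difference of two
# independent Poisson variables; here mu1 and mu2 play the roles of
# mutagenic induction and toxic loss (illustrative values only).
mu_mut, mu_tox = 40.0, 10.0
print(stats.skellam.pmf(30, mu_mut, mu_tox))  # P(net count == 30)

# Parametric bootstrap percentile CI for the mean count under a
# simple Poisson fit (a stand-in for the thesis's selected models):
# refit on data simulated from the fitted model, then take percentiles.
mle = counts.mean()
boot = rng.poisson(mle, size=(2000, counts.size)).mean(axis=1)
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"95% bootstrap CI for the mean count: ({lo:.1f}, {hi:.1f})")
```

The same percentile construction applies to any derived quantity of biological interest: simulate from the selected model, recompute the measure on each replicate, and read off the empirical quantiles.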
8

Statistical Methods for Functional Genomics Studies Using Observational Data

Lu, Rong 15 December 2016 (has links)
No description available.
9

Modelling animal populations

Brännström, Åke January 2004 (has links)
This thesis consists of four papers, three about modelling animal populations and one about an area integral estimate for solutions of partial differential equations on non-smooth domains. The papers are:
I. Å. Brännström, Single species population models from first principles.
II. Å. Brännström and D. J. T. Sumpter, Stochastic analogues of deterministic single species population models.
III. Å. Brännström and D. J. T. Sumpter, Coupled map lattice approximations for spatially explicit individual-based models of ecology.
IV. Å. Brännström, An area integral estimate for higher order parabolic equations.
In the first paper we derive deterministic discrete single species population models with first order feedback, such as the Hassell and Beverton-Holt models, from first principles. The derivations build on the site-based method of Sumpter & Broomhead (2001) and Johansson & Sumpter (2003). A three-parameter generalisation of the Beverton-Holt model is also derived, and one of the parameters is shown to correspond directly to the underlying distribution of individuals. The second paper is about constructing stochastic population models that incorporate a given deterministic skeleton. Using the Ricker model as an example, we construct several stochastic analogues and fit them to data using the method of maximum likelihood. The results show that an accurate stochastic population model is most important when the dynamics are periodic or chaotic, and that the two most common ways of constructing stochastic analogues, using additive normally distributed noise or multiplicative lognormally distributed noise, give models that fit the data well. The latter is also motivated on theoretical grounds. In the third paper we approximate a spatially explicit individual-based model with a stochastic coupled map lattice. The approximation effectively disentangles the deterministic and stochastic components of the model.
Based on this approximation we argue that the stable population dynamics seen for short dispersal ranges is a consequence of increased stochasticity from local interactions and dispersal. Finally, the fourth paper contains a proof that for solutions of higher order real homogeneous constant coefficient parabolic operators on Lipschitz cylinders, the area integral dominates the maximal function in the L2-norm.
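The two stochastic analogues named in the second paper can be sketched around the deterministic Ricker skeleton, N_{t+1} = N_t exp(r(1 - N_t/K)). The parameter values, noise scale, and clipping rule below are illustrative assumptions, not the thesis's fitted models.

```python
import numpy as np

rng = np.random.default_rng(1)

def ricker(n, r=2.5, k=100.0):
    """Deterministic Ricker skeleton: N_{t+1} = N_t * exp(r * (1 - N_t / K))."""
    return n * np.exp(r * (1.0 - n / k))

def simulate(n0, steps, noise, sigma=0.1):
    """Simulate one of two common stochastic analogues of the skeleton."""
    path = [n0]
    for _ in range(steps):
        det = ricker(path[-1])
        if noise == "additive":
            # Additive normally distributed noise around the skeleton.
            nxt = det + rng.normal(0.0, sigma * det)
        else:
            # Multiplicative lognormally distributed noise.
            nxt = det * rng.lognormal(0.0, sigma)
        path.append(max(nxt, 1e-9))  # keep the population positive
    return np.array(path)

additive = simulate(50.0, 100, "additive")
multiplicative = simulate(50.0, 100, "multiplicative")
print(additive[:5])
print(multiplicative[:5])
```

Fitting either analogue by maximum likelihood amounts to maximizing the one-step transition density (normal or lognormal, centered on the skeleton) over the observed series; with r = 2.5 the skeleton is in its oscillatory regime, where the abstract notes the choice of stochastic analogue matters most.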
