Global ETD Search

1	A statistical model for locating regulatory regions in novel DNA sequences Byng, Martyn Charles January 2001 (has links) No description available. 519.5
2	Modelo oculto de Markov para imputação de genótipos de marcadores moleculares: Uma aplicação no mapeamento de QTL utilizando a abordagem bayesiana / Hidden Markov model for imputation of genotypes of molecular markers: An application in QTL mapping using Bayesian approach Medeiros, Elias Silva de 28 August 2014 (has links) Many quantitative traits are significantly influenced by genetic factors, and in general several genes contribute to the variation of one or more quantitative traits. Missing genotype information at molecular markers is a common problem in genetic mapping studies and, consequently, in mapping the loci that control these phenotypic traits (QTL). Unobserved data arise mainly from genotyping errors and uninformative markers. To address this problem, a hidden Markov model was used to infer the missing data. Accuracy measures demonstrated the success of this imputation technique. Once imputed, these data are no longer treated as random variables in the Bayesian inference, which reduces the parameter space of the model. Another major difficulty in QTL mapping is that the number of QTL influencing a given trait is not known with certainty, which raises several problems, among them the dimension of the parameter space and, consequently, obtaining the posterior sample. To overcome this problem, the reversible jump Markov chain Monte Carlo method was proposed, since it allows moves, at each iteration, between models with different numbers of parameters. The Bayesian approach detected five QTL for the trait studied. All analyses were implemented in the statistical software R. Imputation of genotypes QTL mapping Reversible jump MCMC
3	Bayesian surface smoothing under anisotropy Chakravarty, Subhashish 01 January 2007 (has links) Bayesian surface smoothing using splines usually proceeds by choosing the smoothness parameter through data-driven methods such as generalized cross validation. In this methodology, knots of the splines are assumed to lie at the data locations. When anisotropy is present in the data, modeling is done via parametric functions. In the present thesis, we propose a non-parametric approach to Bayesian surface smoothing in the presence of anisotropy. We use eigenfunctions generated by thin-plate splines as our basis functions. Using eigenfunctions does away with having to place knots arbitrarily, as is done customarily. The smoothing parameter, the anisotropy matrix, and other parameters are simultaneously updated by a Reversible Jump Markov Chain Monte Carlo (RJMCMC) sampler. Unique in our implementation is model selection, which is again done concurrently with the parameter updates. Since the posterior distribution of the coefficients of the basis functions for any given model order is available in closed form, we are able to simplify the sampling algorithm in the model selection step. This also helps us isolate the parameters which influence the model selection step. We investigate the relationship between the number of basis functions used in the model and the smoothness parameter and find that a delicate balance exists between the two. Higher values of the smoothness parameter correspond to a larger number of basis functions being selected. A non-parametric approach to Bayesian surface smoothing provides more modeling flexibility. We are not constrained by the parametric form of the covariance used by earlier methods. A Bayesian approach also allows us to include the results obtained from previous analysis of the same data, if any, as prior information. It also allows us to evaluate pointwise estimates of variability of the fitted surface. We believe that our research also poses many questions for future research. Statistics and Probability
4	Bayesian wavelet approaches for parameter estimation and change point detection in long memory processes Ko, Kyungduk 01 November 2005 (has links) The main goal of this research is to estimate the model parameters and to detect multiple change points in the long memory parameter of Gaussian ARFIMA(p, d, q) processes. Our approach is Bayesian and inference is done in the wavelet domain. Long memory processes have been widely used in many scientific fields such as economics, finance and computer science. Wavelets have a strong connection with these processes. The ability of wavelets to simultaneously localize a process in the time and scale domains results in representing many dense variance-covariance matrices of the process in a sparse form. A wavelet-based Bayesian estimation procedure for the parameters of a Gaussian ARFIMA(p, d, q) process is proposed. This entails calculating the exact variance-covariance matrix of a given ARFIMA(p, d, q) process and transforming it into the wavelet domain using the two-dimensional discrete wavelet transform (DWT2). A Metropolis algorithm is used for sampling the model parameters from the posterior distributions. Simulations with different values of the parameters and of the sample size are performed. A real data application to the U.S. GNP data is also reported. Detection and estimation of multiple change points in the long memory parameter is also investigated. The reversible jump MCMC is used for posterior inference. Performance is evaluated on simulated data and on the Nile River dataset. Long Memory Process ARFIMA Models Discrete Wavelet Transform Bayesian Inference Reversible Jump MCMC
7	A Non-parametric Bayesian Method for Hierarchical Clustering of Longitudinal Data Ren, Yan 23 October 2012 (has links) No description available. Statistics cluster analysis Bayesian longitudinal data Dirichlet process mixture (DPM) model reversible jump MCMC Gibbs
8	Mapeamento de QTLs utilizando as abordagens Clássica e Bayesiana / Mapping QTLs: Classical and Bayesian approaches Toledo, Elisabeth Regina de 02 October 2006 (has links) Grain yield and other economically important traits in maize, such as plant height, ear length, and ear diameter, exhibit polygenic inheritance, which makes it difficult to obtain information about the genetic basis of their variation. The number and positions of the loci (QTLs) that control grain yield in maize were estimated. Associations between markers and QTLs were analyzed by composite interval mapping (CIM) and Bayesian interval mapping (BIM). Based on a set of grain yield data from the evaluation of 256 maize progenies genotyped for 139 codominant molecular markers, the methodologies presented made it possible to identify markers associated with QTLs. With CIM, an association between markers and QTLs was considered significant when the likelihood ratio (LR) statistic along the chromosome reached its maximum among the values exceeding the critical threshold LR = 11.5 in the interval considered. Ten QTLs were mapped, distributed over three chromosomes. Together they explained 19.86% of the genetic variance. The predominant types of allelic interaction were partial dominance (four QTLs) and complete dominance (three QTLs). The average degree of dominance was 1.12, indicating complete dominance on average. Most of the alleles favorable to the trait came from the parental line L02-02D, which had the higher grain yield. Adopting the Bayesian approach, Markov chain Monte Carlo (MCMC) sampling methods were implemented to obtain a sample from the posterior distribution of the parameters of interest, incorporating prior beliefs and uncertainty. Summaries of the QTL locations and of their additive and dominance effects were obtained. Reversible jump MCMC (RJMCMC) methods were used for the Bayesian analysis, and the Bayes factor was used to estimate the number of QTLs. With BIM, associations between markers and QTLs were considered significant on four chromosomes, with a total of five QTLs mapped. Together these QTLs explained 13.06% of the genetic variance. Most of the alleles favorable to the trait again came from the parental line L02-02D. Applied Statistics Bayesian interval mapping Composite interval mapping Genetic mapping Grains Likelihood Maize Molecular marker Quantitative trait loci Reversible jump MCMC Yield
9	Stochastic process analysis for Genomics and Dynamic Bayesian Networks inference. Lebre, Sophie 14 September 2007 (has links) (PDF) This thesis is dedicated to the development of statistical and computational methods for the analysis of DNA sequences and gene expression time series. First we study a parsimonious Markov model, the Mixture Transition Distribution (MTD) model, which is a mixture of Markovian transitions. The large number of constraints on the parameters of this model prevents the formulation of an analytical expression for the Maximum Likelihood Estimate (MLE). We propose to approximate the MLE with an EM algorithm. After comparing the performance of this algorithm with results from the literature, we use it to evaluate the relevance of MTD modeling for bacterial DNA coding sequences in comparison with standard Markovian modeling. Then we propose two different approaches for recovering genetic regulation networks. We model these networks with Dynamic Bayesian Networks (DBNs) whose edges describe the dependency relationships between time-delayed gene expression levels. The aim is to estimate the topology of this graph despite the very low number of repeated measurements compared with the number of observed genes. To address this dimensionality problem, we first assume that the dependency relationships are homogeneous, that is, the graph topology is constant across time. We then propose to approximate this graph by considering partial order dependencies. The concept of partial order dependence graphs, already introduced for static and undirected graphs, is adapted and characterized for DBNs using the theory of graphical models. From these results, we develop a deterministic procedure for DBN inference. Finally, we relax the homogeneity assumption by considering a succession of several homogeneous phases. We consider a multiple changepoint regression model. Each changepoint indicates a change in the regression model parameters, which corresponds to the way an expression level depends on the others. Using reversible jump MCMC methods, we develop a stochastic algorithm which simultaneously infers the changepoint locations and the structure of the network within the phases delimited by the changepoints. Both approaches are validated on simulated and real data. [MATH] Mathematics Time series Gene expression Genetic networks Network inference Dynamic Bayesian Networks DBN Changepoints detection Reversible jump MCMC Partial order dependence Mixture Transition Distribution MTD EM algorithm
