11

Reversible Jump Markov Chain Monte Carlo

Neuhoff, Daniel 15 March 2016
The four studies in this dissertation are concerned predominantly with the dynamic behavior of macroeconomic time series. These dynamics are examined both in the context of a simple DSGE model and from the perspective of pure time series models.
12

Online stochastic algorithms / Algorithmes stochastiques en ligne

Li, Le 27 November 2018
This thesis addresses three main subjects. The first is online clustering, for which we introduce a new adaptive stochastic algorithm to cluster online datasets. It relies on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent) estimation of the (unknown and changing) number of clusters. We prove a regret bound for this algorithm and show that the bound is asymptotically minimax under the constraint on the number of clusters. An RJMCMC-flavored implementation is also proposed. The second subject is the sequential learning of principal curves, which seeks to represent a sequence of data by a continuous polygonal curve. To this end, we introduce a procedure based on the MAP of the Gibbs quasi-posterior that yields polygonal lines whose number of segments is chosen automatically. We show that the regret bounds of this procedure and of its adaptive version are sublinear in the time horizon T. In addition, we present a greedy local search implementation that incorporates both sleeping experts and multi-armed bandit ingredients. The third subject covers work on practical tasks within iAdvize, the company that supports this thesis. It includes sentiment analysis of textual messages, using classical methods from text mining and statistics, and the implementation of a chatbot based on natural language processing and artificial neural networks.
13

A Bayesian Approach to Detect the Onset of Activity Limitation Among Adults in NHIS

Bai, Yan 06 May 2005
Data from the 1995 National Health Interview Survey (NHIS) indicate that, due to chronic conditions, the onset of activity limitation typically occurs between ages 40 and 70 (i.e., the proportion of young adults with activity limitation is small and roughly constant with age, and then it starts to change, roughly increasing). We use a Bayesian hierarchical model to detect the change point of a positive activity limitation status (ALS) across twelve domains based on race, gender, and education. We have two types of data: weighted and unweighted. We obtain weighted binomial counts using a regression analysis with the sample weights. Given the proportion of individuals in the population with positive ALS, we assume that the number of individuals with positive ALS in each age group has a binomial probability mass function. The proportions differ across age groups: up to the (unknown) change point they share one beta distribution, and after the change point they share a different beta distribution. We consider two analyses. The first considers each domain individually in its own model, and the second considers the twelve domains simultaneously in a single model to "borrow strength," as in small area estimation. It is reasonable to assume that each domain has its own onset. In the first analysis, we use the Gibbs sampler to fit the model, and a computation of the marginal likelihoods, using an output analysis from the Gibbs sampler, provides the posterior distribution of the change point. We note that a reversible jump sampler fails in this analysis because it tends to get stuck at either age 40 or age 70. In the second analysis, we use the Gibbs sampler to fit only the joint posterior distribution of the twelve change points. This is a difficult problem because the joint density requires the numerical computation of a triple integral at each iteration. The other parameters of the process are obtained using data augmentation by a Metropolis sampler and Rao-Blackwellization. We found that overall the age of onset is about 50 to 60 years.
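The change-point structure described in this abstract (binomial counts whose proportions follow one beta distribution before the change point and another after it) can be illustrated with a small sketch. This is not the thesis's hierarchical model or its Gibbs sampler: the beta hyperparameters are fixed by hand, the prior on the change point is uniform, and the counts are made up, all of which are assumptions for illustration only. With fixed hyperparameters, each age group's marginal likelihood is beta-binomial, so the posterior of the change point can be computed directly.

```python
import math

def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_beta_binomial(y, n, a, b):
    """Log marginal probability of y successes in n trials when p ~ Beta(a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + log_beta(y + a, n - y + b) - log_beta(a, b)

def change_point_posterior(counts, totals, before, after):
    """Posterior over k, the number of age groups in the 'before' regime.

    Assumes a uniform prior on k and fixed Beta hyperparameters
    `before` = (a, b) and `after` = (a, b); both are illustrative choices.
    """
    m = len(counts)
    log_post = []
    for k in range(1, m):  # keep at least one group on each side
        lp = sum(log_beta_binomial(counts[i], totals[i], *before) for i in range(k))
        lp += sum(log_beta_binomial(counts[i], totals[i], *after) for i in range(k, m))
        log_post.append(lp)
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    return {k: w / norm for k, w in zip(range(1, m), weights)}

# Hypothetical data: 8 age groups with 200 respondents each
totals = [200] * 8
counts = [10, 12, 9, 11, 30, 34, 38, 41]
posterior = change_point_posterior(counts, totals, before=(1, 19), after=(4, 16))
best_k = max(posterior, key=posterior.get)
```

Here `best_k` is the most probable number of age groups before the change. A fuller treatment would place priors on the beta hyperparameters and sample them, as the abstract describes.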
14

Méthodes Bayésiennes pour le démélange d'images hyperspectrales / Bayesian methods for hyperspectral image unmixing

Eches, Olivier 14 October 2010
Hyperspectral imagery is widely used in remote sensing for various civilian and military applications. A hyperspectral image is acquired when a single scene is observed at several wavelengths. Consequently, each pixel of such an image is represented by a vector of measurements (generally reflectances) called a spectrum. A major step in the analysis of hyperspectral data consists of identifying the macroscopic components (signatures) present in the observed scene and their corresponding proportions (abundances). The latest techniques developed for this analysis do not model these images properly. Indeed, these techniques usually assume the existence of pure pixels in the image, i.e., pixels containing a single pure material. However, a pixel is rarely composed of spectrally pure elements that are distinct from each other. Thus, estimates based on such models can be far from reality. The aim of this thesis is to propose new estimation algorithms using a model better suited to the intrinsic properties of hyperspectral images. The unknown model parameters are then inferred within a Bayesian framework. Markov chain Monte Carlo (MCMC) methods make it possible to overcome the difficulties related to the computational complexity of these inference methods.
15

Mapeamento de QTLs utilizando as abordagens Clássica e Bayesiana / Mapping QTLs: Classical and Bayesian approaches

Toledo, Elisabeth Regina de 02 October 2006
Grain yield and other economically important traits in maize, such as plant height, ear length, and ear diameter, exhibit polygenic inheritance, which makes it difficult to obtain information about the genetic basis of the variation in these traits. The number and locations of the quantitative trait loci (QTLs) that control grain yield in maize were estimated. Associations between markers and QTLs were analyzed by composite interval mapping (CIM) and Bayesian interval mapping (BIM). Based on a set of grain yield data, obtained from the evaluation of 256 maize progenies genotyped for 139 codominant molecular markers, the presented methodologies allowed the classification of markers associated with QTLs. Under CIM, associations between markers and QTLs were considered significant when the likelihood ratio (LR) statistic along the chromosome reached its maximum among the values exceeding the critical threshold LR = 11.5 in the interval considered. Ten QTLs were mapped, distributed over three chromosomes; together they explained 19.86% of the genetic variance. The predominant types of allelic interaction were partial dominance (four QTLs) and complete dominance (three QTLs). The average degree of dominance was 1.12, indicating complete dominance on average. Most alleles favorable to the trait came from the parental line L02-02D, which had the highest grain yield. Under the Bayesian approach, Markov chain Monte Carlo (MCMC) sampling methods were implemented to obtain a sample from the posterior distribution of the parameters of interest, incorporating prior beliefs and uncertainty. Summaries of the QTL locations and of their additive and dominance effects were obtained. Reversible jump MCMC (RJMCMC) was used for the Bayesian analysis, and the Bayes factor was used to estimate the number of QTLs. Under BIM, associations between markers and QTLs were considered significant on four chromosomes, with a total of five QTLs mapped; together these QTLs explained 13.06% of the genetic variance. Most alleles favorable to the trait again came from the parental line L02-02D.
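The idea of using reversible jump MCMC to choose between models of different dimension, as this abstract does for the number of QTLs, can be shown on a toy problem. The sketch below is not the thesis's QTL model. It assumes two hypothetical models for a handful of made-up observations, M1 with y_i ~ N(0, 1) and M2 with y_i ~ N(mu, 1) and mu ~ N(0, tau^2), with equal prior model probabilities. The chain jumps between the models by adding or dropping `mu`, and the fraction of time spent in M2 is compared with the exact posterior probability obtained from the Bayes factor.

```python
import math
import random

def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2

def rjmcmc_model_probability(y, tau=1.0, prop_sd=1.0, n_iter=50000, seed=0):
    """Estimate P(M2 | y) by jumping between M1 (no mean) and M2 (unknown mean mu)."""
    rng = random.Random(seed)
    n, s = len(y), sum(y)
    post_prec = n + 1.0 / tau ** 2  # posterior precision of mu under M2

    def log_jump_ratio(mu):
        # log[L2(mu) p(mu)] - log[L1 q(mu)], with q = N(0, prop_sd^2); the Jacobian is 1
        log_lik_ratio = mu * s - 0.5 * n * mu * mu
        return (log_lik_ratio + log_normal_pdf(mu, 0.0, tau)
                - log_normal_pdf(mu, 0.0, prop_sd))

    model, mu, visits_m2 = 1, None, 0
    for _ in range(n_iter):
        log_u = math.log(1.0 - rng.random())  # log of a uniform draw on (0, 1]
        if model == 1:
            cand = rng.gauss(0.0, prop_sd)    # birth move: propose a value for mu
            if log_u < log_jump_ratio(cand):
                model, mu = 2, cand
        elif log_u < -log_jump_ratio(mu):     # death move: inverse of the birth ratio
            model, mu = 1, None
        if model == 2:
            # within-model Gibbs update of mu from its conjugate normal posterior
            mu = rng.gauss(s / post_prec, math.sqrt(1.0 / post_prec))
            visits_m2 += 1
    return visits_m2 / n_iter

def exact_model_probability(y, tau=1.0):
    """Closed-form P(M2 | y) from the Bayes factor of M2 against M1."""
    n, s = len(y), sum(y)
    log_bf = -0.5 * math.log(1 + n * tau ** 2) + s * s * tau ** 2 / (2 * (1 + n * tau ** 2))
    bf = math.exp(log_bf)
    return bf / (1 + bf)

y_obs = [0.3, -0.1, 0.8, 0.5, 0.2, 0.9, -0.4, 0.6]  # made-up observations
estimate = rjmcmc_model_probability(y_obs)
exact = exact_model_probability(y_obs)
```

Because the toy model is conjugate, the Bayes factor is available in closed form and serves as a check on the sampler. In a QTL setting no such closed form exists, which is why a sampler over the number of loci is useful.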
16

Stochastic process analysis for Genomics and Dynamic Bayesian Networks inference.

Lebre, Sophie 14 September 2007
This thesis is dedicated to the development of statistical and computational methods for the analysis of DNA sequences and gene expression time series.
First we study a parsimonious Markov model called the Mixture Transition Distribution (MTD) model, which is a mixture of Markovian transitions. The overly high number of constraints on the parameters of this model hampers the formulation of an analytical expression for the Maximum Likelihood Estimate (MLE). We propose to approximate the MLE with an EM algorithm. After comparing the performance of this algorithm with results from the literature, we use it to evaluate the relevance of MTD modeling for bacterial DNA coding sequences in comparison with standard Markovian modeling.
Then we propose two different approaches for recovering genetic regulation networks. We model these genetic networks with Dynamic Bayesian Networks (DBNs) whose edges describe the dependency relationships between time-delayed gene expression levels. The aim is to estimate the topology of this graph despite the overly low number of repeated measurements compared with the number of observed genes.
To address this dimension problem, we first assume that the dependency relationships are homogeneous, that is, the graph topology is constant across time. Then we propose to approximate this graph by considering partial order dependencies. The concept of partial order dependence graphs, already introduced for static and undirected graphs, is adapted and characterized for DBNs using the theory of graphical models. From these results, we develop a deterministic procedure for DBN inference.
Finally, we relax the homogeneity assumption by considering a succession of several homogeneous phases. We consider a multiple changepoint regression model. Each changepoint indicates a change in the regression model parameters, which corresponds to the way an expression level depends on the others. Using reversible jump MCMC methods, we develop a stochastic algorithm that simultaneously infers the changepoint locations and the structure of the network within the phases delimited by the changepoints.
Both approaches are validated on simulated and real data.
17

一種基於BIC的B-Spline節點估計方式 / A BIC-based knot estimation method for B-splines

何昕燁, Ho, Hsin Yeh Unknown Date
In regression analysis, when the relation between the response variable and the explanatory variable is nonlinear, one can use nonparametric methods to estimate the regression function. B-spline regression is one of the popular nonparametric regression methods. B-splines are piecewise polynomials joined at knots, and the choice of knot locations is crucial. Zhou and Shen (2001) proposed spatially adaptive regression splines (SARS), where the knots are estimated using a selection scheme. DiMatteo, Genovese, and Kass (2001) proposed Bayesian adaptive regression splines (BARS), where certain priors for knot locations are considered. In this thesis, a knot estimation method based on the Bayesian information criterion (BIC) is proposed, and simulation studies are carried out to compare the fit and the simulation time of BARS, SARS, and the proposed BIC-based method across different types of regression functions.
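How BIC can trade goodness of fit against the number of knots is easiest to see in a small example. The sketch below is an assumption-laden illustration, not the method proposed in the thesis: it uses linear splines written in the truncated power basis (which spans the same function space as linear B-splines on the same knots), restricts candidates to equally spaced interior knots, and scores each candidate with BIC = n log(RSS / n) + p log n, where p is the number of basis functions. The data are synthetic.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def design_row(x, knots):
    # Linear-spline basis: intercept, slope, and one hinge (x - k)_+ per knot
    return [1.0, x] + [max(0.0, x - k) for k in knots]

def bic_for_knots(xs, ys, knots):
    """Least-squares fit via the normal equations, then BIC for that knot set."""
    rows = [design_row(x, knots) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2 for r, y in zip(rows, ys))
    n = len(xs)
    return n * math.log(rss / n) + p * math.log(n)

def select_knot_count(xs, ys, max_knots=5):
    """Return the number of equally spaced interior knots with the lowest BIC."""
    lo, hi = min(xs), max(xs)
    scores = {}
    for m in range(max_knots + 1):
        knots = [lo + (hi - lo) * (j + 1) / (m + 1) for j in range(m)]
        scores[m] = bic_for_knots(xs, ys, knots)
    return min(scores, key=scores.get), scores

# Synthetic data: a kink at x = 0.5 plus a small deterministic perturbation
xs = [i / 99 for i in range(100)]
ys = [abs(x - 0.5) + 0.01 * math.sin(37 * i) for i, x in enumerate(xs)]
best_m, scores = select_knot_count(xs, ys)
```

On this data a single knot at 0.5 captures the kink, and additional knots lower the residual sum of squares too little to offset the log n penalty per extra parameter. Methods such as SARS and BARS also search over knot locations, which this sketch does not attempt.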
18

Bayesian Uncertainty Quantification for Large Scale Spatial Inverse Problems

Mondal, Anirban August 2011
We consider a Bayesian approach to nonlinear inverse problems in which the unknown quantity is a high-dimensional spatial field. The Bayesian approach contains a natural mechanism for regularization in the form of prior information, can incorporate information from heterogeneous sources, and provides a quantitative assessment of uncertainty in the inverse solution. The Bayesian setting casts the inverse solution as a posterior probability distribution over the model parameters. The Karhunen-Loève expansion and the discrete cosine transform were used for dimension reduction of the random spatial field. Furthermore, we used a hierarchical Bayes model to inject multiscale data into the modeling framework. In this Bayesian framework, we have shown that the inverse problem is well-posed by proving that the posterior measure is Lipschitz continuous with respect to the data in the total variation norm. The need for multiple evaluations of the forward model on a high-dimensional spatial field (e.g., in the context of MCMC), together with the high dimensionality of the posterior, results in many computational challenges. We developed a two-stage reversible jump MCMC method that can screen out bad proposals in a first, inexpensive stage. Channelized spatial fields were represented by facies boundaries and variogram-based spatial fields within each facies. Using a level-set based approach, the shape of the channel boundaries was updated with dynamic data using a Bayesian hierarchical model in which the number of points representing the channel boundaries is assumed to be unknown. Statistical emulators on a large-scale spatial field were introduced to avoid the expensive likelihood calculation, which contains the forward simulator, at each iteration of the MCMC step. To build the emulator, the original spatial field was represented by a low-dimensional parameterization using the discrete cosine transform (DCT), and then the Bayesian approach to multivariate adaptive regression splines (BMARS) was used to emulate the simulator. Various numerical results were presented by analyzing simulated as well as real data.
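The two-stage idea mentioned in this abstract, where a cheap first stage screens proposals before the expensive forward model is run, can be sketched in a fixed-dimension setting. The code below is a generic two-stage (delayed-acceptance) Metropolis sampler, not the thesis's reversible jump method: the target and surrogate densities are hypothetical one-dimensional stand-ins, and the function and parameter names are chosen only for this example. The second-stage acceptance ratio corrects for the surrogate, so the chain still targets the expensive density.

```python
import math
import random

def two_stage_metropolis(log_target, log_surrogate, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis with a cheap screening stage before the costly one."""
    rng = random.Random(seed)
    x = x0
    lt_x, ls_x = log_target(x), log_surrogate(x)
    samples, expensive_calls = [], 1
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # symmetric proposal
        ls_y = log_surrogate(y)
        # Stage 1: accept or reject using only the cheap surrogate
        if math.log(1.0 - rng.random()) < ls_y - ls_x:
            # Stage 2: evaluate the expensive target and correct for the surrogate
            lt_y = log_target(y)
            expensive_calls += 1
            if math.log(1.0 - rng.random()) < (lt_y - lt_x) - (ls_y - ls_x):
                x, lt_x, ls_x = y, lt_y, ls_y
        samples.append(x)
    return samples, expensive_calls

def log_target(x):
    # Stand-in for an expensive posterior: standard normal, up to a constant
    return -0.5 * x * x

def log_surrogate(x):
    # Deliberately imperfect cheap approximation: N(0.2, 1.3^2), up to a constant
    return -0.5 * ((x - 0.2) / 1.3) ** 2

n_steps = 20000
samples, expensive_calls = two_stage_metropolis(log_target, log_surrogate, 0.0, n_steps)
mean = sum(samples) / len(samples)
var = sum((v - mean) ** 2 for v in samples) / len(samples)
```

Proposals rejected in the first stage never reach `log_target`, so `expensive_calls` stays below the number of proposals. The savings grow as the surrogate gets closer to the true posterior, which is the role the emulator plays in the abstract.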
19

Algoritmo ejeção-absorção metropolizado para segmentação de imagens / Metropolized ejection-absorption algorithm for image segmentation

Calixto, Alexandre Pitangui 19 December 2014
We propose a new split-merge MCMC algorithm for image segmentation. We describe how an image can be subdivided into multiple disjoint regions, with each region having an associated latent indicator variable. The latent indicator variables are modeled with a Gibbs prior distribution governed by a spatial regularization parameter. Regions with the same label define a component. Pixels within a component are distributed according to a Gaussian distribution. We treat the spatial regularization parameter and the number of components K as unknown. To estimate K, the spatial regularization parameter, and the component parameters, we propose the Metropolised split-merge (MSM) algorithm. The MSM comprises two types of moves. The first is a data-driven split-merge move. These moves change the number of components K to a neighboring value K ± 1 and are accepted according to the Metropolis-Hastings acceptance probability. After a split-merge step, the component parameters, the spatial regularization parameter, and the latent allocation variables are updated conditional on K using Gibbs sampling, the Metropolis-Hastings algorithm, and the Swendsen-Wang algorithm, respectively. The main advantage of the proposed algorithm is that it is easy to implement and that the acceptance probability for split-merge moves depends only on the observed data. The performance of the proposed algorithm is verified using artificial as well as real datasets. / In this thesis, we model an image as a regular rectangular grid and assume that this grid is divided into multiple disjoint regions of pixels. When two or more regions share the same characteristic, their union forms a set called a component. We associate with each pixel of the image an unobservable indicator variable that identifies the component to which the pixel belongs. These unobservable indicator variables are modeled by the Gibbs probability distribution with a spatial regularization parameter. We assume that this parameter and the number of components K are unknown. For joint estimation of the parameters of interest, we propose an MCMC algorithm called Metropolized ejection-absorption (EAM). Some advantages of the proposed algorithm are: (i) it does not require the specification of a transition function to carry out the ejection and absorption moves, unlike the reversible jump (RJ) algorithm, which requires good transition functions to be computationally efficient; (ii) the ejection and absorption moves are built from the observed data and can be proposed and tested quickly; (iii) new components are created from information drawn from regions of observations, and the parameters of the new components are generated from their posterior distributions. We illustrate the performance of the EAM algorithm using simulated and real datasets. Funding agency: Financiadora de Estudos e Projetos.
