301 |
Modèles stochastiques pour la reconstruction tridimensionnelle d'environnements urbains / Stochastic models for the three-dimensional reconstruction of urban environments
Lafarge, Florent 02 October 2007
This thesis addresses the three-dimensional reconstruction of urban areas from very-high-resolution satellite images. The information content of such data is too limited for the many algorithms developed for aerial data to be used effectively. In this context, strong prior knowledge about urban areas must be introduced, and stochastic tools are particularly well suited to this problem.

We propose a structural approach, which models a building as an assembly of elementary urban modules drawn from a library of parametric 3D models. First, we extract the 2D supports of these modules from a Digital Elevation Model (DEM). The result is an arrangement of quadrilaterals in which neighbouring elements are connected to one another. Next, we reconstruct the buildings by searching for the optimal configuration of 3D models fitted to the previously extracted supports. This configuration is the realization that maximizes a density measuring the coherence between the realization and the DEM, while also taking into account prior knowledge such as rules for assembling the modules. Finally, we discuss the relevance of the approach by analysing results obtained from satellite data (PLEIADES simulations). Experiments are also carried out on aerial images with higher resolution.
|
302 |
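Modélisation markovienne des dynamiques d'usages des sols. Cas des parcelles situées sur le bord du corridor forestier Ranomafana-Andringitra / Markov modelling of land-use dynamics: the case of plots located on the edge of the Ranomafana-Andringitra forest corridor
Raherinirina, Angelo 08 February 2013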
We propose a Markovian approach to the inference and modelling of agrarian dynamics, applied to the use of plots located on the edge of the forest corridor linking the Ranomafana and Andringitra national parks. Preserving the forest of Madagascar's east coast is crucial, so it is worthwhile to develop tools for better understanding the dynamics of deforestation, of subsequent plot use, and of the plots' possible return to forest. We rely on two field datasets compiled by the IRD. In this kind of study, a preliminary step is to build the empirical transition matrix, which amounts to a Markovian model of the dynamics. Within this framework we consider the maximum-likelihood approach and the Bayesian approach. The latter lets us incorporate information that is absent from the data but recognized by specialists; it relies on Markov chain Monte Carlo (MCMC) approximation techniques. We study the asymptotic properties of the models obtained with the two approaches, in particular the time to convergence towards the quasi-stationary distribution in the first case and towards the stationary distribution in the second. We test various hypotheses about the models. The Markovian approach is no longer valid for the second, larger dataset, which required a semi-Markov approach: the sojourn-time distributions in a given state are no longer necessarily geometric and may depend on the next state. We again use the maximum-likelihood and Bayesian approaches, and we study the asymptotic behaviour of each of these models. In applied terms, we were able to determine the time scales of these dynamics.
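As a rough illustration of the preliminary step mentioned in this abstract, an empirical transition matrix can be obtained by counting observed transitions between consecutive states and normalizing each row, which is also the maximum-likelihood estimate for a time-homogeneous Markov chain. The state labels and the toy parcel histories below are hypothetical and are not taken from the IRD datasets; this is a minimal sketch, not the thesis's implementation.

```python
from collections import Counter


def empirical_transition_matrix(sequences, states):
    """Row-normalized transition counts (maximum-likelihood estimate)."""
    counts = Counter()
    for seq in sequences:
        # Count each pair of consecutive states within one parcel history.
        for current, nxt in zip(seq, seq[1:]):
            counts[(current, nxt)] += 1
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        # Assumption of this sketch: a state never left in the data gets a zero row.
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0) for t in states
        }
    return matrix


# Hypothetical yearly land-use histories for three parcels.
STATES = ["forest", "crop", "fallow"]
PARCELS = [
    ["forest", "crop", "crop", "fallow"],
    ["forest", "forest", "crop", "fallow"],
    ["crop", "fallow", "fallow", "forest"],
]
P_HAT = empirical_transition_matrix(PARCELS, STATES)
```

Each row of `P_HAT` is the estimated distribution of next year's state given the current one; a Bayesian variant would typically replace the raw counts with counts plus prior pseudo-counts.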
|
303 |
Bayesian Cluster Analysis: Some Extensions to Non-standard Situations
Franzén, Jessica January 2008
The Bayesian approach to cluster analysis is presented. We assume that all data stem from a finite mixture model, where each component corresponds to one cluster and is given by a multivariate normal distribution with unknown mean and variance. The method produces posterior distributions of all cluster parameters and proportions, as well as associated cluster probabilities for all objects. We extend this method in several directions to some common but non-standard situations. The first extension covers the case with a few deviant observations that do not belong to any of the normal clusters. An extra component/cluster is created for them, which has a larger variance or a different distribution, e.g. a uniform distribution over the whole range. The second extension is the clustering of longitudinal data. All units are clustered at each time point separately, and the movements between time points are modeled by Markov transition matrices. This means that the clustering at one time point is affected by what happens at the neighbouring time points. The third extension handles datasets with missing data, e.g. item non-response. We impute the missing values iteratively in an extra step of the Gibbs sampler estimation algorithm. Bayesian inference for mixture models has many advantages over the classical approach. However, it is not without computational difficulties. A software package for Bayesian inference of mixture models, written in Matlab, is introduced. The programs of the package handle the basic cases of clustering data that are assumed to arise from mixture models of multivariate normal distributions, as well as the non-standard situations.
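To make the "associated cluster probabilities" concrete, the sketch below computes, for fixed mixture parameters, the probability that an observation belongs to each cluster. It is deliberately simplified to the univariate case (the abstract uses multivariate normals), it is written in Python rather than the Matlab package the abstract mentions, and all function names and parameter values are hypothetical. In a Gibbs sampler of this kind, such probabilities are commonly the ones used to draw each object's cluster allocation.

```python
import math


def normal_pdf(x, mean, variance):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def cluster_probabilities(x, weights, means, variances):
    """P(cluster k | x) for a univariate normal mixture with given parameters."""
    # Joint weight of "x came from cluster k": mixing proportion times density.
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-cluster mixture with equal proportions and unit variances.
WEIGHTS = [0.5, 0.5]
MEANS = [0.0, 4.0]
VARIANCES = [1.0, 1.0]

PROBS_MID = cluster_probabilities(2.0, WEIGHTS, MEANS, VARIANCES)  # halfway point
PROBS_LOW = cluster_probabilities(0.0, WEIGHTS, MEANS, VARIANCES)  # at first mean
```

An observation halfway between the two means is assigned to either cluster with equal probability, while one at the first cluster's mean is assigned to it almost surely.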
|
304 |
Estimation bayésienne nonparamétrique de copules / Bayesian nonparametric estimation of copulas
Guillotte, Simon January 2008
Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal.
|
305 |
Bayesian and Frequentist Approaches for the Analysis of Multiple Endpoints Data Resulting from Exposure to Multiple Health Stressors
Nyirabahizi, Epiphanie 08 March 2010
In risk analysis, benchmark dose (BMD) methodology is used to quantify the risk associated with exposure to stressors such as environmental chemicals. It consists of fitting a mathematical model to the exposure data; the BMD is the dose expected to result in a pre-specified response, the benchmark response (BMR). Most available exposure data come from single-chemical exposures, but living organisms are exposed to multiple sources of hazard. Furthermore, in some studies researchers may observe multiple endpoints on one subject. Statistical approaches to the multiple-endpoints problem can be partitioned into a dimension-reduction group and a dimension-preserving group. As a dimension-reduction method, composite scores based on desirability functions are used to evaluate the neurotoxic effects of a mixture of five organophosphate (OP) pesticides along a fixed mixing-ratio ray, with five endpoints observed. Then a Bayesian hierarchical model is introduced, as a single unifying dimension-preserving method, to evaluate the risk associated with exposure to chemical mixtures. For a pre-specified vector of BMRs of interest, the method estimates a tolerable area, referred to as the benchmark dose tolerable area (BMDTA), in multidimensional Euclidean space. The endpoints defining the BMDTA are determined, and model uncertainty and model selection are addressed using Bayesian Model Averaging (BMA).
|
306 |
Modélisation de la variabilité inter-individuelle dans les modèles de croissance de plantes et sélection de modèles pour la prévision / Modelling inter-individual variability in plant growth models and model selection for prediction
Baey, Charlotte 28 February 2014
The modelling of plant growth and development emerged at the end of the twentieth century at the intersection of three disciplines: agronomy, botany and computer science. After a first period that produced a large number of models, a new trend began in the last decade to give these models a rigorous mathematical and statistical formalism. This thesis is part of that effort and focuses on two main areas of development: (i) model evaluation and comparison, and (ii) inter-individual variability in plant populations.

In the first part of the thesis, we study the predictive capacity of plant growth models and apply a two-step methodology to build and evaluate models intended as predictive tools. In the first step, a sensitivity analysis identifies the most influential parameters and yields a more robust version of each model; in the second step, the predictive capacities of the models are compared using appropriate criteria. The study is carried out on sugar beet but can be generalized to other species.

The second part of the thesis concerns inter-individual variability in plant populations, which can be very high owing to genetic or environmental factors. This variability is rarely accounted for despite the major impact it can have at the agrosystem level. We propose to take it into account using (nonlinear) mixed-effects models, for which maximum-likelihood parameter estimation relies on stochastic variants of the Expectation-Maximization algorithm based on Markov chain Monte Carlo simulation. We first apply this approach to organogenesis in sugar beet populations, and then develop an extension of the functional-structural plant growth model Greenlab from the individual to the population scale, applied to sugar beet and rapeseed.
|
307 |
Generating Evidence for COPD Clinical Guidelines Using EHRs
Amber M Johnson (7023350) 14 August 2019
The Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines are used to guide clinical practice for treating Chronic Obstructive Pulmonary Disease (COPD). GOLD focuses heavily on stable COPD patients, limiting its use for non-stable COPD patients such as those with severe, acute exacerbations of COPD (AECOPD) that require hospitalization. Although AECOPD can be heterogeneous, it can lead to deterioration of health and early death. Electronic health records (EHRs) can be used to analyze patient data for understanding disease progression and generating guideline evidence for AECOPD patients. However, because of its structure and representation, retrieving, analyzing, and properly interpreting EHR data can be challenging, and existing tools do not provide granular analytic capabilities for this data.

This dissertation presents, develops, and implements a novel approach that systematically captures the effect of interventions during patient medical encounters, and hence may support evidence generation for clinical guidelines in a systematic and principled way. A conceptual framework that structures components, such as data storage, aggregation, extraction, and visualization, to support EHR data analytics for granular analysis is introduced. We develop a software framework in Python based on these components to create longitudinal representations of raw medical data extracted from the Medical Information Mart for Intensive Care (MIMIC-III) clinical database. The software framework consists of two tools: Patient Aggregated Care Events (PACE), a novel tool for constructing and visualizing entire medical histories of both individual patients and patient cohorts, and Mark SIM, a Markov Chain Monte Carlo modeling and simulation tool for predicting clinical outcomes through probabilistic analysis that captures granular temporal aspects of aggregated clinical data.

We assess the efficacy of antibiotic treatment and the optimal time of initiation for hospitalized AECOPD patients as an application of probabilistic modeling. We identify 697 AECOPD patients, of whom 26.0% were administered antibiotics. Our model simulations show a 50% decrease in mortality rate as the number of patients administered antibiotics increases, and an estimated 5.5% mortality rate when antibiotics are first administered after 48 hours, versus 1.8% when they are first administered between 24 and 48 hours. Our findings suggest that there may be a mortality benefit in initiating antibiotics early in ICU patients with severe AECOPD and acute respiratory failure.

Thus, we show that it is feasible to enhance the representation of EHRs to aggregate patients' entire medical histories with temporal trends and to support complex clinical questions that drive clinical guidelines for COPD.
|
308 |
An application of Bayesian Hidden Markov Models to explore traffic flow conditions in an urban area
Andersson, Lovisa January 2019
This study employs Bayesian Hidden Markov Models as a method to explore vehicle traffic flow conditions in an urban area in Stockholm, based on sensor data from separate road positions. Inter-arrival times are used as the observed sequences. These sequences of inter-arrival times are assumed to be generated from the distributions of four different (and hidden) traffic flow states: nightly free flow, free flow, mixture and congestion. The filtered and smoothed probability distributions of the hidden states and the most probable state sequences are obtained using the forward, forward-backward and Viterbi algorithms. The No-U-Turn sampler is used to sample from the posterior distributions of all unknown parameters. The results show satisfactorily that the Hidden Markov Models can detect different traffic flow conditions. Some of the models have problems with divergence, but their results are still satisfactory. In fact, two of the models that converged seemed to overestimate the presence of congested traffic, while all the models that did not converge seemed to estimate the probability of being in a congested state adequately. Since the interest of this study lies in estimating the current traffic flow condition, and not in parameter inference, Bayesian Hidden Markov Models are a satisfactory model choice. Because the problem studied is unsupervised, it is difficult to evaluate the accuracy of the results. However, a model with simulated data and known states was also implemented, and it achieved high classification accuracy. This indicates that Hidden Markov Models are a good choice for estimating traffic flow conditions.
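The filtering step named in this abstract can be sketched with the forward algorithm. The example below is a simplified illustration under stated assumptions, not the thesis's model: it uses two hidden states instead of four, exponential emission densities for the inter-arrival times, and fixed parameter values, all of which are hypothetical (the thesis samples its parameters with the No-U-Turn sampler).

```python
import math


def exponential_pdf(x, rate):
    """Density of an exponential distribution with the given rate at x >= 0."""
    return rate * math.exp(-rate * x)


def forward_filter(observations, initial, transition, rates):
    """Return the filtered probabilities P(state_t | obs_1..t) for every t."""
    n_states = len(initial)
    filtered = []
    previous = initial
    for t, obs in enumerate(observations):
        if t == 0:
            predicted = previous
        else:
            # One-step prediction: propagate the last filtered distribution.
            predicted = [
                sum(previous[i] * transition[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Update with the emission density, then normalize to avoid underflow.
        unnormalized = [
            predicted[j] * exponential_pdf(obs, rates[j]) for j in range(n_states)
        ]
        total = sum(unnormalized)
        previous = [u / total for u in unnormalized]
        filtered.append(previous)
    return filtered


# Hypothetical parameters: state 0 has long gaps between vehicles, state 1 short gaps.
INITIAL = [0.5, 0.5]
TRANSITION = [[0.9, 0.1], [0.1, 0.9]]
RATES = [0.2, 2.0]  # mean inter-arrival times of 5 s and 0.5 s
GAPS = [6.0, 4.5, 5.2, 0.3, 0.4, 0.2]  # made-up inter-arrival times in seconds

FILTERED = forward_filter(GAPS, INITIAL, TRANSITION, RATES)
```

`FILTERED[t]` gives the state probabilities at time `t` given the data so far; smoothing and the most probable state sequence would additionally need the backward pass and the Viterbi recursion, which are omitted here.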
|
309 |
Mapeamento de QTLs utilizando as abordagens Clássica e Bayesiana / Mapping QTLs: Classical and Bayesian approaches
Toledo, Elisabeth Regina de 02 October 2006
Grain yield and other economically important traits in maize, such as plant height, ear length and ear diameter, exhibit polygenic inheritance, which makes it difficult to obtain information about the genetic basis of their variation. The number and positions of the quantitative trait loci (QTLs) controlling grain yield in maize were estimated. Associations between markers and QTLs were analysed by composite interval mapping (CIM) and Bayesian interval mapping (BIM). Using a grain yield dataset from the evaluation of 256 maize progenies genotyped for 139 codominant molecular markers, the methods presented made it possible to classify markers associated with QTLs.

With CIM, an association between a marker and a QTL was considered significant when the likelihood ratio (LR) statistic along the chromosome reached its maximum among the values exceeding the critical threshold LR = 11.5 in the interval considered. Ten QTLs were mapped, distributed over three chromosomes; together they explained 19.86% of the genetic variance. The predominant types of allelic interaction were partial dominance (four QTLs) and complete dominance (three QTLs). The average degree of dominance was 1.12, indicating complete dominance on average. Most of the alleles favourable to the trait came from the parental line L02-02D, which had the higher grain yield.

For the Bayesian approach, Markov chain Monte Carlo (MCMC) sampling methods were implemented to obtain a sample from the posterior distribution of the parameters of interest, incorporating prior beliefs and uncertainty. Summaries of the QTL locations and of their additive and dominance effects were obtained. Reversible-jump MCMC (RJMCMC) was used for the Bayesian analysis, and Bayes factors were used to estimate the number of QTLs. With BIM, associations between markers and QTLs were considered significant on four chromosomes, with a total of five QTLs mapped; together these explained 13.06% of the genetic variance. Most of the alleles favourable to the trait also came from the parental line L02-02D.
|
310 |
Imputação múltipla: comparação e eficiência em experimentos multiambientais / Multiple Imputations: comparison and efficiency of multi-environmental trials
Silva, Maria Joseane Cruz da 19 July 2012
In genotype × environment trials, missing values are common, owing to an insufficient quantity of genotypes for application. This hampers, for example, the recommendation of more productive genotypes, because most multivariate statistical techniques require a complete data matrix. Methods that estimate the missing values from the available data, known as data imputation (single and multiple), are therefore applied, taking into account the pattern and mechanism of the missing data. The goal of this study is to evaluate the efficiency of distribution-free multiple imputation (IMLD) (BERGAMO et al., 2008; BERGAMO, 2007) by comparing it with multiple imputation by Markov chain Monte Carlo (IMMCMC) for imputing missing units in genotype (25) × environment (7) interaction trials. The data come from a randomized block experiment with Eucalyptus grandis (LAVORANTI, 2003), from which percentages of the values (10%, 20%, 30%) were removed at random and then imputed by the methods considered. The results for each method showed that the relative efficiency remained above 90% at all percentages, being lowest for environment (4) when imputed with IMLD. As the proportion of missing data increased, the overall accuracy measure was larger when the missing values were imputed with IMMCMC, whereas for IMLD it varied and was smallest at 20% random removal. Among these findings, it is important to note that the IMMCMC method relies on the assumption of normality, whereas the IMLD method has the advantage of imposing no restriction on the distribution of the data or on the missingness mechanisms and patterns.
|