401 |
Contributions à la description de signaux, d'images et de volumes par l'approche probabiliste et statistique / Alata, Olivier 04 October 2010
The main elements of this synthesis document are the following:
- A demonstration of the relevance of the information criterion $\phi_\beta$, which can be "tuned" by learning $\beta$ for any model selection problem in which an information criterion can be written. This possibility is illustrated in several application contexts (linear prediction supports and the dimension of the model used for $\dot{V}O_2$ kinetics).
- A histogram estimation method that describes the distribution of samples non-parametrically, and its use in supervised recognition of distributions in a transmission channel context.
- A so-called "top-down comparative" method ("comparative descendante") that finds the best combination of parameters for describing the data under study without testing every combination, illustrated on the selection of 1-D and 2-D linear prediction supports.
- Model selection strategies for varied contexts, such as PET imaging with Gaussian and Poisson mixture distributions, or color spaces with multidimensional Gaussian mixture distributions.
- An exploration of complex vector linear prediction models on images represented in color spaces that separate luminance from chromatic content, and their use in texture characterization for classifying textures or segmenting color textured images.
- Contributions to segmentation: optimization of an unsupervised segmentation method for gray-level textured images; a new supervised segmentation method for color textured images that exploits psychovisual color spaces and complex vector linear prediction errors; and the incorporation of geometric and topological information about the region field into Gibbs distributions, to perform "high-level" 3-D segmentation using the formalism of point processes.
- An illustration of MCMC methods in varied contexts, such as parameter estimation, 2-D or 3-D segmentation, and process simulation.
Many other elements will emerge on reading the document.
|
402 |
Modélisation de la variabilité inter-individuelle dans les modèles de croissance de plantes et sélection de modèles pour la prévision / Baey, Charlotte 28 February 2014
Plant growth modelling emerged at the end of the 20th century, at the intersection of three disciplines: agronomy, botany and computer science. After an initial surge that produced a large number of models, a second movement arose over the last decade to give these models a mathematical and statistical formalism. The work developed in this thesis follows this approach and proposes two lines of development, one on the evaluation and comparison of models, and the other on the study of inter-plant variability. First, we study the predictive capacity of plant growth models by applying a methodology for building and evaluating models intended for use as predictive tools. A first step of sensitivity analysis identifies the most influential parameters in order to develop a more robust version of each model; the predictive capacities of the models are then compared using appropriate criteria. This study was applied to sugar beet. The second part of the thesis concerns accounting for inter-individual variability in plant populations. There is strong variability between plants, of genetic or environmental origin, which must be taken into account. We propose an approach based on (nonlinear) mixed-effects models to characterize this variability. Maximum likelihood parameter estimation requires stochastic versions of the Expectation-Maximization algorithm based on Markov chain Monte Carlo simulations.
After a first application to organogenesis in sugar beet, we propose an extension of the structure-function model Greenlab to the population scale, applied to sugar beet and rapeseed.
|
403 |
Modelos não lineares truncados mistos para locação e escala / Paraiba, Carolina Costa Mota 14 January 2015
Previous issue date: 2015-01-14 / Financiadora de Estudos e Projetos / We present a class of nonlinear truncated mixed-effects models in which the truncated nature of the data is incorporated into the statistical model by assuming that the variable of interest, namely the truncated variable, follows a truncated distribution. This distribution, in turn, is the conditional distribution obtained by restricting the support of a given probability distribution. The family of nonlinear truncated mixed-effects models for location and scale is constructed from the perspective of nonlinear generalized mixed-effects models, assuming that the distribution of the response variable belongs to a truncated class of distributions indexed by a location and a scale parameter. The location parameter of the response variable is assumed to be associated with a continuous nonlinear function of covariates and unknown parameters and with unobserved random effects, and the scale parameter of the responses is assumed to be characterized by a continuous function of the covariates and unknown parameters. The proposed models are constructed assuming random truncation limits; however, truncated nonlinear mixed-effects models with fixed known limits are readily obtained as particular cases. For models constructed under the assumption of random truncation limits, the likelihood function of the observed data is a function both of the parameters of the truncated distribution of the truncated variable and of the parameters of the distributions of the truncation variables. For the particular case of fixed known truncation limits, the likelihood function of the observed data is a function only of the parameters of the truncated distribution assumed for the variable of interest.
The likelihood equations resulting from the proposed truncated nonlinear regression models do not have analytical solutions, and thus, under the frequentist inferential perspective, the model parameters are estimated by direct maximization of the log-likelihood using an iterative procedure. We also consider diagnostic analysis to check for model misspecification, outliers and influential observations using standardized residuals and global and local influence metrics. Under the Bayesian perspective of statistical inference, parameter estimates are computed as posterior means of draws from the posterior distribution of the parameters, obtained using a Markov chain Monte Carlo procedure. Posterior predictive checks, Bayesian standardized residuals and Bayesian influence measures are considered to check for model adequacy, outliers and influential observations. As Bayesian model selection criteria, we consider the sum of log-CPO and a Bayesian model selection procedure using a Bayesian mixture model framework. To illustrate the proposed methodology, we analyze soil-water retention data, which are used to construct soil-water characteristic curves and which are subject to truncation, since soil-water content (the proportion of water in soil samples) is limited by the residual soil-water content and the saturated soil-water content.
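The construction described in this abstract, a distribution restricted to a bounded support, can be made concrete with a small sketch. The code below assumes a normal base distribution and fixed known limits (the simplest particular case mentioned in the abstract); the function names and the numeric values are illustrative and are not taken from the thesis.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(y, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def truncated_normal_logpdf(y, mu, sigma, lower, upper):
    """Log-density of N(mu, sigma^2) conditioned on lower < y < upper."""
    if not (lower < y < upper):
        return float("-inf")
    # Probability mass that the untruncated distribution puts on (lower, upper).
    mass = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
    return math.log(normal_pdf(y, mu, sigma)) - math.log(mass)


def log_likelihood(ys, mus, sigma, lower, upper):
    """Sum of truncated log-densities; mus holds one location per observation."""
    return sum(
        truncated_normal_logpdf(y, m, sigma, lower, upper)
        for y, m in zip(ys, mus)
    )


if __name__ == "__main__":
    # Hypothetical water contents bounded by residual (0.05) and saturated (0.45) values.
    print(log_likelihood([0.20, 0.31], [0.22, 0.30], 0.05, 0.05, 0.45))
```

In a mixed-effects setting the locations in `mus` would come from a nonlinear function of covariates plus random effects, and the log-likelihood would be maximized or sampled from; that part is omitted here.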
|
404 |
Stochastic Modelling of Vehicle-Structure Interactions : Dynamic State And Parameter Estimation, And Global Response Sensitivity Analysis / Abhinav, S January 2016
The analysis of vehicle-structure interaction systems plays a significant role in the design and maintenance of bridges. In recent years, the assessment of the health of existing bridges and the design of new ones have gained significance, in part due to the progress made in the development of faster moving locomotives, the desire for lighter bridges, and the imposition of performance criteria against rare events such as earthquakes and fire. A probabilistic analysis would address these issues, and also assist in determining reliability and in estimating the remaining life of the structure. In this thesis, we aim to develop tools for the probabilistic analysis techniques of state estimation, parameter identification and global response sensitivity analysis of vehicle-structure interaction systems, which are also applicable to the broader class of structural dynamical systems. The thesis is composed of six chapters and three appendices. The contents of these chapters and the appendices are described in brief in the following paragraphs.
In chapter 1, we introduce the problem of probabilistic analysis of vehicle-structure interactions. The introduction is organized in three parts, dealing separately with issues of forward problems, inverse problems, and global response sensitivity analysis. We begin with an overview of the modelling and analysis of vehicle-structure interaction systems, including the application of spatial substructuring and mesh partitioning schemes. Following this, we describe Bayesian techniques for state and parameter estimation for the general class of state-space models of dynamical systems, including the application of the Kalman filter and particle filters for state estimation, MCMC sampling based filters for parameter identification, and the extended Kalman filter, the unscented Kalman filter and the ensemble Kalman filter for the problem of combined state and parameter identification. In this context, we present the Rao-Blackwellization method, which leads to variance reduction in particle filtering. Finally, we present the techniques of global response sensitivity analysis, including Sobol’s analysis and distance-based measures of sensitivity indices. We provide an outline and a review of literature on each of these topics. In our review of literature, we identify the difficulties encountered when applying these tools to problems involving vehicle-structure interaction systems and, corresponding to these issues, we identify some open problems for research. These problems are addressed in chapters 2, 3, 4 and 5.
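As a point of reference for the state estimation techniques listed in chapter 1, here is a minimal sketch of the Kalman filter for a scalar linear state-space model. It assumes the model x_k = a x_{k-1} + w_k, y_k = h x_k + v_k with Gaussian noises of variance q and r; the symbols and the function name are illustrative choices, not notation from the thesis.

```python
def kalman_filter_1d(observations, a, q, h, r, m0, p0):
    """Scalar Kalman filter.

    Model (assumed for illustration):
        x_k = a * x_{k-1} + w_k,   w_k ~ N(0, q)
        y_k = h * x_k     + v_k,   v_k ~ N(0, r)
    Returns the filtered means and variances after each observation.
    """
    m, p = m0, p0
    means, variances = [], []
    for y in observations:
        # Prediction step: push the previous estimate through the dynamics.
        m_pred = a * m
        p_pred = a * a * p + q
        # Update step: correct the prediction with the new measurement.
        innovation_var = h * h * p_pred + r
        gain = p_pred * h / innovation_var
        m = m_pred + gain * (y - h * m_pred)
        p = (1.0 - gain * h) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances


if __name__ == "__main__":
    means, variances = kalman_filter_1d([2.0, 2.0], a=1.0, q=0.0, h=1.0, r=1.0, m0=0.0, p0=1.0)
    print(means, variances)
```

The extended, unscented and ensemble variants named in the abstract replace the exact prediction and update formulas with approximations suited to nonlinear models.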
In chapter 2, we study the application of finite element modelling, combined with numerical solutions of governing stochastic differential equations, to analyse instrumented nonlinear moving vehicle-structure systems. The focus of the chapter is on achieving computational efficiency by deploying, within a single modelling framework, three substructuring schemes with different methodological moorings. The schemes considered include spatial substructuring schemes (involving free-interface coupling methods), a spatial mesh partitioning scheme for governing stochastic differential equations (involving the use of a predictor-corrector method with implicit integration schemes for linear regions and explicit schemes for local nonlinear regions), and application of the Rao-Blackwellization scheme (which permits the use of Kalman filtering for linear substructures and Monte Carlo filters for nonlinear substructures). The main effort in this work is expended on combining these schemes with provisions for interfacing of the substructures by taking into account the relative motion of the vehicle and the supporting structure. The problem is formulated with reference to an archetypal beam and a multi-degree-of-freedom moving oscillator with spatially localized nonlinear characteristics. The study takes into account imperfections in mathematical modelling, guideway unevenness, and measurement noise. The numerical results demonstrate a notable reduction in computational effort on account of the introduction of the substructuring schemes.
In chapter 3, we address the issue of identification of system parameters of structural systems using dynamical measurement data. When Markov chain Monte Carlo (MCMC) samplers are used in problems of system parameter identification, one faces computational difficulties when dealing with large amounts of measurement data and/or low levels of measurement noise. Such exigencies are likely to occur in problems of parameter identification in dynamical systems, where the amount of vibratory measurement data and the number of parameters to be identified can be large. In such cases, the posterior probability density function of the system parameters tends to have narrow regions of support, and a finite length MCMC chain is unlikely to cover the pertinent regions. In this chapter, strategies based on modification of measurement equations and subsequent corrections are proposed to alleviate this difficulty. This involves artificial enhancement of measurement noise, assimilation of transformed packets of measurements, and a global iteration strategy to improve the choice of prior models. Illustrative examples include a laboratory study on a beam-moving trolley system.
In chapter 4, we consider the combined estimation of the system states and parameters of vehicle-structure interaction systems. To this end, we formulate a framework which uses MCMC sampling for parameter estimation and particle filtering for state estimation. In chapters 2 and 3, we described the computational issues faced when adopting these techniques individually. When they are used together, we come across both sets of issues, and find that the complexity of the estimation problem is greatly increased. In this chapter, we address the computational issues by adopting the substructuring techniques proposed in chapter 2, and the parameter identification method based on modified measurement models presented in chapter 3. The proposed method is illustrated on a computational study on a beam-moving oscillator system with localized nonlinearities, as well as on a laboratory study on a beam-moving trolley system.
In chapter 5, we present global response sensitivity indices for structural dynamical systems with random system parameters excited by multiple random excitations. Two new procedures for evaluating global response sensitivity measures with respect to the excitation components are proposed. The first procedure is valid for the stationary response of linear systems under stationary random excitations and is based on the notion of Hellinger’s metric of distance between two power spectral density functions. The second procedure is more generally valid and is based on the $L_2$-norm-based distance measure between two probability density functions. Specific cases which admit exact solutions are presented, and solution procedures based on Monte Carlo simulations for a more general class of problems are outlined. The applicability of the proposed procedures to the case of random system parameters is demonstrated using suitable illustrations. Illustrations include studies on a parametrically excited linear system and a nonlinear random vibration problem involving a moving oscillator-beam system that considers excitations due to random support motions and guideway unevenness.
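The two distance notions named in chapter 5 can be illustrated on sampled functions. The sketch below assumes that both functions are given on a common uniform grid and, for the Hellinger distance, that each is first normalized to unit sum; this normalization and the function names are assumptions made for illustration, and the thesis may define its sensitivity measures differently.

```python
import math


def normalize(values):
    """Scale non-negative samples so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def hellinger(p, q):
    """Discrete Hellinger distance between two non-negative sampled functions.

    Both inputs are normalized first, so the result lies in [0, 1].
    """
    p, q = normalize(p), normalize(q)
    return math.sqrt(
        0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    )


def l2_distance(p, q, dx):
    """L2 distance between two functions sampled on a uniform grid of spacing dx."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) * dx)


if __name__ == "__main__":
    # Hypothetical spectra: the response with and without one excitation component.
    with_component = [0.1, 0.8, 2.5, 0.9, 0.2]
    without_component = [0.1, 0.5, 1.0, 0.6, 0.2]
    print(hellinger(with_component, without_component))
```

A sensitivity index in this spirit compares the response description obtained with all excitations to the one obtained when a given excitation component is held fixed or removed; a larger distance indicates a more influential component.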
In chapter 6 we summarize the contributions made in chapters 2, 3, 4, and 5, and on the basis of these studies, present a few problems for future research.
In addition to these chapters, three appendices are included in this thesis. Appendices A and B correspond to chapter 3. In appendix A, we study the effect of large measurement data sets and small measurement noise on the nature of the posterior probability density functions. Appendix B illustrates the MCMC sampling based parameter estimation procedure of chapter 3 using a laboratory study on a bending–torsion coupled, geometrically non-linear building frame under earthquake support motion. In appendix C, we present Ito-Taylor time discretization schemes for stochastic delay differential equations found in chapter 5.
|
405 |
Modèle d'agrégation des avis des experts, en fiabilité d'équipements / Handi, Youssef January 2021
No description available.
|
406 |
Hamiltonian Monte Carlo and consistent sampling for score matching based generative modeling / Piché-Taillefer, Rémi 05 1900
Foreword: This work is based in part on work carried out in 2020 in collaboration with Alexia Jolicoeur-Martineau, Ioannis Mitliagkas and Rémi Tachet des Combes, and published at the International Conference on Learning Representations (ICLR 2021). The analyses presented in the following pages deepen, correct and add to that work substantially, without however relying on it or on any knowledge covered by that text.
/ This thesis presents analyses of generative models of the Denoising Score Matching family, with the goals of better understanding how they work and improving existing methods. These methods work by gradually reducing noise in images using deep neural networks, for the purpose of image synthesis. While the first chapters contextualize the problem of Denoising Score Matching, the following chapters focus on reformulating the training objective of the neural network and analysing the iterative generative process. I introduce the founding concepts of Markov Chain Monte Carlo (MCMC) for Hamiltonian Dynamics and adapt them to our framework of image synthesis by annealing of Gaussian noise. While Langevin Dynamics have thus far dominated generative processes in the Denoising Score Matching literature, Hamiltonian Dynamics have attracted interest for their superior convergence rate. I demonstrate their efficiency in the next chapters and elaborate on the contexts in which their use is advantageous for complex image generation. In a complete ablation study, I present the independent and combined gains from each proposed improvement and thereby advance our understanding of Denoising Score Matching methods.
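To make the contrast between Langevin and Hamiltonian dynamics concrete, here is a minimal one-dimensional sketch. It assumes the score (the gradient of the log-density) is known exactly for a standard normal target, whereas the thesis uses a score learned by a neural network and anneals the noise level; neither of those is reproduced here. All function names are illustrative.

```python
import math
import random


def score(x):
    """Score of a standard normal target: d/dx log p(x) = -x (assumed known)."""
    return -x


def langevin_step(x, step, rng):
    """One unadjusted Langevin update driven by the score."""
    return x + 0.5 * step * score(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)


def leapfrog(x, p, step, n_steps):
    """Leapfrog integration of Hamiltonian dynamics with potential -log p(x)."""
    p = p + 0.5 * step * score(x)          # initial half step for momentum
    for i in range(n_steps):
        x = x + step * p                   # full step for position
        if i < n_steps - 1:
            p = p + step * score(x)        # full step for momentum
    p = p + 0.5 * step * score(x)          # final half step for momentum
    return x, p


def hmc_step(x, step, n_steps, rng):
    """One Hamiltonian Monte Carlo transition with a Metropolis correction."""
    p0 = rng.gauss(0.0, 1.0)
    x_new, p_new = leapfrog(x, p0, step, n_steps)
    h_old = 0.5 * x * x + 0.5 * p0 * p0
    h_new = 0.5 * x_new * x_new + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return x_new
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    x = 0.0
    for _ in range(1000):
        x = hmc_step(x, step=0.3, n_steps=10, rng=rng)
    print(x)
```

The Metropolis correction needs energy values, which a purely score-based model does not provide directly; how to handle that is one of the design questions such methods have to address, and this sketch sidesteps it by using a target with a known density.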
|
407 |
Learning Sampling-Based 6D Object Pose Estimation / Krull, Alexander 31 August 2018
The task of 6D object pose estimation, i.e. of estimating an object's position (three degrees of freedom) and orientation (three degrees of freedom) from images, is an essential building block of many modern applications, such as robotic grasping, autonomous driving, or augmented reality. Automatic pose estimation systems have to overcome a variety of visual ambiguities, including texture-less objects, clutter, and occlusion. Since many applications demand real-time performance, the efficient use of computational resources is an additional challenge.
In this thesis, we will take a probabilistic stance on trying to overcome said issues. We build on a highly successful automatic pose estimation framework based on predicting pixel-wise correspondences between the camera coordinate system and the local coordinate system of the object. These dense correspondences are used to generate a pool of hypotheses, which in turn serve as a starting point in a final search procedure. We will present three systems that each use probabilistic modeling and sampling to improve upon different aspects of the framework.
The goal of the first system, System I, is to enable pose tracking, i.e. estimating the pose of an object in a sequence of frames instead of a single image. By including information from previous frames, tracking systems can resolve many visual ambiguities and reduce computation time. System I is a particle filter (PF) approach. The PF represents its belief about the pose in each frame by propagating a set of samples through time. Our system uses the process of hypothesis generation from the original framework as part of a proposal distribution that efficiently concentrates samples in the appropriate areas.
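The idea of propagating a set of samples through time can be sketched with a bootstrap particle filter on a one-dimensional state. Note the simplification: this sketch proposes new particles from a random-walk motion model only, whereas System I additionally uses hypothesis generation in its proposal distribution. The state, noise levels and function names are hypothetical.

```python
import math
import random


def gaussian_likelihood(y, x, sigma):
    """Unnormalized Gaussian likelihood of observation y given state x."""
    z = (y - x) / sigma
    return math.exp(-0.5 * z * z)


def particle_filter(observations, n_particles, motion_sigma, obs_sigma, rng):
    """Bootstrap particle filter for a 1-D random-walk state.

    Returns the weighted-mean state estimate for each frame.
    """
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate each sample through the (assumed) random-walk motion model.
        particles = [x + rng.gauss(0.0, motion_sigma) for x in particles]
        # Weight samples by how well they explain the current observation.
        weights = [gaussian_likelihood(y, x, obs_sigma) for x in particles]
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample so that samples concentrate in high-probability regions.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    frames = [2.0] * 30  # hypothetical noisy position measurements
    print(particle_filter(frames, 500, 0.3, 0.5, random.Random(1))[-1])
```

For a full 6D pose, each particle would hold a position and an orientation, and the likelihood would compare rendered and observed image evidence rather than a scalar measurement.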
In System II, we focus on the problem of evaluating the quality of pose hypotheses. This task plays an essential role in the final search procedure of the original framework. We use a convolutional neural network (CNN) to assess the quality of a hypothesis by comparing rendered and observed images. To train the CNN we view it as part of an energy-based probability distribution in pose space. This probabilistic perspective allows us to train the system under the maximum likelihood paradigm. We use a sampling approach to approximate the required gradients. The resulting system for pose estimation yields superior results, in particular for highly occluded objects.
In System III, we take the idea of machine learning a step further. Instead of learning to predict a hypothesis quality measure to be used in a search procedure, we present a way of learning the search procedure itself. We train a reinforcement learning (RL) agent, termed PoseAgent, to steer the search process and make optimal use of a given computational budget. PoseAgent dynamically decides which hypothesis should be refined next, and which one should ultimately be output as the final estimate. Since the search procedure includes discrete non-differentiable choices, training the system via gradient descent is not easily possible. To solve the problem, we model the behavior of PoseAgent as a stochastic policy, which is ultimately governed by a CNN. This allows us to use a sampling-based stochastic policy gradient training procedure.
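A sampling-based policy gradient can be illustrated on a toy problem in which the agent repeatedly picks one of several discrete actions. The sketch below is a plain REINFORCE update for a softmax policy over fixed action rewards; it assumes a table of logits instead of a CNN and has no notion of poses or budgets, so it only shows why sampling actions makes discrete choices trainable. All names and values are illustrative.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(rewards, n_steps, learning_rate, rng):
    """REINFORCE on a stateless problem with one fixed reward per action.

    Each step samples an action from the current policy and moves the logits
    along reward * grad log pi(action), which for a softmax policy is
    reward * (one_hot(action) - probabilities).
    """
    logits = [0.0] * len(rewards)
    for _ in range(n_steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs, k=1)[0]
        reward = rewards[action]
        for a in range(len(logits)):
            indicator = 1.0 if a == action else 0.0
            logits[a] += learning_rate * reward * (indicator - probs[a])
    return softmax(logits)


if __name__ == "__main__":
    # Hypothetical: action 1 (say, refining the better hypothesis) is the rewarded one.
    print(train_policy([0.0, 1.0], 500, 0.5, random.Random(0)))
```

Because the gradient is estimated from sampled actions, no derivative of the discrete choice itself is needed, which is the property the abstract relies on.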
We believe that some of the ideas developed in this thesis, such as the sampling-driven, probabilistically motivated training of a CNN for the comparison of images, or the search procedure implemented by PoseAgent, have the potential to be applied in fields beyond pose estimation as well.
|
408 |
Medical relevance and functional consequences of protein truncating variants / Rivas Cruz, Manuel A. January 2015
Genome-wide association studies have greatly improved our understanding of the contribution of common variants to the genetic architecture of complex traits. However, two major limitations have been highlighted. First, common variant associations typically do not identify the causal variant and/or the gene through which it exerts its effect on a trait. Second, common variant associations usually consist of variants with small effects. As a consequence, it is more challenging to harness their translational impact. Association studies of rare variants and complex traits may be able to help address these limitations. Empirical population genetic data show that deleterious variants are rare. More specifically, there is a very strong depletion of common protein truncating variants (PTVs, commonly referred to as loss-of-function variants) in the genome, a group of variants that have been shown to have large effects on gene function and are enriched for severe disease-causing mutations, but in other instances may actually be protective against disease. This thesis is divided into three parts dedicated to the study of protein truncating variants, their medical relevance, and their functional consequences. First, I present statistical, bioinformatic, and computational methods developed for the study of protein truncating variants, their association with complex traits, and their functional consequences. Second, I present the application of these methods to a number of case-control and quantitative trait studies, discovering new variants and genes associated with breast and ovarian cancer, type 1 diabetes, lipids, and metabolic traits measured with NMR spectroscopy. Third, I present work on improving the annotation of protein truncating variants by studying their functional consequences. Taken together, these results highlight the utility of interrogating protein truncating variants in medical and functional genomic studies.
|
409 |
Contraindre l'équation d'état de la matière à densité supranucléaire à partir des sursauts X des étoiles à neutrons / Artigue, Romain 20 November 2013
This thesis is devoted to the study of the periodic oscillations detected during X-ray bursts from neutron stars in low-mass X-ray binaries. These oscillations offer a way to probe the interior of these objects, in particular by measuring their mass and radius, and thereby to constrain the equation of state of dense matter. I developed methods for detecting and analyzing these signals, their temporal properties and their energy dependence. I analyzed the oscillations detected in all type I X-ray bursts (as well as one superburst) from three neutron stars observed with the Rossi X-ray Timing Explorer/Proportional Counter Array. From the burst light curves, I selected the segments giving the best statistical significance in order to build a catalogue of average oscillation profiles. The shape of the profiles varies greatly from one burst to another for the same source, and a large number of parameters can affect the oscillations. I built a model of a hot spot on the surface of the rapidly rotating star, in a relativistic spacetime, to characterize the X-ray burst emission. Using Markov chain Monte Carlo to explore the large parameter space efficiently, fits to a sample of bursts demonstrated the applicability of the model. However, the constraints obtained on the mass and radius of the star are limited by the quality of the data from the instrument used. Finally, simulations show that precise measurements of the parameters are possible by increasing the collecting area of the detectors, as proposed for future X-ray observatories.
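The Markov chain Monte Carlo exploration of a parameter space mentioned in this abstract can be sketched with a random-walk Metropolis sampler. The example assumes a toy one-parameter log-posterior; the hot-spot model and its relativistic ray tracing are not reproduced, and the function names and numbers are illustrative only.

```python
import math
import random


def metropolis(log_posterior, start, proposal_sigma, n_samples, rng):
    """Random-walk Metropolis sampler for a one-dimensional parameter."""
    x = start
    lp = log_posterior(x)
    chain = []
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, proposal_sigma)
        lp_candidate = log_posterior(candidate)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_candidate - lp)):
            x, lp = candidate, lp_candidate
        chain.append(x)
    return chain


def toy_log_posterior(m):
    """Hypothetical posterior: a unit-variance Gaussian centred on 3.0."""
    return -0.5 * (m - 3.0) ** 2


if __name__ == "__main__":
    chain = metropolis(toy_log_posterior, 0.0, 1.0, 20000, random.Random(0))
    tail = chain[10000:]  # discard the first half as burn-in
    print(sum(tail) / len(tail))
```

In a real fit the log-posterior would combine the likelihood of the observed pulse profile under the model with priors on mass, radius and spot geometry, and the width of the resulting chain gives the constraint on each parameter.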
|
410 |
空間相關存活資料之貝氏半參數比例勝算模式 / Bayesian semiparametric proportional odds models for spatially correlated survival data / 張凱嵐, Chang, Kai lan Unknown Date
Geographic Information System (GIS) databases have attracted attention from statisticians in many fields, who seek to develop and analyze models that account for spatial clustering and variation. There is emerging interest in modeling spatially correlated survival data in public health and epidemiologic studies. In this article, we develop Bayesian multivariate semiparametric hierarchical models that incorporate both spatially correlated and uncorrelated frailties to address spatial variation in survival patterns, and we use a multivariate conditionally autoregressive (MCAR) model to detect whether spatial clustering exists across geographic areas. The baseline hazard function is modeled semiparametrically using mixtures of finite Polya trees. The SEER (Surveillance Epidemiology and End Results) database from the National Cancer Institute (NCI) provides comprehensive long-term follow-up data on cancer patients, including survival time, history of multiple cancers, region of residence, and other demographic information. We apply our Bayesian hierarchical spatial models to Iowa cancer data extracted from the SEER database. We illustrate how to compute the conditional predictive ordinate (CPO), the average log-marginal pseudo-likelihood (ALMPL), and the deviance information criterion (DIC), which are Bayesian criteria for model checking and comparison among competing models.
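The conditional predictive ordinate mentioned above has a standard Monte Carlo estimate: for each observation, the harmonic mean of its likelihood over posterior draws. The sketch below assumes the per-draw likelihood values have already been computed; the function names are illustrative, and taking the average log pseudo-likelihood as the sum of log-CPO values divided by the number of observations is an assumption about how ALMPL is defined in the thesis.

```python
import math


def cpo(likelihood_draws):
    """Conditional predictive ordinate for each observation.

    likelihood_draws[m][i] is f(y_i | theta_m) for posterior draw m.
    CPO_i is estimated by the harmonic mean of these values over the draws.
    """
    n_draws = len(likelihood_draws)
    n_obs = len(likelihood_draws[0])
    return [
        n_draws / sum(1.0 / likelihood_draws[m][i] for m in range(n_draws))
        for i in range(n_obs)
    ]


def log_pseudo_marginal_likelihood(likelihood_draws):
    """Sum of log-CPO values; larger values indicate better predictive fit."""
    return sum(math.log(c) for c in cpo(likelihood_draws))


def average_log_pseudo_likelihood(likelihood_draws):
    """Sum of log-CPO divided by the number of observations (assumed ALMPL)."""
    n_obs = len(likelihood_draws[0])
    return log_pseudo_marginal_likelihood(likelihood_draws) / n_obs


if __name__ == "__main__":
    # Hypothetical likelihood values: 2 posterior draws, 2 observations.
    draws = [[0.5, 0.2], [0.25, 0.2]]
    print(cpo(draws), average_log_pseudo_likelihood(draws))
```

Observations with unusually small CPO values are poorly predicted by the rest of the data and are natural candidates for outlier checks.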
|