1 |
Statistical Methods for Multi-type Recurrent Event Data Based on Monte Carlo EM Algorithms and Copula Frailties
Bedair, Khaled Farag Emam, 01 October 2014
In this dissertation, we study processes that generate events repeatedly over the follow-up time of a given subject. Such processes are called recurrent event processes, and the data they provide are referred to as recurrent event data. Examples include cancer recurrences, recurrent infections or disease episodes, hospital readmissions, the filing of warranty claims, and insurance claims by policyholders. In particular, we focus on multi-type recurrent event times, which arise when two or more different kinds of events may occur repeatedly over a period of observation. Our main objectives are to describe features of each marginal process simultaneously and to study the dependence among different types of events. We present applications to a real dataset collected from the Nutritional Prevention of Cancer Trial. The objective of this clinical trial was to evaluate the efficacy of selenium in preventing the recurrence of several types of skin cancer among 1312 residents of the Eastern United States.
This dissertation comprises four chapters. Chapter 1 gives a brief background on the statistical techniques used to develop the proposed methodology. We cover some concepts and useful functions related to survival data analysis and present a short introduction to frailty distributions. The Monte Carlo expectation maximization (MCEM) algorithm and copula functions for multivariate variables are also presented in this chapter.
Chapter 2 develops a multi-type recurrent events model with multivariate Gaussian random effects (frailties) for the intensity functions. In this chapter, we present nonparametric baseline intensity functions and a multivariate Gaussian distribution for the correlated random effects. An MCEM algorithm with MCMC routines in the E-step is applied to the partial likelihood to estimate the model parameters. Expressions for the variances of the estimates are derived, and the variances are computed using Louis' formula. Predictions of the individual random effects are also obtained, because in some applications the magnitude of the random effects is of interest for better understanding and interpreting the variability in the data. The performance of the proposed methodology is evaluated by simulation studies, and the developed model is applied to the skin cancer dataset.
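To make the MCEM idea concrete, the following is a minimal sketch on a much simpler model than the one in the dissertation. It assumes one Poisson count per subject with rate exp(beta + b_i) and a univariate Gaussian random effect b_i ~ N(0, sigma2), rather than a partial likelihood with nonparametric baseline intensities and multivariate frailties. All function names and tuning values (number of draws, burn-in, step size) are illustrative assumptions. The E-step draws each b_i from its conditional distribution with a random-walk Metropolis sampler, and the M-step uses the closed-form updates available for this toy model.

```python
import math
import random


def simulate(n, beta, sigma2, rng):
    """Draw one Poisson count per subject with a Gaussian random effect."""
    ys = []
    for _ in range(n):
        b = rng.gauss(0.0, math.sqrt(sigma2))
        limit = math.exp(-math.exp(beta + b))
        # Knuth's multiplication method for a Poisson draw (fine for small rates)
        k, p = 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        ys.append(k)
    return ys


def log_post(b, y, beta, sigma2):
    """Log conditional density of b given y, up to an additive constant."""
    return y * (beta + b) - math.exp(beta + b) - b * b / (2.0 * sigma2)


def sample_b(y, beta, sigma2, rng, n_draws=60, burn=20, step=0.6):
    """E-step helper: random-walk Metropolis draws of one random effect."""
    b, draws = 0.0, []
    lp = log_post(b, y, beta, sigma2)
    for t in range(burn + n_draws):
        cand = b + rng.gauss(0.0, step)
        lp_cand = log_post(cand, y, beta, sigma2)
        # 1 - random() lies in (0, 1], so the logarithm is always defined
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            b, lp = cand, lp_cand
        if t >= burn:
            draws.append(b)
    return draws


def mcem(ys, n_iter=15, seed=1):
    """Alternate Monte Carlo E-steps and closed-form M-steps."""
    rng = random.Random(seed)
    beta, sigma2 = 0.0, 1.0
    for _ in range(n_iter):
        sum_exp_b, sum_b2 = 0.0, 0.0
        for y in ys:
            draws = sample_b(y, beta, sigma2, rng)
            sum_exp_b += sum(math.exp(b) for b in draws) / len(draws)
            sum_b2 += sum(b * b for b in draws) / len(draws)
        # M-step: maximize the Monte Carlo estimate of the expected log-likelihood
        beta = math.log(sum(ys) / sum_exp_b)
        sigma2 = sum_b2 / len(ys)
    return beta, sigma2
```

Calling `mcem(simulate(150, 0.5, 0.25, random.Random(0)))` returns a pair of estimates for beta and sigma2. Because the E-step is a Monte Carlo approximation, the estimates fluctuate slightly between iterations instead of converging exactly.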
Chapter 3 presents copula-based semiparametric multivariate frailty models for multi-type recurrent event data, with applications to the skin cancer data. In this chapter, we relax the multivariate Gaussian assumption on the frailty terms and allow the frailty distributions to have more features than the symmetric, unimodal shape of the Gaussian density. Copula functions are introduced as a more flexible approach to modeling correlated frailties. They provide considerable flexibility, in particular by allowing a variety of choices for the marginal distributions and correlation structures. Semiparametric intensity models for multi-type recurrent events based on a combination of MCEM with MCMC sampling and copula functions are introduced. This combination is flexible and generally applicable for obtaining inferences on the unknown parameters of high-dimensional frailty models. Estimation procedures for fixed effects, nonparametric baseline intensity functions, and copula parameters are obtained, together with predictions of the subject-specific multivariate frailties and random effects. Variance estimates based on Louis' formula are derived and calculated. We investigate the impact of the specification of the frailty and random effect models on the inference of covariate effects, cumulative baseline intensity functions, prediction of random effects and frailties, and estimation of the variance-covariance components. The performance of the proposed models is evaluated by simulation studies. Applications are illustrated using the dataset collected from the clinical trial of patients with skin cancer. Conclusions and some remarks on future work are presented in Chapter 4. / Ph. D.
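As an illustration of how a copula separates the dependence structure from the marginal distributions, the sketch below draws pairs of positive frailties from a Gaussian copula with two different mean-one marginals. The choice of a Gaussian copula, the exponential and lognormal marginals, and every name in the code are assumptions made for this example; the abstract does not state which copulas or marginals the dissertation uses.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def clamp(u, eps=1e-12):
    """Keep a probability strictly inside (0, 1) so quantiles stay finite."""
    return min(max(u, eps), 1.0 - eps)


def gaussian_copula_pair(rho, rng):
    """Return (u1, u2), uniform marginally, linked by a Gaussian copula."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return clamp(STD_NORMAL.cdf(z1)), clamp(STD_NORMAL.cdf(z2))


def sample_frailties(n, rho, seed=0):
    """Draw n correlated frailty pairs with different mean-one marginals."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u1, u2 = gaussian_copula_pair(rho, rng)
        # Marginal 1: exponential with rate 1, via its quantile function
        w1 = -math.log(1.0 - u1)
        # Marginal 2: lognormal with log-sd 0.5 and log-mean -0.125 (mean one)
        w2 = math.exp(-0.125 + 0.5 * STD_NORMAL.inv_cdf(u2))
        pairs.append((w1, w2))
    return pairs
```

Changing either quantile transform changes that marginal without touching the dependence parameter `rho`, which is the flexibility the paragraph above describes.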
|
2 |
Um modelo multivariado para predição de taxas e proporções dependentes [A multivariate model for predicting dependent rates and proportions]
Assis, Alice Nascimento de, 09 March 2018
Funding: CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Relative humidity affects many aspects of human life, and because both low and high levels can have serious consequences, controlling it is of paramount importance. Modeling extreme values of this variable can therefore aid in planning human activities that are susceptible to its harmful effects, for example in public health. The main interest is to predict, based on probability density functions fitted to observed data, the values that may occur at a given location. The Generalized Extreme Value distribution has been widely used for this purpose, as has time series analysis of meteorological and climatic data. In this work, a statistical model is proposed for predicting rates and proportions that are temporally and/or spatially dependent. The model was constructed by marginalizing the Kumaraswamy G-exponentialised distribution conditioned on a random field with a positive alpha-stable distribution. Some properties of the model are presented, procedures for estimation and inference are discussed, and an MCEM algorithm is developed to estimate the parameters. As a particular case, the model was used for spatial prediction of the relative humidity observed at weather stations in the state of Amazonas, Brazil.
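For readers unfamiliar with the Kumaraswamy family, the sketch below implements only the basic two-parameter Kumaraswamy distribution on (0, 1), whose distribution function is F(x) = 1 - (1 - x^a)^b. It is not the marginalized spatial model of the thesis; it shows why the family is convenient for proportions such as relative humidity, since both the distribution function and its inverse have closed forms. The function names and parameter values are illustrative assumptions.

```python
import random


def kumaraswamy_cdf(x, a, b):
    """Distribution function F(x) = 1 - (1 - x**a)**b for 0 <= x <= 1."""
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_quantile(u, a, b):
    """Closed-form inverse of the distribution function for 0 <= u < 1."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def kumaraswamy_sample(n, a, b, seed=0):
    """Inverse-transform sampling: push uniform draws through the quantile."""
    rng = random.Random(seed)
    return [kumaraswamy_quantile(rng.random(), a, b) for _ in range(n)]
```

For example, `kumaraswamy_sample(5, 2.0, 5.0)` returns five simulated proportions between 0 and 1.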
|
3 |
Classification de données multivariées multitypes basée sur des modèles de mélange : application à l'étude d'assemblages d'espèces en écologie / Model-based clustering for multivariate and mixed-mode data: application to multi-species spatial ecological data
Georgescu, Vera, 17 December 2010
In population ecology, the spatial patterns of species are studied in order to infer the existence of underlying processes, such as interactions within and between species and species' responses to environmental heterogeneity. We propose to analyze spatial multi-species data in terms of species assemblages, which we consider in terms of absolute abundances rather than species diversity. Species assemblages are one of the signatures of the local spatial interactions of species with each other and with their environment. Studying them can make it possible to detect several types of spatialized equilibria and to associate them with the effect of environmental variables. Species assemblages are defined here by a non-spatial classification of the multivariate observations of species abundances. Model-based clustering procedures using mixture models were chosen in order to obtain a measure of classification uncertainty and to model an assemblage by a multivariate probability distribution. In this framework, we propose:
1. An exploratory method for spatial multivariate observations of species abundances, which detects species assemblages by model-based clustering, then maps them and analyzes their spatial structure. Common distributions, such as the multivariate Gaussian, are used to model the assemblages.
2. A hierarchical model for abundance assemblages that cannot be modeled with common distributions. This model can easily be adapted to mixed-mode data, which are frequent in ecology.
3. A clustering procedure for mixed-mode data based on mixtures of the hierarchical distributions defined in 2.
Two ecological case studies guided and illustrated this work: a small-scale study of the assemblages of two aphid species on the leaves of clementine trees, and a large-scale study of the assemblages of a host plant, Plantago lanceolata, and its pathogen, the powdery mildew, on the Aland islands in south-west Finland.
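The following is a minimal sketch of model-based clustering with a mixture model, showing how the fitted model yields a classification uncertainty for every observation. It assumes a univariate, two-component Gaussian mixture fitted by the EM algorithm, which is far simpler than the multivariate and hierarchical mixed-mode models of the thesis. The function names and the initialization rule are illustrative assumptions.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a univariate Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_mixture(xs, n_iter=50):
    """Fit a two-component Gaussian mixture by EM.

    Returns means, variances, mixing weights, and the responsibilities
    computed in the last E-step (one pair of class probabilities per point).
    """
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    # Crude initialization: start the two means at the extremes of the data
    mu = [min(xs), max(xs)]
    var = [var_all, var_all]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each class
        resp = []
        for x in xs:
            dens = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weighted updates of means, variances, and mixing weights
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, 1e-6)
            weight[k] = nk / n
    return mu, var, weight, resp
```

Each row of `resp` sums to one. A row close to (0.5, 0.5) marks an observation whose class is uncertain, while a row close to (1, 0) or (0, 1) marks a confident assignment.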
|