Global ETD Search

101	Statistické úlohy pro Markovské procesy se spojitým časem / Statistical inference for Markov processes with continuous time
Křepinská, Dana (January 2014)
This master's thesis addresses estimation of the intensity matrix of a continuous-time Markov process from discretely observed data. It begins with the simpler case of estimation from a continuously observed trajectory by maximum likelihood. It then describes estimation from a discretely observed trajectory via computation of the transition probability matrix. The EM algorithm, which refines the preceding estimate, is then treated in detail. The theoretical part closes with an estimation method based on Markov chain Monte Carlo (MCMC). All procedures are implemented in software, and the second part of the thesis presents their results, comparing estimates for daily, weekly and monthly observations as well as for five-year and ten-year observed trajectories. Variance estimates and confidence intervals accompany the results.
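As an illustration of the first step named in the abstract above (maximum likelihood estimation from a continuously observed trajectory), the standard estimator sets each off-diagonal intensity to the number of observed i -> j jumps divided by the total time spent in state i. The sketch below is not taken from the thesis; the function name, the argument layout, and the choice to leave the rows of never-visited states at zero are assumptions made for this example.

```python
def estimate_intensity_matrix(states, jump_times, t_end, n_states):
    """Maximum likelihood estimate of the intensity matrix from one fully
    observed trajectory (hypothetical helper, for illustration only).

    states[k] is the state occupied from jump_times[k] until the next jump,
    or until t_end for the last entry.
    """
    counts = [[0] * n_states for _ in range(n_states)]  # N_ij: i -> j jumps
    holding = [0.0] * n_states                          # R_i: time spent in i
    ends = list(jump_times[1:]) + [t_end]
    for k, (s, start, end) in enumerate(zip(states, jump_times, ends)):
        holding[s] += end - start
        if k + 1 < len(states):
            counts[s][states[k + 1]] += 1

    Q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if holding[i] == 0:
            continue  # assumption: rows of never-visited states stay at zero
        for j in range(n_states):
            if i != j:
                Q[i][j] = counts[i][j] / holding[i]  # q_ij = N_ij / R_i
        Q[i][i] = -sum(Q[i])  # each row of an intensity matrix sums to zero
    return Q


# Two-state path: state 0 on [0, 2), 1 on [2, 3), 0 on [3, 5), 1 on [5, 6].
Q = estimate_intensity_matrix([0, 1, 0, 1], [0.0, 2.0, 3.0, 5.0], 6.0, 2)
```

With discretely observed data the jump counts and holding times are not available directly, which is where the EM and MCMC approaches mentioned in the abstract come in; those steps are not sketched here.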
102	Methods and algorithms to learn spatio-temporal changes from longitudinal manifold-valued observations / Méthodes et algorithmes pour l’apprentissage de modèles d'évolution spatio-temporels à partir de données longitudinales sur une variété
Schiratti, Jean-Baptiste (23 January 2017)
We propose a generic Bayesian mixed-effects model to estimate the temporal progression of a biological phenomenon from manifold-valued observations obtained at multiple time points for an individual or a group of individuals. The progression is modeled by continuous trajectories in the space of measurements, which is assumed to be a Riemannian manifold. The group-average trajectory is defined by the fixed effects of the model. To define the individual trajectories, we introduce the notion of "parallel variations" of a curve on a Riemannian manifold. For each individual, the trajectory is constructed by taking a parallel variation of the average trajectory and reparametrizing this parallel in time. These subject-specific spatiotemporal transformations, namely parallel variation and time reparametrization, are defined by the individual random effects and make it possible to quantify changes in the direction and pace at which the trajectories are followed. The framework of Riemannian geometry allows the model to be used with any kind of measurements subject to smooth constraints. A stochastic version of the Expectation-Maximization algorithm, the Monte Carlo Markov Chain Stochastic Approximation EM algorithm (MCMC-SAEM), is used to produce maximum a posteriori estimates of the parameters. The use of the MCMC-SAEM together with a numerical scheme for approximating parallel transport is discussed. In addition, the method is validated on synthetic data and in high-dimensional settings. We also provide experimental results obtained on several health-related data sets.
Keywords: Riemannian geometry; stochastic EM algorithm; longitudinal data; statistical modeling
103	MULTI-STATE MODELS WITH MISSING COVARIATES
Lou, Wenjie (01 January 2016)
Multi-state models have been widely used to analyze longitudinal event history data obtained in medical studies. The tools and methods developed recently in this area require completely observed data sets. In many applications, however, measurements on certain components of the covariate vector are missing for some study subjects. In this dissertation, several likelihood-based methodologies are proposed to deal efficiently with different types of missing covariates when applying multi-state models. First, a maximum observed-data likelihood method is proposed for data with a univariate missing pattern in which the missing covariate is categorical. The observed-data likelihood function is constructed from a model for the joint distribution of the longitudinal event history response and the discrete covariate with missing values. Second, a maximum simulated likelihood method is proposed to deal with a missing continuous covariate; the observed-data likelihood function is approximated by Monte Carlo simulation. Finally, an EM algorithm is used to deal with multiple missing covariates when estimating the parameters of a multi-state model; it can efficiently handle multiple missing discrete covariates in a general missing pattern. All the proposed methods are justified by simulation studies and by applications to data sets from the SMART project, a consortium of 11 different high-quality longitudinal studies of aging and cognition.
Keywords: longitudinal event history data; multi-state model; missing covariate data; EM algorithm; maximum simulated likelihood; SMART project; Applied Statistics; Statistical Models
104	Functional clustering methods and marital fertility modelling
Arnqvist, Per (January 2017)
This thesis consists of two parts. The first part further develops a model used for marital fertility, the Coale-Trussell fertility model, which is based on age-specific fertility rates. A new model is suggested that uses individual fertility data and a waiting time after pregnancies. The model is named the waiting model and can be understood as an alternating renewal process with age-specific intensities. Owing to the complicated form of the waiting model and the way the data are presented, as given in the United Nations Demographic Yearbook 1965, a normal approximation is suggested, together with a normal approximation of the mean and variance of the number of births per summarized interval. A further refinement of the model was then introduced to allow for left-truncated and censored individual data, summarized as table data. The waiting model gives a better understanding of marital fertility, and a simulation study shows that it outperforms the Coale-Trussell model in estimating the fertility intensity and in predicting the mean and variance of the number of births for a population.
The second part of the thesis focuses on developing functional clustering methods. The methods are motivated by and applied to varved (annually laminated) sediment data from Lake Kassjön in northern Sweden. The rich but complex information (with respect to climate) in the varves, including the shapes of the seasonal patterns, the varying varve thickness, and the non-linear sediment accumulation rates, makes it non-trivial to cluster the varves. Functional representations, smoothing and alignment are functional data tools used to make the seasonal patterns comparable. Functional clustering is used to group the seasonal patterns into different types, which can be associated with different weather conditions. A new non-parametric functional clustering method is suggested, the Bagging Voronoi K-Medoid Alignment (BVKMA) algorithm, which simultaneously clusters and aligns spatially dependent curves. BVKMA is applied to the varved lake sediment to infer climate, defined as frequencies of different weather types, over longer time periods. Furthermore, a functional model-based clustering method is proposed that clusters subjects for which both functional data and covariates are observed, allowing different covariance structures in the different clusters. The model extends a model-based functional clustering method proposed by James and Sugar (2003). An EM algorithm is derived to estimate the parameters of the model.
Keywords: censoring; Coale-Trussell model; EM algorithm; functional data analysis; functional clustering; marital fertility; normal approximation; Poisson process; varved lake sediments; warping
105	Inferência e diagnósticos em modelos assimétricos / Inference and diagnostics in asymmetric models
Ferreira, Clécio da Silva (20 March 2008)
This work presents a study of inference and diagnostics in asymmetric models. The influence analysis is based on the methodology for models with incomplete data, which is related to the EM algorithm (Zhu and Lee, 2001). In addition to the existing asymmetric normal (Azzalini, 1999) and asymmetric t-normal (Gómez, Venegas and Bolfarine, 2007) regression models, two new classes of models are developed: asymmetric scale mixtures of normal models (encompassing the asymmetric Normal, t-Normal, Slash, Contaminated-Normal and Power-Exponential distributions) and asymmetric robust linear mixed models, which use asymmetric scale mixtures of normal distributions for the random effect and scale mixtures of normal distributions for the random error. For the mixed model, the observed Fisher information matrix is computed using the approach of Louis (1982) for incomplete data. For all models, EM-type algorithms are developed to provide numerical solutions for the parameters of the regression models. For each regression model, goodness of fit is assessed by visual inspection of simulated envelope plots. For the asymmetric scale mixtures of normal models, a robustness study of the proposed EM algorithm is carried out to determine the efficacy of the estimators presented. The models are applied to the Australian Institute of Sports (AIS) data set, to a data set on the quality of life of female breast cancer patients from a study conducted by the Centro de Atenção Integral à Saúde da Mulher (CAISM) together with the Faculty of Medical Sciences of the Universidade Estadual de Campinas, and to the Framingham cholesterol data set.
Keywords: EM algorithm; asymmetric normal distributions; mixed models; scale mixtures of normal distributions
106	Modelos de mistura para dados com distribuições Poisson truncadas no zero / Mixture models for data with zero truncated Poisson distributions
Gigante, Andressa do Carmo (22 September 2017)
Mixture models have long been used but have attracted more attention recently owing to the development of more efficient estimation methods. In this dissertation, the mixture model is used as a way to cluster or segment data under the Poisson and zero-truncated Poisson distributions. Two approaches to the truncation problem are studied. The first applies the truncation to each mixture component, that is, it uses the zero-truncated Poisson distribution. The alternative applies the truncation to the resulting mixture model built from the usual Poisson distribution. The parameters of interest are estimated by maximum likelihood, which requires an iterative method; we therefore implemented the EM algorithm for both approaches. To assess the performance of the algorithms, we carried out a simulation study, in which they produced estimates close to the true parameter values. We applied the algorithms to a real data set from an electronic store and used the AIC and BIC model selection criteria to choose the best model. Zero truncation appears to affect more strongly the approach in which each mixture component is truncated, leaving some estimates for the zero-truncated Poisson distribution strongly biased, whereas the estimates showed less bias when the truncation was applied directly to the model.
Keywords: adapted EM algorithm; clustering methods; EM algorithm; mixture model; zero-truncated variable; zero-truncated Poisson mixture
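To make the first approach in the abstract above concrete (truncation applied to each mixture component), the following is a minimal EM sketch for a mixture of zero-truncated Poisson distributions. It is not the dissertation's implementation: the function names, the starting values, and the use of a fixed-point iteration inside the M-step are assumptions chosen for this example. The M-step uses the fact that a zero-truncated Poisson with rate lam has mean lam / (1 - exp(-lam)), so each component rate is found by matching that expression to the responsibility-weighted sample mean.

```python
import math


def log_kernel(x, lam):
    # log of lam**x * exp(-lam) / (1 - exp(-lam)). The -log(x!) term is the
    # same for every component, so it cancels in the responsibilities.
    return x * math.log(lam) - lam - math.log(-math.expm1(-lam))


def solve_rate(mean, n_iter=200):
    # Solve lam / (1 - exp(-lam)) = mean by fixed-point iteration
    # (an assumption of this sketch; any root finder would do).
    lam = mean
    for _ in range(n_iter):
        lam = mean * -math.expm1(-lam)
    return max(lam, 1e-8)  # guard: a mean of exactly 1 drives lam toward 0


def em_ztp_mixture(data, initial_rates, n_iter=200):
    rates = list(initial_rates)
    weights = [1.0 / len(rates)] * len(rates)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each count.
        resp = []
        for x in data:
            logs = [math.log(w) + log_kernel(x, r)
                    for w, r in zip(weights, rates)]
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: update mixing weights and component rates.
        for k in range(len(rates)):
            nk = max(sum(row[k] for row in resp), 1e-300)
            weights[k] = nk / len(data)
            mean_k = sum(row[k] * x for row, x in zip(resp, data)) / nk
            rates[k] = solve_rate(mean_k)
    return weights, rates


# Hypothetical counts (all >= 1) with a low-rate and a high-rate group.
weights, rates = em_ztp_mixture(
    [1, 1, 2, 1, 2, 1, 9, 10, 11, 12, 10, 9], [1.0, 8.0])
```

The second approach in the abstract, truncating the mixture of ordinary Poisson distributions as a whole, leads to a different likelihood and is not covered by this sketch.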
107	Inferência em modelos de mistura via algoritmo EM estocástico modificado / Inference on Mixture Models via Modified Stochastic EM
Assis, Raul Caram de (02 June 2017)
We present the topic and theory of mixture models, reviewing theoretical aspects and interpretations of such mixtures. We develop the theory in the contexts of maximum likelihood and Bayesian inference. We cover existing clustering methods in both contexts, with emphasis on two: the stochastic EM algorithm in the maximum likelihood context and the Dirichlet Process Mixture Model in the Bayesian context. We propose a new method, a modification of the stochastic EM algorithm, which can be used to estimate the parameters of a mixture while allowing solutions with differing numbers of groups.
Keywords: EM algorithm; Gibbs sampling; image segmentation; Markov chain; mixture models; mixture of distributions
108	Análise dos resultados de ensaios de proficiência via modelos de regressão com variável explicativa aleatória / Analysis of proficiency tests results via regression models with random explanatory variable
Montanari, Aline Othon (21 June 2004)
In a proficiency testing (PT) program conducted by the Grupo de Motores, a group of eleven laboratories in the temperature field took measurements at five points on the scale of a thermocouple. In this work, we propose a regression model with a random explanatory variable X representing the standard thermocouple, which we call the artifact, and a dependent variable Y representing the laboratories' measurements. The comparison procedure is simple: both thermocouples are placed in the furnace and the differences between the measurements are recorded. For the data analysis, the response is the difference between (i) the difference of the measurements of the laboratory's equipment and the artifact and (ii) the reference value, which is determined by two laboratories belonging to the Rede Brasileira de Calibração (RBC, the Brazilian Calibration Network). The measurement error has a variance determined by calibration, that is, a known variance. We therefore find approximations to the maximum likelihood estimates of the model parameters via the EM algorithm. In addition, we propose a strategy for assessing the consistency of the laboratories participating in the PT program.
Keywords: EM algorithm; interlaboratory comparisons; proficiency tests; estimation; measurement uncertainty; random explanatory variable
109	"Uma aplicação industrial de regressão binária com erros na variável explicativa" / "An industrial application of binary regression with errors-in-variable explanatory"
Favari, Daniel Fernando de (22 June 2006)
In this work, we apply a binary regression model with measurement error in the explanatory variable to analyze attribute measurement systems. To this end, we use the logistic model with errors in the variable, for which we obtain maximum likelihood estimates via the EM algorithm, along with the observed Fisher information matrix. We also carried out a simulation study to compare the analytic method, the logistic model without errors in the variable (naive), and the logistic model with errors in the variable. Finally, we apply our methodology to evaluate a go/no-go measurement system at the largest diesel engine manufacturer in the world (MWM International).
Keywords: EM algorithm; analytic method; binary regression; Delta method; measurement errors in variables; Fieller's theorem
110	Modelos lineares generalizados mistos para dados longitudinais. / Generalized linear mixed models in longitudinal data.
Costa, Silvano Cesar da (13 March 2003)
Experiments whose response variables are proportions or counts are very common in many research areas, especially in agriculture. Such experiments are analyzed using the well-established theory of generalized linear models (McCullagh & Nelder, 1989; Demetrio, 2001), in which the responses are independent. If the estimated variance is greater than the expected variance, the dispersion parameter is estimated and included in the parameter estimation process. When the response variable is observed over time, the observations may be correlated, and this must be taken into account in the parameter estimation. One way of dealing with this correlation is to apply generalized estimating equations (GEEs), discussed by Liang & Zeger (1986), although in this case the interest lies in the estimates of the fixed effects, and the working correlation matrix is included to obtain a better fit. Another alternative is to include a latent effect in the linear predictor to capture variability not considered in the model that might influence the results. In this work, the random effect and the dispersion parameter are combined and included together in the parameter estimation.
This methodology is applied to a data set from an experiment with camu-camu whose aim was to evaluate, through the proportion of seedlings with successful grafts, which grafting methods and rootstock types are best suited for use. Several models are fitted, from the split-plot model (assuming independence) to the model in which the dispersion parameter and the random effect are considered together. There is evidence that the model including both the random effect and the dispersion parameter produces better parameter estimates. A second longitudinal data set comes from an experiment with MON810 transgenic corn in which the response variable is the number of caterpillars (Spodoptera frugiperda). In this case, owing to the excess of zero responses, the zero-inflated Poisson (ZIP) regression model is used, in addition to the standard Poisson model, in which observations are considered independent, and the zero-inflated Poisson regression model with a random effect. The results show that the random effect included in the linear predictor was not significant, and the adopted model is therefore the zero-inflated Poisson regression model. The results were obtained using the NLMIXED, GENMOD and GPLOT procedures of SAS (Statistical Analysis System), version 8.2.
Keywords: longitudinal data analysis; binomial distribution; Poisson distribution; EM algorithm; generalized linear mixed models; generalized linear models; SAS (computer program)
