111 |
Diagnóstico de influência em modelos com erros na variável skew-normal/independente / Influence diagnostics in skew-normal/independent measurement error models. Carvalho, Rignaldo Rodrigues. 17 August 2018 (has links)
Advisors: Victor Hugo Lachos Dávila, Filidor Edilfonso Vilca Labra / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Made available in DSpace on 2018-08-17 (GMT)
Previous issue date: 2010 / Abstract: The Barnett measurement model is frequently used to compare several measuring devices, and it is common to assume that the random terms have a normal distribution. Such an assumption, however, makes the inference vulnerable to outlying observations, whereas scale mixtures of skew-normal distributions are an attractive alternative for producing robust estimates while keeping the elegance and simplicity of maximum likelihood theory. We use results from Lachos et al. (2008) to obtain maximum likelihood estimates of the parameters via the EM algorithm, which yields closed-form expressions for the M-step equations. We then develop the local influence method of Zhu and Lee (2001) to assess the robustness of these estimates under some usual perturbation schemes. The results are applied to data sets that are well studied in the literature, illustrating the usefulness of the proposed methodology. / Master's / Statistical Methods / Master in Statistics
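The abstract above mentions an EM algorithm with closed-form M-step equations for scale-mixture models. The full skew-normal/independent setting involves additional latent truncated-normal variables; as a minimal illustration of the same mechanism (the function name, tuning values, and data are my own assumptions, not the thesis's), the sketch below runs EM for the location and scale of a Student-t sample, the simplest symmetric scale mixture of normals, where the E-step reduces to computing mixing weights and the M-step to weighted means:

```python
import numpy as np

def t_location_scale_em(x, nu=4.0, n_iter=50):
    """EM for the location/scale of a Student-t sample, viewed as a
    scale mixture of normals: closed-form E- and M-steps."""
    mu, sigma2 = np.median(x), np.var(x)
    for _ in range(n_iter):
        # E-step: expected mixing weights given the current (mu, sigma2)
        d2 = (x - mu) ** 2 / sigma2
        w = (nu + 1.0) / (nu + d2)
        # M-step: weighted mean and weighted variance (closed form)
        mu = np.sum(w * x) / np.sum(w)
        sigma2 = np.mean(w * (x - mu) ** 2)
    return mu, sigma2

rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(5.0, 1.0, 300), [60.0, 80.0]])  # two gross outliers
mu_hat, _ = t_location_scale_em(sample)
print(mu_hat)  # stays close to 5 despite the outliers; the sample mean does not
```

Because the weights downweight observations far from the current location estimate, the fit stays close to the bulk of the data even when gross outliers are present, which is exactly the robustness argument the abstract makes.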
|
112 |
Misturas finitas de misturas de escala skew-normal / Mixture modelling using scale mixtures of skew-normal distributions. Basso, Rodrigo Marreiro. 03 December 2009 (has links)
Advisor: Victor Hugo Lachos Davila / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Made available in DSpace on 2018-08-13 (GMT)
Previous issue date: 2009 / Abstract: In this work we consider a flexible class of models based on finite mixtures of multivariate scale mixtures of skew-normal distributions. An EM-type algorithm is employed to iteratively compute maximum likelihood estimates, and is discussed with emphasis on finite mixtures of skew-normal, skew-t, skew-slash and skew-contaminated normal distributions. A general information-based method for approximating the asymptotic covariance matrix of the maximum likelihood estimates is also presented. Results from the analysis of four real data sets illustrate the usefulness of the proposed methodology. / Master's / Master in Statistics
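For readers unfamiliar with EM for finite mixtures, the following sketch implements the two-component univariate Gaussian special case. Skew-normal components add latent truncation variables on top of this, so this is only the skeleton the thesis builds on; the quantile initialisation and fixed iteration count are ad hoc choices of mine:

```python
import numpy as np
from scipy.stats import norm

def gmm2_em(x, n_iter=100):
    """EM for a two-component univariate Gaussian mixture."""
    # crude initialisation from the data quantiles
    pi = 0.5
    mu = np.quantile(x, [0.25, 0.75])
    sd = np.array([x.std(), x.std()])
    for _ in range(n_iter):
        # E-step: responsibilities of component 0
        p0 = pi * norm.pdf(x, mu[0], sd[0])
        p1 = (1 - pi) * norm.pdf(x, mu[1], sd[1])
        r = p0 / (p0 + p1)
        # M-step: weighted proportion, means and standard deviations
        pi = r.mean()
        mu = np.array([np.sum(r * x) / r.sum(),
                       np.sum((1 - r) * x) / (1 - r).sum()])
        sd = np.sqrt(np.array([np.sum(r * (x - mu[0]) ** 2) / r.sum(),
                               np.sum((1 - r) * (x - mu[1]) ** 2) / (1 - r).sum()]))
    return pi, mu, sd

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 400), rng.normal(3, 1, 600)])
pi_hat, mu_hat, sd_hat = gmm2_em(x)
```

On this well-separated simulated sample the algorithm recovers the component means near -3 and 3 and the mixing proportion near 0.4.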
|
113 |
Apprentissage à partir de données et de connaissances incertaines : application à la prédiction de la qualité du caoutchouc / Learning from uncertain data and knowledge: application to natural rubber quality prediction. Sutton-Charani, Nicolas. 28 May 2014 (has links)
When learning predictive models, the quality of the available data is essential for the reliability of the resulting predictions. In practice, learning data are very often imperfect or uncertain (imprecise, noisy, etc.). This PhD thesis is set in this context, in which the theory of belief functions is used to adapt standard statistical tools to uncertain data. The chosen predictive model is the decision tree, a basic classifier in artificial intelligence that is usually built from precise data. The aim of the main methodology developed in this thesis is to generalise decision trees to uncertain data (fuzzy, probabilistic, missing, etc.) in both input and output. The central tool for this extension is a likelihood adapted to belief functions, recently proposed in the literature, whose properties are studied here in depth. To estimate the parameters of a decision tree, this likelihood is maximised via the E2M algorithm, which extends the EM algorithm to belief functions. The resulting methodology, E2M decision trees, is then applied to a real case: the prediction of natural rubber quality. The learning data, mainly related to cultivation and climate, contain many uncertainties, which are modelled by belief functions adapted to these imperfections. After a standard descriptive statistical study of the data, E2M decision trees are built and evaluated against classical decision trees. Taking the data uncertainty into account improves predictive accuracy only slightly, but above all it provides information about variables that rubber experts have so far given little attention.
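The E2M algorithm itself operates on belief functions; as a heavily simplified illustration of the underlying idea (maximising a likelihood when outputs are only partially known), the sketch below runs EM for a two-class Gaussian model with known unit variances in which some labels are set-valued. The function, data, and label encoding are hypothetical and far cruder than the evidential likelihood the thesis studies:

```python
import numpy as np
from scipy.stats import norm

def coarse_label_em(x, label_sets, n_iter=100):
    """EM for a two-class Gaussian model (unit variances) where each
    observation carries a *set* of plausible labels: {0}, {1}, or
    {0, 1} for 'unknown'. A crude special case of imprecise labels."""
    pi, mu = 0.5, np.array([x.min(), x.max()])
    mask = np.array([[0 in s, 1 in s] for s in label_sets], dtype=float)
    for _ in range(n_iter):
        # E-step: responsibilities restricted to the plausible labels
        p = np.column_stack([pi * norm.pdf(x, mu[0], 1.0),
                             (1 - pi) * norm.pdf(x, mu[1], 1.0)]) * mask
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: weighted proportion and class means
        pi = r[:, 0].mean()
        mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    return pi, mu

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 150)])
sets = [{0}] * 100 + [{0, 1}] * 50 + [{1}] * 100 + [{0, 1}] * 50  # one third coarse
pi_hat, mu_hat = coarse_label_em(x, sets)
```

Points with a fully unknown label still contribute to the fit through their feature value, which is the general reason uncertain outputs remain informative in this family of methods.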
|
114 |
Modelos para dados censurados sob a classe de distribuições misturas de escala skew-normal / Censored regression models under the class of scale mixtures of skew-normal distributions. Massuia, Monique Bettio, 1989-. 03 June 2015 (has links)
Advisor: Víctor Hugo Lachos Dávila / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Made available in DSpace on 2018-08-26 (GMT)
Previous issue date: 2015 / Abstract: This work presents the linear regression model with a censored response variable under the class of scale mixtures of skew-normal (SMSN) distributions, generalising the well-known Tobit model by providing more robust alternatives to the normal distribution. A classical inference study is developed for these censored models under two special cases of this family, the normal and the Student-t, using the EM algorithm to obtain maximum likelihood estimates of the model parameters and developing global and local influence diagnostics based on the methodology of Cook (1986) and Poon & Poon (1999). Under the Bayesian approach, the censored regression model is studied under several special cases of the SMSN class: the normal, the Student-t, the skew-normal, the skew-t and the skew-slash. In this setting, the Gibbs sampler is the main tool for inference about the model parameters. We also present simulation studies evaluating the developed methodology, which is finally applied to two real data sets. The R packages SMNCensReg, CensRegMod and BayesCR give computational support to this work. / Master's / Statistics / Master in Statistics
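As a point of reference for the models above, the classical normal Tobit model can be fitted by direct maximisation of its censored log-likelihood. The thesis instead uses EM and the heavier-tailed SMSN members; this sketch, including the function and parameter names, is an assumption of mine and covers only the normal special case:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def tobit_mle(y, X):
    """ML fit of a left-censored-at-zero Tobit model with normal errors:
    y* = X @ beta + e, e ~ N(0, sigma^2), observed y = max(y*, 0)."""
    cens = y <= 0

    def negloglik(theta):
        beta, log_s = theta[:-1], theta[-1]
        s = np.exp(log_s)
        xb = X @ beta
        ll = np.where(
            cens,
            norm.logcdf(-xb / s),                # censored: P(y* <= 0)
            norm.logpdf((y - xb) / s) - log_s,   # uncensored: normal density
        )
        return -ll.sum()

    # least-squares start (biased under censoring, but a usable initial value)
    theta0 = np.append(np.linalg.lstsq(X, y, rcond=None)[0], 0.0)
    res = minimize(negloglik, theta0, method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])

rng = np.random.default_rng(3)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y_star = X @ np.array([0.5, 1.0]) + rng.normal(scale=1.0, size=n)
y = np.maximum(y_star, 0.0)   # roughly a third of the responses are censored
beta_hat, sigma_hat = tobit_mle(y, X)
```

Parameterising the scale as exp(log_s) keeps the optimisation unconstrained, a standard trick when handing a likelihood to a generic optimiser.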
|
115 |
An Analysis of Markov Regime-Switching Models for Weather Derivative Pricing. Gerdin Börjesson, Fredrik. January 2021 (has links)
The valuation of weather derivatives depends greatly on accurate modeling and forecasting of the underlying temperature indices. The complexity and uncertainty of such modeling have led to several temperature processes being developed for the Monte Carlo simulation of daily average temperatures. In this report, we aim to compare the results of two recently developed models by Gyamerah et al. (2018) and Evarest, Berntsson, Singull, and Yang (2018). The paper gives a thorough introduction to option theory, Lévy and Wiener processes, and the generalized hyperbolic distributions frequently used in temperature modeling. Maximum likelihood estimation is implemented to fit the Lévy process distributions, and the expectation-maximization algorithm with Kim's smoothed transition probabilities is implemented to fit both models' parameters. Both models are then considered for pricing European HDD and CDD options by Monte Carlo simulation. When evaluated on three data sets, the estimation shows a tendency toward the shifted temperature regime over the base regime, in contrast to the two articles. Simulation is successfully demonstrated for the model of Evarest et al.; however, the model of Gyamerah et al. could not be replicated. We conclude that this is because the two articles contain several incorrect derivations, which is why the thesis question is left unanswered and the articles' conclusions are called into question. We end by proposing further validation of the two models and summarize the alterations required for a correct implementation.
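The filtering and smoothing machinery referred to above can be illustrated compactly. Assuming the regime parameters are known (in the report they are estimated with EM), the Hamilton filter computes P(S_t | y_1..t) and Kim's backward recursion turns these into smoothed probabilities P(S_t | y_1..T). A minimal two-regime Gaussian sketch, with simulated data and parameter values of my own choosing:

```python
import numpy as np
from scipy.stats import norm

def hamilton_filter_kim_smoother(y, P, mus, sigmas):
    """Filtered and smoothed regime probabilities for a Gaussian
    regime-switching model with known parameters.
    P[i, j] = P(S_{t+1} = j | S_t = i)."""
    T, K = len(y), P.shape[0]
    lik = norm.pdf(y[:, None], mus[None, :], sigmas[None, :])
    filt, pred = np.empty((T, K)), np.empty((T, K))
    pred[0] = np.full(K, 1.0 / K)          # uniform initial regime distribution
    for t in range(T):
        if t > 0:
            pred[t] = P.T @ filt[t - 1]    # one-step-ahead prediction
        num = pred[t] * lik[t]
        filt[t] = num / num.sum()          # Bayes update (Hamilton filter)
    smooth = np.empty((T, K))              # Kim backward smoothing recursion
    smooth[-1] = filt[-1]
    for t in range(T - 2, -1, -1):
        smooth[t] = filt[t] * (P @ (smooth[t + 1] / pred[t + 1]))
    return filt, smooth

rng = np.random.default_rng(4)
P = np.array([[0.95, 0.05], [0.05, 0.95]])  # persistent two-regime chain
states = [0]
for _ in range(399):
    states.append(rng.choice(2, p=P[states[-1]]))
states = np.array(states)
y = rng.normal(np.where(states == 0, -2.0, 2.0), 1.0)
filt, smooth = hamilton_filter_kim_smoother(
    y, P, np.array([-2.0, 2.0]), np.array([1.0, 1.0]))
```

Inside an EM fit, these smoothed probabilities play the role of the E-step weights from which the regime means, variances, and transition probabilities are re-estimated.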
|
116 |
Statistical inference of time-dependent data. Suhas Gundimeda (5930648). 11 May 2020
Probabilistic graphical modeling is a framework for succinctly representing multivariate probability distributions of time series in terms of each time series's dependence on the others. In general, it is computationally prohibitive to statistically infer an arbitrary model from data. However, if we constrain the model to have a tree topology, the corresponding learning algorithms become tractable. The expressive power of tree-structured distributions is low, since only n - 1 dependencies are explicitly encoded for an n-node tree. One way to improve the expressive power of tree models is to combine many of them in a mixture model. This work presents, and uses simulations to validate, extensions of the standard mixtures-of-trees model for i.i.d. data to the setting of time series data. We also consider the setting where the tree mixture itself forms a hidden Markov chain, which could be better suited to approximating time-varying seasonal data in the real world. Both extensions are evaluated on artificial data sets.
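The tractable tree-learning step mentioned above is classically the Chow-Liu algorithm: estimate pairwise mutual information and take a maximum-weight spanning tree over it. A sketch for binary data follows (the thesis works with mixtures of such trees; this shows only the single-tree building block, with toy data of my own construction):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def chow_liu_edges(X):
    """Chow-Liu tree for binary data: empirical pairwise mutual
    information as edge weights, then a maximum-weight spanning tree
    (weights are negated before calling scipy's minimum_spanning_tree)."""
    n, d = X.shape
    mi = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            m = 0.0
            for a in (0, 1):
                for b in (0, 1):
                    p_ab = np.mean((X[:, i] == a) & (X[:, j] == b))
                    p_a, p_b = np.mean(X[:, i] == a), np.mean(X[:, j] == b)
                    if p_ab > 0:
                        m += p_ab * np.log(p_ab / (p_a * p_b))
            mi[i, j] = m
    tree = minimum_spanning_tree(-mi)  # negate: maximum-weight spanning tree
    return sorted(zip(*tree.nonzero()))

rng = np.random.default_rng(5)
z = rng.integers(0, 2, 5000)
# column 1 copies column 0 with 10% noise; column 2 is independent,
# so the learned tree should contain the edge (0, 1)
X = np.column_stack([z,
                     np.where(rng.random(5000) < 0.9, z, 1 - z),
                     rng.integers(0, 2, 5000)])
edges = chow_liu_edges(X)
```

The Chow-Liu tree maximises the likelihood over all tree-structured distributions, which is why it is the natural component model inside a mixture of trees.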
|
117 |
A Classification Tool for Predictive Data Analysis in Healthcare. Victors, Mason Lemoyne. 07 March 2013 (has links) (PDF)
Hidden Markov Models (HMMs) have seen widespread use in a variety of applications ranging from speech recognition to gene prediction. Although developed over forty years ago, they remain a standard tool for sequential data analysis. More recently, Latent Dirichlet Allocation (LDA) was developed and soon gained widespread popularity as a powerful topic analysis tool for text corpora. We thoroughly develop LDA and a generalization of HMMs and demonstrate their joint use in predictive data analysis for healthcare problems. While these two tools (LDA and HMM) have been used in conjunction previously, we use LDA in a new way to reduce the dimensionality involved in training HMMs. With both LDA and our extension of HMMs, we train classifiers to predict the development of Chronic Kidney Disease (CKD) in the near future.
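The dimensionality-reduction idea (LDA compressing each patient's event history into a short topic vector that then feeds a classifier) can be sketched with scikit-learn. The count matrix and outcome labels below are random stand-ins, and a plain logistic regression replaces the thesis's HMM-based classifier for brevity:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

# synthetic "medical event" count matrix: 200 patients x 50 event codes
rng = np.random.default_rng(6)
counts = rng.poisson(1.0, size=(200, 50))
labels = rng.integers(0, 2, 200)  # random stand-in outcome labels

# LDA reduces each patient's event history to a 5-dimensional topic mixture
lda = LatentDirichletAllocation(n_components=5, random_state=0)
theta = lda.fit_transform(counts)  # shape (200, 5); each row sums to 1

# the low-dimensional topic mixtures feed a downstream classifier
clf = LogisticRegression().fit(theta, labels)
```

Because each row of `theta` is a probability vector over topics, the downstream model sees 5 features instead of 50 event codes, which is the dimensionality reduction the abstract describes.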
|
118 |
Semi-supervised adverse drug reaction detection / Halvvägledd upptäckt av läkemedelsrelaterade biverkningar. Ohl, Louis. January 2021
Pharmacovigilance consists in carefully monitoring drugs in order to re-evaluate their risk for people's health. The sooner adverse drug reactions are detected, the sooner one can act on them. This thesis aims at discovering such reactions in electronic health records under the constraint of scarce annotated data, in order to replicate the scenario of the Regional Center for Pharmacovigilance of Nice. We investigate how, in a semi-supervised learning design, the unlabeled data can help improve classification scores. Results suggest an excellent recall in discovering adverse reactions and possible classification improvements under specific data distributions.
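One common semi-supervised design of the kind described, self-training, can be sketched with scikit-learn: a base classifier is fitted on the few labeled points and then iteratively pseudo-labels the unlabeled points it is confident about. The data here are synthetic stand-ins for the thesis's health-record features, and the labeled fraction and confidence threshold are arbitrary choices of mine:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.semi_supervised import SelfTrainingClassifier

# synthetic stand-in for labeled/unlabeled clinical-note features
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# hide 90% of the training labels: -1 marks "unlabeled" for scikit-learn
y_semi = y_train.copy()
mask = np.random.default_rng(0).random(len(y_semi)) < 0.9
y_semi[mask] = -1

# self-training: iteratively pseudo-label confidently classified points
clf = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
clf.fit(X_train, y_semi)
acc = clf.score(X_test, y_test)
```

The confidence threshold controls how aggressively pseudo-labels are admitted; too low a threshold lets early mistakes reinforce themselves, which is the classic failure mode of self-training.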
|
119 |
SEGMENTATION OF WHITE MATTER, GRAY MATTER, AND CSF FROM MR BRAIN IMAGES AND EXTRACTION OF VERTEBRAE FROM MR SPINAL IMAGES. PENG, ZHIGANG. 02 October 2006 (has links)
No description available.
|
120 |
Some Advanced Semiparametric Single-index Modeling for Spatially-Temporally Correlated Data. Mahmoud, Hamdy F. F. 09 October 2014
Semiparametric modeling is a hybrid of parametric and nonparametric modeling, in which some functional forms are known and others are unknown. In this dissertation, we make several contributions to semiparametric modeling based on the single index model, related to the following three topics: the first is a model for detecting change points simultaneously with estimating the unknown function; the second is two models for spatially correlated data; and the third is two models for spatially-temporally correlated data.
To address the first topic, we propose a unified approach that simultaneously estimates the nonlinear relationship and the change points. Our unified approach is a single index change point model that adjusts for several other covariates. We estimate the unknown function nonparametrically using kernel smoothing and also provide a permutation-based testing procedure to detect multiple change points, whose asymptotic properties we establish. The advantage of our approach is demonstrated using mortality data for Seoul, Korea, from January 2000 to December 2007.
On the second topic, we propose two semiparametric single index models for spatially correlated data. One additively separates the nonparametric function and the spatially correlated random effects, while the other does not. We estimate the two models using two algorithms based on the Markov chain expectation-maximization (MCEM) algorithm. Simulation comparisons suggest that the nonadditive semiparametric single index model provides more accurate estimates of the spatial correlation. The advantage of our approach is demonstrated using mortality data for six cities in Korea from January 2000 to December 2007.
The third topic involves two semiparametric single index models for spatially and temporally correlated data. In the first model, the nonparametric function separates from the spatially and temporally correlated random effects; we refer to it as the semiparametric spatio-temporal separable single index model (SSTS-SIM). The second model does not separate the nonparametric function from the spatially correlated random effects but does separate the temporal random effects; we refer to it as the semiparametric nonseparable single index model (SSTN-SIM). Two MCEM-based algorithms are introduced to simultaneously estimate the parameters, spatial effects, and time effects. The proposed models are then applied to mortality data for six major cities in Korea. Our results suggest that SSTN-SIM is more flexible than SSTS-SIM, because it can estimate a variety of nonparametric functions while SSTS-SIM enforces similar nonparametric curves. SSTN-SIM also provides better estimation and prediction. / Ph. D.
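The kernel-smoothing step shared by the approaches above estimates the unknown link g in y = g(x'beta) + error. Assuming the index direction beta is known (the dissertation estimates it jointly with the random effects via MCEM), the Nadaraya-Watson estimator is the basic building block; the bandwidth and simulated link below are assumptions of mine:

```python
import numpy as np

def nadaraya_watson(u_train, y_train, u_eval, h=0.2):
    """Gaussian-kernel Nadaraya-Watson estimate of g in y = g(u) + e,
    evaluated at the points u_eval, with bandwidth h."""
    w = np.exp(-0.5 * ((u_eval[:, None] - u_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(7)
beta = np.array([0.6, 0.8])            # index direction, ||beta|| = 1
X = rng.normal(size=(1000, 2))
u = X @ beta                           # the single index x'beta
y = np.sin(u) + rng.normal(scale=0.1, size=1000)

grid = np.linspace(-1.5, 1.5, 7)
g_hat = nadaraya_watson(u, y, grid)    # recovers sin on the grid
```

Profiling this smoother over candidate beta directions, and alternating it with the E-step for the random effects, is the general shape of the MCEM-based algorithms the dissertation develops.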
|