1

Likelihood Inference for Type I Bivariate Polya-Aeppli Distribution

Ye, Yang 11 1900 (has links)
The Poisson distribution is commonly used in analyzing count data, and many insurance companies are interested in studying the related risk models and ruin probability theory. Over the past century, many different bivariate models have been developed in the literature. The bivariate Poisson distribution was first introduced by Campbell (1934) for modelling bivariate accident data. However, a given dataset may exhibit over-dispersion relative to the Poisson distribution, which motivated researchers to develop alternative models for such situations. In this regard, Minkova and Balakrishnan (2014a) developed the Type I bivariate Polya-Aeppli distribution by compounding with geometric random variables and using the trivariate reduction method. Inference for this Type I bivariate Polya-Aeppli distribution is the topic of this thesis. The parameters of a model describe and summarize a given sample within a specific distribution, so their estimation becomes important; the goal of estimation theory is to find estimators of the parameters of interest that have good properties. There exist many methods of finding estimators, such as the Method of Moments, Bayesian estimators, Least Squares, and Maximum Likelihood Estimators (MLEs). Each method of estimation has its own strengths and weaknesses (Casella and Berger (2008)). Minkova and Balakrishnan (2014a) discussed the moment estimation of the parameters of the Type I bivariate Polya-Aeppli distribution. In this thesis, we develop the likelihood inference for this model. A simulation study is carried out with various parameter settings. The obtained results show that the MLEs require more computational time than Moment estimation. However, the Method of Moments (MoM) did not produce good estimates in all the simulation settings, and in terms of mean squared error and bias, the MLEs performed better than MoM in most settings. Finally, we apply the Type I bivariate Polya-Aeppli model to a real dataset containing the frequencies of railway accidents in two successive six-year periods. We also carry out some hypothesis tests using the Wald test statistic. From these results, we conclude that the two variables follow the same univariate Polya-Aeppli distribution but are correlated. / Thesis / Master of Science (MSc)
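
As a rough illustration of the construction described in this abstract (not necessarily the exact parameterization of Minkova and Balakrishnan (2014a)), the sketch below simulates a pair with Type I bivariate Polya-Aeppli-style dependence by trivariate reduction: each component is a compound Poisson sum of shifted-geometric variables, and a shared component N0 induces the correlation. The parameter names lam0, lam1, lam2 and rho are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def rpolya_aeppli(lam, rho, size, rng):
    """Polya-Aeppli variates: a Poisson(lam) number of shifted-geometric
    (support 1, 2, ...) summands with success probability 1 - rho."""
    counts = rng.poisson(lam, size)
    out = np.zeros(size, dtype=int)
    for i, k in enumerate(counts):
        if k > 0:
            out[i] = rng.geometric(1.0 - rho, k).sum()
    return out

# Illustrative trivariate reduction: X1 = N1 + N0, X2 = N2 + N0, with N0, N1, N2
# independent Polya-Aeppli variables sharing the same rho.
lam0, lam1, lam2, rho = 0.5, 1.0, 1.5, 0.3   # hypothetical parameter values
m = 10_000
n0 = rpolya_aeppli(lam0, rho, m, rng)
x1 = rpolya_aeppli(lam1, rho, m, rng) + n0
x2 = rpolya_aeppli(lam2, rho, m, rng) + n0

# Sample moments that a method-of-moments fit would match; a likelihood fit would
# instead maximize the joint probability mass function numerically.
print("sample means:", x1.mean(), x2.mean())
print("sample covariance:", np.cov(x1, x2)[0, 1])
```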
2

Aspects of Composite Likelihood Inference

Jin, Zi 07 March 2011 (has links)
A composite likelihood consists of a combination of valid likelihood objects, and in particular it is of typical interest to adopt lower-dimensional marginal likelihoods. Composite marginal likelihood appears to be an attractive alternative for modeling complex data, and has received increasing attention for handling high-dimensional data sets when the joint distribution is computationally difficult to evaluate, or intractable due to a complex dependence structure. We present some aspects of methodological development in composite likelihood inference. The resulting estimator enjoys desirable asymptotic properties such as consistency and asymptotic normality. Composite likelihood based test statistics and their asymptotic distributions are summarized. Higher-order asymptotic properties of the signed composite likelihood root statistic are explored. Moreover, we aim to compare the accuracy and efficiency of composite likelihood estimation relative to estimation based on the ordinary likelihood. Analytical and simulation results are presented for different models, including multivariate normal distributions, time series models, and correlated binary data.
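
As a toy sketch of the pairwise marginal composite likelihood idea (not code from the thesis), the example below estimates a common exchangeable correlation in a multivariate normal model by summing bivariate normal log-densities over all variable pairs; the means and variances are treated as known for simplicity, and all settings (d, n, rho_true) are assumptions made for illustration.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)

# Toy data: d-variate normal with exchangeable correlation rho_true.
d, n, rho_true = 5, 500, 0.4
cov_true = np.full((d, d), rho_true) + (1 - rho_true) * np.eye(d)
x = rng.multivariate_normal(np.zeros(d), cov_true, size=n)

def pairwise_neg_cl(rho):
    """Negative pairwise composite log-likelihood for a common correlation rho,
    treating the means (0) and variances (1) as known for simplicity."""
    pair_dist = multivariate_normal(mean=np.zeros(2), cov=[[1.0, rho], [rho, 1.0]])
    total = 0.0
    for j, k in combinations(range(d), 2):          # sum over all bivariate margins
        total += pair_dist.logpdf(x[:, [j, k]]).sum()
    return -total

res = minimize_scalar(pairwise_neg_cl, bounds=(-0.2, 0.95), method="bounded")
print("pairwise composite likelihood estimate of rho:", res.x)
```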
3

Information Matrices in Estimating Function Approach: Tests for Model Misspecification and Model Selection

Zhou, Qian January 2009 (has links)
Estimating functions have been widely used for parameter estimation in various statistical problems. Regular estimating functions produce parameter estimators which have desirable properties, such as consistency and asymptotic normality. In quasi-likelihood inference, an important example of estimating functions, correct specification of the first two moments of the underlying distribution leads to information unbiasedness, which states that the two forms of the information matrix, namely the negative sensitivity matrix (the negative expectation of the first-order derivative of an estimating function) and the variability matrix (the variance of an estimating function), are equal; in other words, the analogue of the Fisher information is equivalent to the Godambe information. Consequently, information unbiasedness implies that the model-based covariance matrix estimator and the sandwich covariance matrix estimator are equivalent. By comparing the model-based and sandwich variance estimators, we propose information ratio (IR) statistics for testing misspecification of the variance/covariance structure under a correctly specified mean structure, in the context of linear regression models, generalized linear regression models and generalized estimating equations. Asymptotic properties of the IR statistics are discussed. In addition, through intensive simulation studies, we show that the IR statistics are powerful in various applications: testing for heteroscedasticity in linear regression models, testing for overdispersion in count data, and testing for a misspecified variance function and/or a misspecified working correlation structure. Moreover, the IR statistics appear more powerful than the classical information matrix test proposed by White (1982). In the literature, model selection criteria have been discussed intensively, but almost all of them target choosing the optimal mean structure. In this thesis, two model selection procedures are proposed for selecting the optimal variance/covariance structure among a collection of candidate structures. One is based on a sequence of IR tests over all the competing variance/covariance structures. The other is based on an "information discrepancy criterion" (IDC), which provides a measure of the discrepancy between the negative sensitivity matrix and the variability matrix. In fact, this IDC characterizes the relative efficiency loss incurred when using a certain candidate variance/covariance structure, compared with the true but unknown structure. Through simulation studies and analyses of two data sets, it is shown that the two proposed model selection methods both have a high rate of detecting the true/optimal variance/covariance structure. In particular, since the IDC magnifies the differences among the competing structures, it is highly sensitive in detecting the most appropriate variance/covariance structure.
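
To make the model-based versus sandwich comparison concrete, the sketch below fits a Poisson mean model to overdispersed counts and computes the sensitivity and variability matrices; their disagreement is the signal that IR-type statistics exploit. This is only an illustration of the ingredients, not the thesis's IR statistic itself, and the simulation settings are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Overdispersed counts: negative-binomial data fitted with a Poisson mean model,
# so the model-based and sandwich covariance estimators should disagree.
n = 2000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.3])
mu = np.exp(x @ beta_true)
y = rng.negative_binomial(2.0, 2.0 / (2.0 + mu))   # mean mu, variance mu + mu**2 / 2

# Solve the Poisson score equations by Newton-Raphson.
beta = np.zeros(2)
for _ in range(25):
    m = np.exp(x @ beta)
    beta += np.linalg.solve(x.T @ (x * m[:, None]), x.T @ (y - m))

m = np.exp(x @ beta)
sensitivity = x.T @ (x * m[:, None])                # negative derivative of the score
u = x * (y - m)[:, None]                            # per-observation score contributions
variability = u.T @ u                               # empirical variance of the score

model_based = np.linalg.inv(sensitivity)                   # Fisher-type covariance
sandwich = model_based @ variability @ model_based         # Godambe-type covariance
print("variance ratios (sandwich / model-based):",
      np.diag(sandwich) / np.diag(model_based))
```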
4

Exact likelihood inference for multiple exponential populations under joint censoring

Su, Feng 04 1900 (has links)
The joint censoring scheme is of practical significance when conducting comparative life-tests of products from different units within the same facility. In this thesis, we derive the exact distributions of the maximum likelihood estimators (MLEs) of the unknown parameters when joint censoring of some form is present among the multiple samples, and then discuss the construction of exact confidence intervals for the parameters. We develop inferential methods based on four different joint censoring schemes. The first one is when a jointly Type-II censored sample arising from $k$ independent exponential populations is available. The second one is when a jointly progressively Type-II censored sample is available, while the last two cases correspond to jointly Type-I hybrid censored and jointly Type-II hybrid censored samples. For each of these cases, we derive the conditional MLEs of the $k$ exponential mean parameters and their conditional moment generating functions and exact densities, using which we then develop exact confidence intervals for the $k$ population parameters. Furthermore, approximate confidence intervals based on the asymptotic normality of the MLEs, parametric bootstrap intervals, and credible confidence regions from a Bayesian viewpoint are all discussed. An empirical evaluation of all these methods of confidence intervals is also made in terms of coverage probabilities and average widths. Finally, we present examples in order to illustrate all the methods of inference developed here for different joint censoring scenarios. / Doctor of Science (PhD)
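
As a minimal single-sample ingredient of these schemes (the thesis itself treats joint censoring across $k$ samples and derives exact conditional distributions), the sketch below computes the standard MLE of an exponential mean under ordinary Type-II censoring, where only the r smallest of n lifetimes are observed; the numerical settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def type2_censored_exp_mle(lifetimes, r):
    """MLE of the exponential mean under Type-II censoring: only the r smallest of
    n lifetimes are observed, the remaining n - r are censored at the r-th failure."""
    n = len(lifetimes)
    ordered = np.sort(lifetimes)
    observed, censor_time = ordered[:r], ordered[r - 1]
    total_time_on_test = observed.sum() + (n - r) * censor_time
    return total_time_on_test / r

theta_true, n, r = 10.0, 40, 25   # hypothetical settings
sample = rng.exponential(theta_true, n)
print("MLE of the exponential mean:", type2_censored_exp_mle(sample, r))
```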
5

Likelihood Inference for Left Truncated and Right Censored Lifetime Data

Mitra, Debanjan 04 1900 (has links)
Left truncation arises because, in many situations, failure of a unit is observed only if it fails after a certain period. In addition, the units under study may not be followed until all of them fail, and the experimenter may have to stop at a certain time when some of the units are still working; this introduces right censoring into the data. Some commonly used lifetime distributions are the lognormal, Weibull and gamma, all of which are special cases of the flexible generalized gamma family. Likelihood inference via the Expectation Maximization (EM) algorithm is used to estimate the model parameters of the lognormal, Weibull, gamma and generalized gamma distributions, based on left truncated and right censored data. The asymptotic variance-covariance matrices of the maximum likelihood estimates (MLEs) are derived using the missing information principle. Using the asymptotic variances and the asymptotic normality of the MLEs, asymptotic confidence intervals for the parameters are constructed. For comparison purposes, the Newton-Raphson (NR) method is also used for the parameter estimation, and asymptotic confidence intervals corresponding to the NR method and the parametric bootstrap are also obtained. Through Monte Carlo simulations, the performance of all these methods of inference is studied. With regard to prediction analysis, the probability that a right censored unit will still be working at a future year is estimated, and an asymptotic confidence interval for this probability is then derived by the delta method. All the methods of inference developed here are illustrated with some numerical examples. / Doctor of Philosophy (PhD)
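
As a rough cross-check of the lognormal case (the thesis uses the EM algorithm with the missing information principle), the sketch below maximizes the left-truncated and right-censored lognormal log-likelihood directly; the truncation and censoring mechanisms and all numerical settings are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import lognorm

rng = np.random.default_rng(4)

# Simulated lognormal lifetimes with left truncation (only units failing after tau
# enter the study) and right censoring at c.
mu_true, sigma_true, n = 2.0, 0.6, 800
t = rng.lognormal(mu_true, sigma_true, n)
tau = rng.uniform(0.0, 5.0, n)
keep = t > tau
t, tau = t[keep], tau[keep]
c = tau + rng.uniform(5.0, 15.0, len(t))
observed = t <= c
y = np.where(observed, t, c)

def neg_loglik(par):
    mu, log_sigma = par
    sigma = np.exp(log_sigma)
    logf = lognorm.logpdf(y, s=sigma, scale=np.exp(mu))
    logS = lognorm.logsf(y, s=sigma, scale=np.exp(mu))
    logS_tau = lognorm.logsf(tau, s=sigma, scale=np.exp(mu))
    # Failure at y: log f(y) - log S(tau); censored at y: log S(y) - log S(tau).
    return -np.sum(np.where(observed, logf, logS) - logS_tau)

res = minimize(neg_loglik, x0=np.array([1.0, 0.0]), method="Nelder-Mead")
print("MLE of (mu, sigma):", res.x[0], np.exp(res.x[1]))
```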
6

Outils et modèles pour l'étude de quelques risques spatiaux et en réseaux : application aux extrêmes climatiques et à la contagion en finance / Tools and models for the study of some spatial and network risks : application to climate extremes and contagion in finance

Koch, Erwan 02 July 2014 (has links)
This thesis aims at developing tools and models that are relevant for the study of some spatial risks and risks in networks. It is divided into five chapters. The first one is a general introduction containing the state of the art related to each study as well as the main results. Chapter 2 develops a new multi-site precipitation generator. It is crucial to have models able to produce statistically realistic precipitation series. Whereas the models previously introduced in the literature mainly deal with daily precipitation, we develop an hourly model. It involves only one equation and thus introduces dependence between occurrence and intensity, two processes often considered independent in the literature. The model contains a common factor taking large-scale atmospheric conditions into account and a multivariate autoregressive contagion term accounting for the local propagation of rainfall. Despite its relative simplicity, the model reproduces observed intensities, lengths of dry periods and the spatial dependence structure very well in the case of Northern Brittany. In Chapter 3, we propose an estimation method for max-stable processes, based on simulated likelihood techniques. Max-stable processes are ideally suited for the statistical modeling of spatial extremes, but their inference is difficult: the multivariate density function has no explicit form, so standard likelihood-based estimation methods cannot be applied. Under appropriate assumptions, our estimator is efficient as both the number of temporal observations and the number of simulation draws tend to infinity. This simulation-based approach can be used for many classes of max-stable processes and can provide better results than current composite likelihood methods, especially when only a few temporal observations are available and the spatial dependence is strong.
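
To illustrate the generic simulated-likelihood idea on a toy model (the actual estimator for max-stable processes is considerably more involved), the sketch below pretends the model density is unavailable, approximates it by a kernel density estimate built from simulated draws, and maximizes that approximation over a parameter grid; the Gamma model and all settings are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)

# Pretend the density of the data model (here simply Gamma(shape=theta, scale=1))
# is unavailable in closed form.
theta_true = 3.0
data = rng.gamma(theta_true, 1.0, size=200)

def simulated_loglik(theta, n_sim=20_000):
    # Common random numbers across theta values keep the approximation smooth in theta.
    draws = np.random.default_rng(123).gamma(theta, 1.0, size=n_sim)
    kde = gaussian_kde(draws)                          # smoothed density of the simulations
    return np.sum(np.log(np.maximum(kde(data), 1e-300)))

grid = np.linspace(2.0, 4.0, 41)
values = [simulated_loglik(th) for th in grid]
print("simulated-likelihood estimate of theta:", grid[int(np.argmax(values))])
```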
7

Bayesian Methods Under Unknown Prior Distributions with Applications to The Analysis of Gene Expression Data

Rahal, Abbas 14 July 2021 (has links)
The local false discovery rate (LFDR) is one of many existing statistical methods for analyzing multiple hypothesis testing. As a Bayesian quantity, the LFDR is based on the prior probability of the null hypothesis and a mixture distribution over the null and non-null hypotheses. In practice, the LFDR is unknown and needs to be estimated. The empirical Bayes approach can be used to estimate that mixture distribution. Empirical Bayes does not require complete information about the prior and hyperprior distributions, as hierarchical Bayes does. When we do not have enough information at the prior level, then instead of placing a distribution at the hyperprior level of a hierarchical Bayes model, empirical Bayes estimates the prior parameters from the data, often via the marginal distribution. In this research, we developed new Bayesian methods under an unknown prior distribution. A set of adequate prior distributions may be defined using Bayesian model checking, by setting a threshold on the posterior predictive p-value, prior predictive p-value, calibrated p-value, Bayes factor, or integrated likelihood. We derive a set of adequate posterior distributions from that set. In order to obtain a single posterior distribution instead of a set of adequate posterior distributions, we used a blended distribution, which minimizes the relative entropy of a set of adequate prior (or posterior) distributions to a "benchmark" prior (or posterior) distribution. We present two approaches to generating a blended posterior distribution, namely, updating-before-blending and blending-before-updating. The blended posterior distribution can be used to estimate the LFDR by considering the nonlocal false discovery rate as the benchmark and the different LFDR estimators as the adequate set. The likelihood ratio can often be misleading in multiple testing, unless it is supplemented by adjusted p-values or posterior probabilities based on sufficiently strong prior distributions. In the case of unknown prior distributions, they can be estimated by empirical Bayes methods or blended distributions. We propose a general framework for applying the laws of likelihood to problems involving multiple hypotheses by bringing together multiple statistical models. We have applied the proposed framework to data sets from genomics, COVID-19, and other areas.
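
As a minimal sketch of the standard two-groups empirical Bayes LFDR that this work builds on (not the blended-distribution estimator proposed in the thesis), the example below estimates lfdr(z) = pi0 * f0(z) / f(z), with f0 the theoretical null density and f a kernel density estimate of the observed z-values; pi0 is treated as known purely for illustration.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(6)

# Two-groups model: most z-values are null N(0, 1), a few are non-null N(3, 1).
n, pi0 = 5000, 0.9
is_null = rng.random(n) < pi0
z = np.where(is_null, rng.normal(0.0, 1.0, n), rng.normal(3.0, 1.0, n))

# Empirical Bayes LFDR: lfdr(z) = pi0 * f0(z) / f(z), with the mixture density f
# estimated by a kernel density estimate of all z-values.
f_hat = gaussian_kde(z)
lfdr = np.clip(pi0 * norm.pdf(z) / f_hat(z), 0.0, 1.0)

discoveries = lfdr < 0.2
print("discoveries:", discoveries.sum(),
      "| true non-nulls among them:", int((~is_null[discoveries]).sum()))
```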
