  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Multivariate ordinal regression models: an analysis of corporate credit ratings

Hirk, Rainer, Hornik, Kurt, Vana, Laura January 2018 (has links) (PDF)
Correlated ordinal data typically arise from multiple measurements on a collection of subjects. Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, we consider multivariate ordinal regression models with a latent variable specification and correlated error terms. Two different link functions are employed, by assuming a multivariate normal and a multivariate logistic distribution for the latent variables underlying the ordinal outcomes. Composite likelihood methods, more specifically the pairwise and tripletwise likelihood approach, are applied for estimating the model parameters. Using simulated data sets with varying numbers of subjects, we investigate the performance of the pairwise likelihood estimates and find them to be robust for both link functions given a reasonable sample size. The empirical application consists of an analysis of corporate credit ratings from the big three credit rating agencies (Standard & Poor's, Moody's and Fitch). Firm-level and stock price data for publicly traded US firms as well as an unbalanced panel of issuer credit ratings are collected and analyzed to illustrate the proposed framework.
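The pairwise likelihood described in this abstract can be sketched numerically. The Python below is a minimal toy illustration, not the authors' implementation: the thresholds, linear predictors, and midpoint quadrature are all assumptions, and only the probit (multivariate normal) link is shown. Each pairwise term is a rectangle probability of a latent bivariate normal.

```python
import math

def std_norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bvn_cdf(a, b, rho, n=400):
    """P(X <= a, Y <= b) for a standard bivariate normal with correlation
    |rho| < 1, via 1-D midpoint quadrature of phi(x) * Phi((b - rho x)/s)."""
    if a == -math.inf or b == -math.inf:
        return 0.0
    if a == math.inf and b == math.inf:
        return 1.0
    if a == math.inf:
        return std_norm_cdf(b)
    if b == math.inf:
        return std_norm_cdf(a)
    s = math.sqrt(1.0 - rho * rho)
    lo, hi = -8.0, min(a, 8.0)   # truncate the tail of the outer integral
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += std_norm_pdf(x) * std_norm_cdf((b - rho * x) / s)
    return total * h

def cell_prob(r1, r2, thr1, thr2, eta1, eta2, rho):
    """Pr(Y1 = r1, Y2 = r2) as a rectangle probability of the latent pair;
    thr lists are thresholds padded with -inf and +inf, eta = x'beta."""
    u1, l1 = thr1[r1 + 1] - eta1, thr1[r1] - eta1
    u2, l2 = thr2[r2 + 1] - eta2, thr2[r2] - eta2
    return (bvn_cdf(u1, u2, rho) - bvn_cdf(l1, u2, rho)
            - bvn_cdf(u1, l2, rho) + bvn_cdf(l1, l2, rho))

def pairwise_loglik(data, thr1, thr2, rho):
    """Pairwise log-likelihood over observations (r1, r2, eta1, eta2)."""
    return sum(math.log(cell_prob(r1, r2, thr1, thr2, e1, e2, rho))
               for r1, r2, e1, e2 in data)
```

For the logistic link, `bvn_cdf` would be replaced by a bivariate logistic distribution function; the rectangle-probability structure of each pairwise term stays the same.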
12

Multivariate Ordinal Regression Models: An Analysis of Corporate Credit Ratings

Hirk, Rainer, Hornik, Kurt, Vana, Laura 01 1900 (has links) (PDF)
Correlated ordinal data typically arise from multiple measurements on a collection of subjects. Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, we consider multivariate ordinal models with a latent variable specification and correlated error terms. Two different link functions are employed, by assuming a multivariate normal and a multivariate logistic distribution for the latent variables underlying the ordinal outcomes. Composite likelihood methods, more specifically the pairwise and tripletwise likelihood approach, are applied for estimating the model parameters. We investigate how sensitive the pairwise likelihood estimates are to the number of subjects and to the presence of observations missing completely at random, and find that these estimates are robust for both link functions given a reasonable sample size. The empirical application consists of an analysis of corporate credit ratings from the big three credit rating agencies (Standard & Poor's, Moody's and Fitch). Firm-level and stock price data for publicly traded US companies as well as an incomplete panel of issuer credit ratings are collected and analyzed to illustrate the proposed framework. / Series: Research Report Series / Department of Statistics and Mathematics
13

Speciation Time and Hybridization Under Multispecies Coalescent: Estimation and Hypothesis Testing

Peng, Jing January 2021 (has links)
No description available.
14

Statistical inference methods for Gibbs random fields

Stoehr, Julien 29 October 2015 (has links)
Due to the Markovian dependence structure, the normalizing constant of Markov random fields cannot be computed with standard analytical or numerical methods. This is a central issue for parameter inference and model selection, since computing the likelihood is an integral part of both procedures. When the Markov random field is directly observed, we propose to estimate the posterior distribution of the model parameters by replacing the likelihood with a composite likelihood, that is, a product of marginal or conditional distributions of the model that are easy to compute. Our first contribution is to correct the posterior distribution resulting from using this misspecified likelihood function by modifying the curvature at the mode, in order to avoid overly precise posterior parameters. In a second part, we perform model selection between hidden Markov random fields with approximate Bayesian computation (ABC) algorithms, which compare the observed data and many Monte Carlo simulations through summary statistics. To make up for the absence of sufficient statistics with regard to this model choice, we introduce summary statistics based on the connected components of the dependency graph of each model in competition. We assess their efficiency using a novel conditional misclassification rate that evaluates their local power to discriminate between models. We show that the number of required simulations can be reduced substantially while improving the quality of decision, and we use this local error rate to build an ABC procedure that adapts the vector of summary statistics to the observed data. In a last part, in order to circumvent the computation of the intractable likelihood in the Bayesian Information Criterion (BIC), we extend mean-field approaches by replacing the likelihood with a product of distributions of random vectors, namely blocks of the lattice. On that basis, we derive BLIC (Block Likelihood Information Criterion), which answers model choice questions of a wider scope than ABC, such as the joint selection of the dependency structure and the number of latent states. We study the performance of BLIC in terms of image segmentation.
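The composite-likelihood substitution discussed in this abstract — replacing an intractable likelihood by a product of cheap conditional distributions — is easiest to see on an Ising field, where it reduces to a pseudo-likelihood built from full conditionals. The Python below is a toy sketch under that simplification (the 4-neighbour lattice and {-1, +1} states are illustrative assumptions, not the thesis code):

```python
import math

def neighbors(i, j, H, W):
    """4-neighbourhood of site (i, j) on an H x W lattice."""
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < H and 0 <= nj < W:
            yield ni, nj

def log_pseudo_likelihood(grid, beta):
    """Composite likelihood from full conditionals of an Ising field:
    sum_s log P(x_s | x_{N(s)}), with states in {-1, +1}.
    Each conditional is exp(beta * x_s * m) / (2 cosh(beta * m)),
    where m is the sum of the neighbouring spins, so the global
    normalizing constant never appears."""
    H, W = len(grid), len(grid[0])
    total = 0.0
    for i in range(H):
        for j in range(W):
            m = sum(grid[ni][nj] for ni, nj in neighbors(i, j, H, W))
            total += beta * grid[i][j] * m - math.log(2.0 * math.cosh(beta * m))
    return total
```

Maximising this function over `beta` gives a composite-likelihood estimate without ever touching the normalizing constant.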
15

Moment estimation methods for space-time cluster point processes

Kučera, Petr January 2019 (has links)
This thesis is concerned with the estimation of parametric models for space-time shot-noise Cox processes. We introduce a two-step estimation method in which the second step uses either composite likelihood or Palm likelihood. For the two-step method based on Palm likelihood we prove consistency and an asymptotic normality theorem. Finally, we compare composite likelihood with Palm likelihood in simulation studies, adding the minimum contrast method for comparison; results for the minimum contrast method are taken from the literature.
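The minimum contrast method used for comparison in this abstract admits a compact sketch. The toy Python below fits a Thomas cluster process by grid search on its closed-form K-function; the naive empirical K (no edge correction) and the parameter grids are illustrative assumptions, not the thesis implementation:

```python
import math

def thomas_K(r, kappa, sigma):
    """Theoretical Ripley K-function of a Thomas cluster process:
    K(r) = pi r^2 + (1 - exp(-r^2 / (4 sigma^2))) / kappa."""
    return math.pi * r * r + (1.0 - math.exp(-r * r / (4.0 * sigma * sigma))) / kappa

def empirical_K(points, radii, area):
    """Naive empirical K (no edge correction):
    K_hat(r) = area / (n (n - 1)) * #{ordered pairs with d_ij <= r}."""
    n = len(points)
    out = []
    for r in radii:
        count = sum(1 for i in range(n) for j in range(n)
                    if i != j and math.dist(points[i], points[j]) <= r)
        out.append(area * count / (n * (n - 1)))
    return out

def min_contrast(K_hat, radii, kappa_grid, sigma_grid):
    """First step of a two-step fit: grid search over (kappa, sigma)
    minimising the squared contrast sum_r (K_hat(r) - K_theta(r))^2."""
    best, best_par = math.inf, None
    for kappa in kappa_grid:
        for sigma in sigma_grid:
            c = sum((kh - thomas_K(r, kappa, sigma)) ** 2
                    for kh, r in zip(K_hat, radii))
            if c < best:
                best, best_par = c, (kappa, sigma)
    return best_par
```

In a genuine two-step procedure the intensity is estimated first and the interaction parameters second; here the grid search stands in for the numerical optimisation.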
16

On statistical inference for spatial and spatio-temporal extreme processes

Abu-Awwad, Abdul-Fattah 20 June 2019 (has links)
Natural hazards such as heat waves, extreme wind speeds, and heavy rainfall arise from physical processes and are spatial or spatio-temporal in extent. The development of models and inference methods for these processes is a very active area of research. This thesis deals with statistical inference for extreme and rare events in both spatial and spatio-temporal settings. Specifically, our contributions concern two classes of stochastic processes: spatial max-mixture processes and space-time max-stable processes. The proposed methodologies are illustrated by applications to rainfall data collected from the East of Australia and from a region of the State of Florida, USA. In the spatial part, we consider hypothesis testing for the mixture parameter a of a spatial max-mixture model using two classical statistics: the Z-test statistic Za and the pairwise likelihood ratio statistic LRa. We compare their performance through an extensive simulation study, employing the pairwise likelihood for estimation. Overall, the performance of the two statistics is satisfactory. Nevertheless, hypothesis testing presents some difficulties when a lies on the boundary of the parameter space, i.e., a ∈ {0,1}, due to the presence of additional nuisance parameters which are not identified under the null hypothesis. We apply this testing framework in an analysis of exceedances over a large threshold of daily rainfall data from the East of Australia. We also propose a novel estimation procedure for fitting spatial max-mixture processes with unknown extremal dependence class. The novelty of this procedure is that it allows inference without specifying the distribution family prior to fitting, thus letting the data speak for themselves and guide the estimation. In particular, the procedure uses a nonlinear least squares fit based on a closed-form expression of the so-called Fλ-madogram of max-mixture models, which contains the parameters of interest. We establish the consistency of the estimator of the mixing parameter a; an indication of asymptotic normality is given numerically. A simulation study shows that the proposed procedure improves on empirical coefficients for the class of max-mixture models. In an analysis of monthly maxima of Australian daily rainfall data, we implement the proposed estimation procedure for diagnostic and confirmatory purposes. In the spatio-temporal part, based on a closed-form expression of the spatio-temporal F-madogram, we propose a semi-parametric estimation methodology for space-time max-stable processes. This part provides a bridge between geostatistics and extreme value theory. In particular, for observations on a regular grid, the spatio-temporal F-madogram is estimated nonparametrically by its empirical version, and a moment-based procedure is applied to obtain parameter estimates. The performance of the method is investigated through an extensive simulation study. Afterwards, we apply the method to quantify the extremal behavior of daily radar rainfall maxima from a region of the State of Florida. This approach can serve as an alternative or a prerequisite to pairwise likelihood estimation: the semi-parametric estimates can be used as starting values for the optimization algorithm that maximizes the pairwise log-likelihood, reducing the computational burden and improving statistical efficiency.
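The F-madogram central to this abstract has a simple empirical version. The toy Python below (a rank-based estimator; the tie handling and the 1/(T+1) plotting-position constant are assumptions, not the thesis code) estimates the pairwise F-madogram from replicated maxima at two sites and maps it, moment-style, to the pairwise extremal coefficient:

```python
def f_madogram(x, y):
    """Empirical F-madogram between two sites from T replicated maxima:
    nu = 0.5 * mean |F_x(x_t) - F_y(y_t)|, with F estimated by ranks
    (ties are ignored in this sketch)."""
    T = len(x)
    rank = lambda v: [sorted(v).index(vi) + 1 for vi in v]
    Fx = [r / (T + 1) for r in rank(x)]
    Fy = [r / (T + 1) for r in rank(y)]
    return 0.5 * sum(abs(a - b) for a, b in zip(Fx, Fy)) / T

def extremal_coefficient(nu):
    """Moment-based mapping from the F-madogram to the pairwise extremal
    coefficient: theta = (1 + 2 nu) / (1 - 2 nu), which lies in [1, 2]
    for a max-stable pair (1 = complete dependence, 2 = independence)."""
    return (1.0 + 2.0 * nu) / (1.0 - 2.0 * nu)
```

For the spatio-temporal case the same estimator is applied to pairs of space-time locations on the observation grid.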
17

Sequential Monte Carlo methods and likelihoods for inference of context-dependent evolutionary models

Huet, Alexis 27 June 2014 (has links)
This thesis is devoted to the inference of context-dependent evolutionary models of DNA sequences, focusing specifically on the RN95+YpR class of stochastic models. This class is based on the reinforcement of certain substitution rates depending on the local context, which introduces dependence between sites in the evolution of the DNA sequence. Because of these dependencies, direct computation of the likelihood of the observed sequences involves high-dimensional matrices and is usually infeasible. Through encodings specific to the RN95+YpR class, we highlight new spatial dependence structures for these models, related to the evolution of DNA sequences throughout their evolutionary history. This makes it possible to use particle filter algorithms, developed in the context of hidden Markov models, to obtain consistent approximations of the likelihood. Another type of likelihood approximation, based on composite likelihoods, is also introduced. These approximation methods are implemented in a C++ program. They are applied to simulated data to empirically investigate some of their properties, and to genomic data, in particular for the comparison of evolutionary models.
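The particle-filter likelihood approximation mentioned in this abstract can be illustrated on a toy hidden Markov model, where the estimate can be checked against the exact forward algorithm. All model numbers below are illustrative assumptions; the thesis applies the same idea to much larger context-dependent state spaces:

```python
import random
from bisect import bisect

# Toy 2-state HMM (all probabilities are illustrative assumptions)
P  = [[0.9, 0.1], [0.2, 0.8]]   # transition matrix P[from][to]
B  = [[0.7, 0.3], [0.1, 0.9]]   # emission probabilities B[state][symbol]
pi = [0.5, 0.5]                 # initial state distribution

def exact_likelihood(obs):
    """Forward algorithm: exact marginal likelihood of the observations."""
    alpha = [pi[s] * B[s][obs[0]] for s in (0, 1)]
    for y in obs[1:]:
        alpha = [B[s][y] * sum(alpha[r] * P[r][s] for r in (0, 1))
                 for s in (0, 1)]
    return sum(alpha)

def particle_likelihood(obs, n=20000, seed=1):
    """Bootstrap particle filter: propagate particles through the transition
    kernel, weight by the emission probability, resample. The running
    product of average weights is a consistent likelihood estimate."""
    rng = random.Random(seed)
    parts = [0 if rng.random() < pi[0] else 1 for _ in range(n)]
    lik = 1.0
    for t, y in enumerate(obs):
        if t > 0:  # move each particle one step through the Markov chain
            parts = [0 if rng.random() < P[s][0] else 1 for s in parts]
        w = [B[s][y] for s in parts]
        lik *= sum(w) / n
        # multinomial resampling proportional to the weights
        tot, acc, cdf = sum(w), 0.0, []
        for wi in w:
            acc += wi / tot
            cdf.append(acc)
        parts = [parts[min(bisect(cdf, rng.random()), n - 1)]
                 for _ in range(n)]
    return lik
```

On long sequences the product of average weights is accumulated in log space to avoid underflow; that refinement is omitted here for brevity.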
18

Composite Likelihood Estimation for Latent Variable Models with Ordinal and Continuous, or Ranking Variables

Katsikatsou, Myrsini January 2013 (has links)
The estimation of latent variable models with ordinal and continuous, or ranking, variables is the research focus of this thesis. The existing estimation methods are discussed and a composite likelihood approach is developed. The main advantages of the new method are its low computational complexity, which remains unchanged regardless of the model size, and that it yields an asymptotically unbiased, consistent, and normally distributed estimator. The thesis consists of four papers. The first investigates the two main formulations of the unrestricted Thurstonian model for ranking data along with the corresponding identification constraints. It is found that the extra identification constraints required in one of them lead to unreliable estimates unless the constraints coincide with the true values of the fixed parameters. In the second paper, pairwise likelihood (PL) estimation is developed for factor analysis models with ordinal variables. The performance of PL is studied in terms of bias and mean squared error (MSE) and compared with that of the conventional estimation methods via a simulation study and through some real data examples. It is found that the PL estimates and standard errors have very small bias and MSE, both decreasing with the sample size, and that the method is competitive with the conventional ones. The results of the first two papers lead to the third, where PL estimation is adapted to the unrestricted Thurstonian ranking model. As before, the performance of the proposed approach is studied through a simulation study with respect to relative bias and relative MSE, in comparison with the conventional estimation methods. The conclusions are similar to those of the second paper. The last paper extends PL estimation to the whole structural equation modeling framework, where data may include both ordinal and continuous variables as well as covariates. The approach is demonstrated through an example run in the R software. The code used has been incorporated in the R package lavaan (version 0.5-11).
19

Statistical Modeling for Credit Ratings

Vana, Laura 01 August 2018 (has links) (PDF)
This thesis deals with the development, implementation and application of statistical modeling techniques which can be employed in the analysis of credit ratings. Credit ratings are one of the most widely used measures of credit risk and are relevant for a wide array of financial market participants, from investors, as part of their investment decision process, to regulators and legislators as a means of measuring and limiting risk. The majority of credit ratings are produced by the "Big Three" credit rating agencies Standard & Poor's, Moody's and Fitch. Especially in the light of the 2007-2009 financial crisis, these rating agencies have been strongly criticized for failing to assess risk accurately and for the lack of transparency in their rating methodology. However, they continue to maintain a powerful role as financial market participants and have a huge impact on the cost of funding. These points of criticism call for the development of modeling techniques that can 1) facilitate an understanding of the factors that drive the rating agencies' evaluations, and 2) generate insights into the rating patterns that these agencies exhibit. This dissertation consists of three research articles. The first focuses on variable selection and assessment of variable importance in accounting-based models of credit risk. The credit risk measure employed in the study is derived from credit ratings assigned by the rating agencies Standard & Poor's and Moody's. To deal with the lack of theoretical foundation specific to this type of model, state-of-the-art statistical methods are employed. Different models are compared based on a predictive criterion, and model uncertainty is accounted for in a Bayesian setting. Parsimonious models are identified after applying the proposed techniques. The second paper proposes the class of multivariate ordinal regression models for the modeling of credit ratings. The model class is motivated by the fact that correlated ordinal data arise naturally in the context of credit ratings. From a methodological point of view, we extend existing model specifications in several directions by allowing, among others, for a flexible covariate-dependent correlation structure between the continuous variables underlying the ordinal credit ratings. The estimation of the proposed models is performed using composite likelihood methods. Insights into the heterogeneity among the "Big Three" are gained when applying this model class to the multiple credit ratings dataset. A comprehensive simulation study on the performance of the estimators is provided. The third research paper deals with the implementation and application of the model class introduced in the second article. In order to make the class of multivariate ordinal regression models more accessible, the R package mvord and the complementary paper included in this dissertation have been developed. The mvord package is available on the Comprehensive R Archive Network (CRAN) for free download and enhances the available ready-to-use statistical software for the analysis of correlated ordinal data. In the creation of the package a strong emphasis has been put on a user-friendly and flexible design, which allows end users to easily estimate sophisticated models from the implemented model class. The package appeals to practitioners and researchers who deal with correlated ordinal data in various areas of application, ranging from credit risk to medicine or psychology.
