Predicting unknown and unobserved events is a common task in many domains. Mathematically, the uncertainties arising in such prediction tasks can be described by probabilistic predictive models. Ideally, the model estimates of these uncertainties allow us to distinguish between uncertain and trustworthy predictions. This distinction is particularly important in safety-critical applications such as medical image analysis and autonomous driving. For the probabilistic predictions to be meaningful and to allow this differentiation, they should be neither over- nor underconfident. Models that satisfy this property are called calibrated. In this thesis, we study how one can measure, estimate, and statistically reason about the calibration of probabilistic predictive models. In Paper I, we discuss existing approaches for evaluating calibration in multi-class classification. We point out potential pitfalls and suggest hypothesis tests for the statistical analysis of model calibration. In Paper II, we propose a framework of calibration measures for multi-class classification. It captures common existing measures and includes a new kernel calibration error based on matrix-valued kernels. For the kernel calibration error, consistent and unbiased estimators exist, and asymptotic hypothesis tests for calibration can be derived. Unfortunately, by construction, the framework is limited to prediction problems with finite discrete target spaces. In Paper III, we use a different approach to develop a more general framework of calibration errors that applies to any probabilistic predictive model and is not limited to classification. We show that it coincides with the framework presented in Paper II for multi-class classification. Based on scalar-valued kernels, we generalize the kernel calibration error, its estimators, and hypothesis tests to all probabilistic predictive models. For real-valued regression problems, we present empirical results.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-429418 |
Date | January 2020 |
Creators | Widmann, David |
Publisher | Uppsala universitet, Avdelningen för systemteknik, Uppsala universitet, Reglerteknik, Uppsala |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Licentiate thesis, comprehensive summary, info:eu-repo/semantics/masterThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | IT licentiate theses / Uppsala University, Department of Information Technology, 1404-5117 ; 2020-006 |