1 |
Programming language semantics as a foundation for Bayesian inference
Szymczak, Marcin. January 2018
Bayesian modelling, in which our prior belief about the distribution on model parameters is updated by observed data, is a popular approach to statistical data analysis. However, writing specific inference algorithms for Bayesian models by hand is time-consuming and requires significant machine learning expertise. Probabilistic programming promises to make Bayesian modelling easier and more accessible by letting the user express a generative model as a short computer program (with random variables), leaving inference to the generic algorithm provided by the compiler of the given language. However, it is not easy to design a probabilistic programming language correctly and define the meaning of programs expressible in it. Moreover, the inference algorithms used by probabilistic programming systems usually lack formal correctness proofs and bugs have been found in some of them, which limits the confidence one can have in the results they return. In this work, we apply ideas from the areas of programming language theory and statistics to show that probabilistic programming can be a reliable tool for Bayesian inference.

The first part of this dissertation concerns the design, semantics and type system of a new, substantially enhanced version of the Tabular language. Tabular is a schema-based probabilistic language, which means that instead of writing a full program, the user only has to annotate the columns of a schema with expressions generating corresponding values. By adopting this paradigm, Tabular aims to be user-friendly, but this unusual design also makes it harder to define the syntax and semantics correctly and reason about the language. We define the syntax of a version of Tabular extended with user-defined functions and pseudo-deterministic queries, design a dependent type system for this language and endow it with a precise semantics. We also extend Tabular with a concise formula notation for hierarchical linear regressions, define the type system of this extended language and show how to reduce it to pure Tabular.

In the second part of this dissertation, we present the first correctness proof for a Metropolis-Hastings sampling algorithm for a higher-order probabilistic language. We define a measure-theoretic semantics of the language by means of an operationally-defined density function on program traces (sequences of random variables) and a map from traces to program outputs. We then show that the distribution of samples returned by our algorithm (a variant of “Trace MCMC” used by the Church language) matches the program semantics in the limit.
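As a rough illustration of the idea of a density on traces plus a map from traces to outputs, the following Python sketch runs Metropolis-Hastings on a toy first-order program. It is not the algorithm or the language from the dissertation: the program (`x ~ Uniform(0, 1)`, then three coin flips with bias `x` all observed as heads), the names `log_density`, `output` and `trace_mh`, and the choice of an independence proposal that redraws the whole trace from the prior are all assumptions made for this example.

```python
import math
import random


def log_density(trace):
    """Unnormalised log-density of a trace of the toy program.

    The trace holds one random draw x. The uniform prior contributes
    density 1 and the three observed heads contribute x**3.
    """
    (x,) = trace
    if not 0.0 < x < 1.0:
        return -math.inf
    return 3.0 * math.log(x)


def output(trace):
    """Map from a trace to the program output (here, the drawn bias)."""
    return trace[0]


def trace_mh(n_samples, seed=0):
    """Metropolis-Hastings over traces with proposals drawn from the prior."""
    rng = random.Random(seed)
    trace = (0.5,)  # arbitrary starting trace with non-zero density
    samples = []
    for _ in range(n_samples):
        proposal = (rng.random(),)  # redraw the whole trace from the prior
        # The proposal density is 1 everywhere on (0, 1), so the acceptance
        # ratio reduces to the ratio of trace densities.
        log_ratio = log_density(proposal) - log_density(trace)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            trace = proposal
        samples.append(output(trace))
    return samples
```

For this toy program the posterior over `x` is Beta(4, 1), with mean 0.8, so the average of a long run of `trace_mh` should settle near that value. A correctness proof of the kind described above makes this sort of agreement precise for a full higher-order language.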
|
2 |
Classificação supervisionada com programação probabilística (Supervised classification with probabilistic programming)
Lucena, Danilo Carlos Gouveia de. 10 February 2014
Previous issue date: 2014-02-10 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Probabilistic inference mechanisms lie at the intersection of three areas: statistics, programming languages and probability. They are used to build probabilistic models and to help handle uncertainty. Probabilistic programming languages support high-level descriptions of such models: they make model development easier by abstracting away the lower-level inference mechanisms, they encourage code reuse, and they help with the analysis of results. This study analyses the inference engines implemented by probabilistic programming languages and presents a case study: the implementation of a supervised text classifier using probabilistic programming.
|
3 |
Évaluation quantitative de séquences d’événements en sûreté de fonctionnement à l’aide de la théorie des langages probabilistes / Quantitative assessment of events sequences in dependability studies, based on probabilistic languages theory
Ionescu, Dorina-Romina. 21 November 2016
Dependability studies are generally based on the assumption that failure and repair events are independent, and on the analysis of cut sets, which describe the subsets of components that cause the system to fail. For dynamic systems, in which the order of occurrence of events directly affects the dysfunctional behaviour of the system, event sequences should be preferred, because they allow a more precise evaluation of dependability indicators than cut sets.

The first part of this work proposes a formal framework for determining the event sequences that describe the evolution of the system and for assessing them quantitatively, using the theory of probabilistic languages and the theory of Markov and semi-Markov processes. The quantitative assessment covers the probability of occurrence of each sequence and its criticality (cost and length). For sequences describing the evolution of complex systems with several operating or failure modes, a modular approach based on composition operators (choice and concatenation) is proposed: the probability of a global event sequence is computed from local evaluations, mode by mode, on a Markov or semi-Markov model of each mode. The contributions are applied to two case studies of increasing size and complexity.
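To make the notion of a sequence probability concrete, here is a minimal Python sketch. It assumes a plain discrete-time Markov chain rather than the probabilistic-language framework of the thesis, and it represents a sequence by the states it visits rather than by events. The three states, the transition probabilities, and the names `P`, `sequence_probability` and `concatenate` are hypothetical. The sketch shows that, under the Markov property, the probability of a concatenated sequence is the product of the probabilities of its parts.

```python
from fractions import Fraction

# Hypothetical transition probabilities; every name and number is illustrative.
P = {
    "ok": {"ok": Fraction(9, 10), "degraded": Fraction(1, 10)},
    "degraded": {"ok": Fraction(1, 2), "failed": Fraction(1, 2)},
    "failed": {"failed": Fraction(1)},
}


def sequence_probability(states):
    """Probability of visiting `states` in order, given the first state."""
    prob = Fraction(1)
    for src, dst in zip(states, states[1:]):
        # A transition missing from P is treated as impossible.
        prob *= P[src].get(dst, Fraction(0))
    return prob


def concatenate(first, second):
    """Join two sequences where the second starts in the state ending the first."""
    if first[-1] != second[0]:
        raise ValueError("sequences do not share a junction state")
    return first + second[1:]
```

For example, `sequence_probability(["ok", "degraded", "failed"])` multiplies 1/10 by 1/2 and returns `Fraction(1, 20)`, which equals the product of the probabilities of `["ok", "degraded"]` and `["degraded", "failed"]` evaluated separately.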
|