1 |
Causal Reasoning in Equivalence Classes. Amin Jaber (14227610), 07 December 2022
<p>Causality is central to scientific inquiry across many disciplines, including epidemiology, medicine, and economics. Researchers are usually interested not only in knowing how two events are correlated, but also in whether one causes the other and, if so, how. In general, scientific practice seeks not just a surface description of the observed data, but deeper explanations, such as predictions of the effects of interventions. The answer to such questions does not lie in the data alone; it requires a qualitative understanding of the underlying data-generating process, knowledge that is articulated in a causal diagram.</p>
<p>And yet, delineating the true, underlying causal diagram requires knowledge and assumptions that are usually unavailable in non-trivial, large-scale settings. Hence, this dissertation develops the theory and algorithms needed to realize a data-driven framework for causal inference. More specifically, this work provides fundamental treatments of the following research questions:</p>
<p><br></p>
<p><strong>Effect Identification under Markov Equivalence.</strong> One common task in many data science applications is to answer questions about the effect of new interventions, such as: 'what would happen to <em>Y</em> while observing <em>Z=z</em> if we force <em>X</em> to take the value <em>x</em>?'. Formally, this is known as <em>causal effect identification</em>, where the goal is to determine whether a post-interventional distribution is computable from the combination of an observational distribution and assumptions about the underlying domain represented by a causal diagram. In this dissertation, we take as the input of the task a less informative structure known as a partial ancestral graph (PAG), which represents a Markov equivalence class of causal diagrams and is learnable from observational data. We develop tools and algorithms for this relaxed setting and characterize the identifiable effects with necessary and sufficient conditions.</p>
<p><br></p>
<p><strong>Causal Discovery from Interventions.</strong> A causal diagram imposes constraints on the corresponding generated data; conditional independences are one such example. Given a mixture of observational and experimental data, the goal is to leverage the constraints imprinted in the data to infer the set of causal diagrams that are compatible with such constraints. In this work, we consider soft interventions, such that the mechanism of an intervened variable is modified without fully eliminating the effect of its direct causes, and investigate two settings where the targets of the interventions could be known or unknown to the data scientist. Accordingly, we introduce the first general graphical characterizations to test whether two causal diagrams are indistinguishable given the constraints in the available data. We also develop algorithms that, given a mixture of observational and interventional data, learn a representation of the equivalence class.</p>
|
2 |
Détection de ruptures multiples dans des séries temporelles multivariées : application à l'inférence de réseaux de dépendance / Multiple change-point detection in multivariate time series: application to the inference of dependency networks. Harlé, Flore, 21 June 2016
This thesis presents a method for the offline detection of multiple change-points in multivariate time series, and uses the results to estimate the dependency relationships between the variables of the system. The originality of the model, called the Bernoulli Detector, lies in combining local statistics from a robust test, which compares the ranks of the observations, with a global Bayesian framework. This nonparametric model does not require strong assumptions about the distribution of the data. It applies without modification to Gaussian data as well as to data corrupted by outliers. Control of the detection of a single change-point is proven, including for small samples. To handle multivariate time series, a term is introduced to model the dependencies between change-points, under the assumption that if two entities of the system are connected, events affecting one are observed instantaneously on the other with high probability. The model thus adapts to the data, and the segmentation accounts for changes shared by several signals as well as for isolated changes in a single signal. The method is compared with other state-of-the-art solutions, notably on real datasets of household electricity consumption and genomic measurements. These experiments highlight the value of the model for detecting change-points in independent, conditionally independent, or fully connected signals. Finally, the synchronization of change-points across the time series is exploited to estimate the relationships governing the entities of the system, using the Bayesian network formalism. By adapting the score function of a structure learning method, it is verified that the independence model of the system can be partly recovered from the information carried by the change-points estimated by the Bernoulli Detector.
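To make the idea of a rank-based local statistic concrete, here is a minimal sketch of a single change-point scan built on the Wilcoxon rank-sum statistic. It is not the Bernoulli Detector described in the abstract: it has no Bayesian layer, handles one univariate signal only, and the function names `average_ranks` and `rank_scan` are hypothetical. It only illustrates why a test on ranks is insensitive to outliers.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_scan(values):
    """Scan all splits and return (best_split, best_abs_z).

    A split k places the candidate change between values[k-1] and values[k].
    The score is the standardized rank sum of the first k observations,
    using the null mean and variance without a correction for ties.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    ranks = average_ranks(values)
    best_split, best_z = None, -1.0
    running = 0.0
    for k in range(1, n):
        running += ranks[k - 1]
        mean = k * (n + 1) / 2.0
        var = k * (n - k) * (n + 1) / 12.0
        z = abs(running - mean) / var ** 0.5
        if z > best_z:
            best_split, best_z = k, z
    return best_split, best_z


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 101, 102, 103, 104, 105]
    print(rank_scan(series))
    series[-1] = 1e9  # an extreme outlier leaves the ranks unchanged
    print(rank_scan(series))
```

Because only the ordering of the observations enters the statistic, replacing the last value with an extreme outlier leaves the selected split unchanged in this example.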
|