
A Bayesian Local Causal Discovery Framework

This work introduces the Bayesian local causal discovery framework, a method for discovering unconfounded
causal relationships from observational data. It addresses the hypothesis that causal discovery using
local search methods will outperform causal discovery algorithms that employ global search in the
context of large datasets and limited computational resources.
Several Bayesian local causal discovery (BLCD) algorithms are described, and results are presented comparing them
with two well-known global causal discovery algorithms, PC and FCI, and with a global Bayesian network
learning algorithm, the optimal reinsertion (OR) algorithm, whose output was post-processed to
identify relationships that, under certain assumptions, are causal.
Methodologically, this research formalizes the task of causal discovery from observational data using a
Bayesian approach and local search. It specifically investigates the so-called Y structure in causal
discovery and classifies the various types of Y structures present in the data-generating networks. It
identifies and categorizes the Y structures in the Alarm, Hailfinder, Barley, Pathfinder, and Munin networks.
A proof of the convergence of the BLCD algorithm, based on the identification of Y structures, is also
provided. Principled methods of combining global and local causal discovery algorithms to improve upon
the performance of the individual algorithms are discussed. In particular, a post-processing method is
described for identifying plausible causal relationships from the output of global Bayesian network
learning algorithms, thereby extending them to be causal discovery algorithms.
In an experimental evaluation, simulated data from synthetic causal Bayesian networks representing five
different domains, as well as a real-world medical dataset, were used.
Causal discovery performance was measured using precision and recall.
Results were mixed: on some datasets the local methods outperformed the global methods and on others
they did not, both in precision and recall and in computation time.
When all the datasets were considered in aggregate, the local methods (BLCD and BLCDpk) had higher precision.
The general performance of the BLCD class of algorithms was comparable to that of the global search
algorithms, which suggests that the local search algorithms will perform well on very large datasets
where the global methods fail to scale up. The limitations of this research and directions for
future research are also discussed.
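
As a rough illustration of the two ideas the abstract relies on, the sketch below enumerates Y structures in a known directed acyclic graph and scores a set of claimed causal edges with precision and recall. It is not the BLCD algorithm, which scores candidate structures from data in a Bayesian manner. It assumes the usual definition of a Y structure (W1 -> X <- W2 with W1 and W2 non-adjacent, plus X -> Z with neither W1 nor W2 adjacent to Z) and assumes precision and recall are computed over directed edges; the abstract states neither. The function names, the toy graph, and the claimed edges are hypothetical.

```python
from itertools import combinations


def find_y_structures(edges):
    """Return all (w1, w2, x, z) tuples that form a Y structure in a DAG.

    `edges` is an iterable of (parent, child) pairs. Assumed definition:
    w1 -> x <- w2 with w1, w2 non-adjacent, and x -> z with z adjacent
    to neither w1 nor w2.
    """
    edges = set(edges)
    parents, children = {}, {}
    for p, c in edges:
        parents.setdefault(c, set()).add(p)
        children.setdefault(p, set()).add(c)

    def adjacent(a, b):
        return (a, b) in edges or (b, a) in edges

    found = set()
    for x, pa in parents.items():
        for w1, w2 in combinations(sorted(pa), 2):
            if adjacent(w1, w2):
                continue  # parents are adjacent, so this is not a V structure
            for z in children.get(x, ()):
                if not adjacent(w1, z) and not adjacent(w2, z):
                    found.add((w1, w2, x, z))
    return found


def precision_recall(predicted, actual):
    """Precision and recall of predicted directed edges against actual ones."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical generating network: A -> C <- B, C -> D, D -> E
true_edges = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
print(find_y_structures(true_edges))

# Hypothetical algorithm output: one correct edge and one reversed edge
claimed = [("C", "D"), ("E", "C")]
print(precision_recall(claimed, true_edges))
```

In a Y structure the arc from X to Z is the relationship of interest, which is why the example scores directed edges rather than whole graphs.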

Identifier: oai:union.ndltd.org:PITT/oai:PITTETD:etd-12082005-122145
Date: 30 March 2006
Creators: Mani, Subramani
Contributors: Gregory F. Cooper, Michael M. Wagner, Bruce G. Buchanan, Peter Spirtes
Publisher: University of Pittsburgh
Source Sets: University of Pittsburgh
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.library.pitt.edu/ETD/available/etd-12082005-122145/
Rights: unrestricted, I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to University of Pittsburgh or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.
