
ALGORITHMS FOR DISCOVERY OF MULTIPLE MARKOV BOUNDARIES: APPLICATION TO THE MOLECULAR SIGNATURE MULTIPLICITY PROBLEM

Algorithms for discovery of a Markov boundary from data constitute one of the most important recent developments in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight into local causal structure. Even though there is always a single Markov boundary of the response variable in faithful distributions, distributions that violate the intersection property may have multiple Markov boundaries. Such distributions are abundant in practical data-analytic applications, and there are several reasons why it is important to induce all Markov boundaries from such data. However, there are currently no practical algorithms that can provably accomplish this task. To this end, I propose a novel generative algorithm (termed TIE*) that can discover all Markov boundaries from data. The generative algorithm can be instantiated to discover Markov boundaries independent of the data distribution. I prove the correctness of the generative algorithm and provide several admissible instantiations. The new algorithm is then applied to identify the set of maximally predictive and non-redundant molecular signatures. TIE* identifies exactly the set of true signatures in simulated distributions and yields signatures with significantly better predictivity and reproducibility than prior algorithms in human microarray gene expression datasets. The results of this thesis also shed light on the causes of the molecular signature multiplicity phenomenon.

Identifier: oai:union.ndltd.org:VANDERBILT/oai:VANDERBILTETD:etd-12042008-121803
Date: 06 December 2008
Creators: Statnikov, Alexander Romanovich
Contributors: Constantin F. Aliferis, Gregory F. Cooper, Douglas P. Hardin, Daniel R. Masys, Ioannis Tsamardinos
Publisher: VANDERBILT
Source Sets: Vanderbilt University Theses
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.library.vanderbilt.edu//available/etd-12042008-121803/
Rights: unrestricted. I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to Vanderbilt University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.
