1 |
New Algorithms for Mining Network Datasets: Applications to Phenotype and Pathway ModelingJin, Ying 22 January 2010 (has links)
Biological network data is plentiful with practically every experimental methodology giving 'network views' into cellular function and behavior. Bioinformatic screens that yield network data include, for example, genome-wide deletion screens, protein-protein interaction assays, RNA interference experiments, and methods to probe metabolic pathways. Efficient and comprehensive computational approaches are required to model these screens and gain insight into the nature of biological networks. This thesis presents three new algorithms to model and mine network datasets. First, we present an algorithm that models genome-wide perturbation screens by deriving relations between phenotypes and subsequently using these relations in a local manner to derive genephenotype relationships. We show how this algorithm outperforms all previously described algorithms for gene-phenotype modeling. We also present theoretical insight into the convergence and accuracy properties of this approach. Second, we define a new data mining problem–constrained minimal separator mining—and propose algorithms as well as applications to modeling gene perturbation screens by viewing the perturbed genes as a graph separator. Both of these data mining applications are evaluated on network datasets from S. cerevisiae and C. elegans. Finally, we present an approach to model the relationship between metabolic pathways and operon structure in prokaryotic genomes. In this approach, we present a new pattern class—biclusters over domains with supplied partial orders—and present algorithms for systematically detecting such biclusters. Together, our data mining algorithms provide a comprehensive arsenal of techniques for modeling gene perturbation screens and metabolic pathways. / Ph. D.
|
Page generated in 0.116 seconds