A cis-regulatory module (CRM) is a DNA region of a few hundred base pairs that consists of clustering of several transcription factor binding sites and regulates the expression of a nearby gene. This thesis presents a new computational approach to CRM detection. / It is believed that tissue-specific CRMs tend to regulate nearby genes in a certain tissue and that they consist of binding sites for transcription factors (TFs) that are also expressed in that tissue. These facts allow us to make use of tissue-specific gene expression data to detect tissue-specific CRMs and improve the specificity of module prediction. / We build a Bayesian network to integrate the sequence information about TF binding sites and the expression information about TFs and regulated genes. The network is then used to infer whether a given genomic region indeed has regulatory activity in a given tissue. A novel EM algorithm incorporating probability tree learning is proposed to train the Bayesian network in an unsupervised way. A new probability tree learning algorithm is developed to learn the conditional probability distribution for a variable in the network that has a large number of hidden variables as its parents. / Our approach is evaluated using biological data, and the results show that it is able to correctly discriminate among human liver-specific modules, erythroid-specific modules, and negative-control regions, even though no prior knowledge about the TFs and the target genes is employed in our algorithm. In a genome-wide scale, our network is trained to identify tissue-specific CRMs in ten tissues. Some known tissue-specific modules are rediscovered, and a set of novel modules are predicted to be related with tissue-specific expression.
Identifer | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:QMM.97927 |
Date | January 2006 |
Creators | Chen, Xiaoyu, 1974- |
Publisher | McGill University |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Format | application/pdf |
Coverage | Master of Science (School of Computer Science.) |
Rights | © Xiaoyu Chen, 2006 |
Relation | alephsysno: 002484004, proquestno: AAIMR24638, Theses scanned by UMI/ProQuest. |
Page generated in 0.017 seconds