Computational detection of tissue-specific cis-regulatory modules

A cis-regulatory module (CRM) is a DNA region of a few hundred base pairs that consists of a cluster of several transcription factor binding sites and regulates the expression of a nearby gene. This thesis presents a new computational approach to CRM detection.

It is believed that tissue-specific CRMs tend to regulate nearby genes in a certain tissue and that they consist of binding sites for transcription factors (TFs) that are also expressed in that tissue. These properties allow us to use tissue-specific gene expression data to detect tissue-specific CRMs and improve the specificity of module prediction.

We build a Bayesian network to integrate the sequence information about TF binding sites with the expression information about TFs and regulated genes. The network is then used to infer whether a given genomic region has regulatory activity in a given tissue. A novel EM algorithm incorporating probability tree learning is proposed to train the Bayesian network in an unsupervised way. A new probability tree learning algorithm is developed to learn the conditional probability distribution for a variable in the network that has a large number of hidden variables as its parents.

Our approach is evaluated using biological data, and the results show that it correctly discriminates among human liver-specific modules, erythroid-specific modules, and negative-control regions, even though our algorithm uses no prior knowledge about the TFs and the target genes. On a genome-wide scale, our network is trained to identify tissue-specific CRMs in ten tissues. Some known tissue-specific modules are rediscovered, and a set of novel modules is predicted to be related to tissue-specific expression.

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:QMM.97927
Date: January 2006
Creators: Chen, Xiaoyu, 1974-
Publisher: McGill University
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Electronic Thesis or Dissertation
Format: application/pdf
Coverage: Master of Science (School of Computer Science.)
Rights: © Xiaoyu Chen, 2006
Relation: alephsysno: 002484004, proquestno: AAIMR24638, Theses scanned by UMI/ProQuest.
