Towards a Complete Transcriptional Regulatory Code: Improved Motif Discovery Using Informative Priors

Transcriptional regulation is the primary mechanism employed by the cell to ensure coordinated expression of its numerous genes. A key component of this process is the binding of proteins called transcription factors (TFs) to corresponding regulatory sites on the DNA. Understanding where exactly these TFs bind, under what conditions they are active, and which genes they regulate is all part of deciphering the transcriptional regulatory code. An important step towards solving this problem is the identification of DNA binding specificities, represented as motifs, for all TFs. In spite of an explosion of TF binding data from high-throughput technologies, the problem of motif discovery remains unsolved, due to the short length and degeneracy of binding sites.

We introduce PRIORITY, a Gibbs sampling-based approach, which incorporates informative positional priors into a probabilistic framework, to find significant motifs from high-throughput TF binding data. We use different data sources to build our positional priors and apply them to yeast ChIP-chip data:

* TFs can be classified into several structural classes based on their DNA-binding domains. Using a Bayesian learning algorithm, we show that it is possible to predict the class of a TF with remarkable accuracy, using information solely from its DNA binding sites. We further incorporate these results in the form of informative priors into PRIORITY, which learns the structural class of the TF in addition to its motif.

* In the nucleus, DNA is present in the form of chromatin--wrapped around nucleosomes--with certain regions being more accessible to TFs than others. It has been shown that functional binding sites are generally located in nucleosome-free regions. We use nucleosome occupancy predictions to compute a novel positional prior that biases the search towards the more accessible regions, thereby enriching the motif signal.

* Functional elements are often conserved across related species. Most conventional methods that exploit this fact use alignments. However, multiple alignments cannot always capture relocation and reversed orientation of binding sites across species. We propose a new alignment-free technique that not only accounts for these transformations, but is much faster than conventional methods.

All our priors significantly outperform conventional methods, finding motifs matching literature for 52 TFs. We produce a genome-wide map of TF binding sites in yeast based on these and other novel motif predictions.

http://hdl.handle.net/10161/655

Computer Science

Identifier	oai:union.ndltd.org:DUKE/oai:dukespace.lib.duke.edu:10161/655
Date	24 April 2008
Creators	Narlikar, Leelavati
Contributors	Hartemink, Alexander J
Source Sets	Duke University
Language	en_US
Detected Language	English
Type	Dissertation
Format	14605039 bytes, application/pdf
