
Inferring the Binding Preferences of RNA-binding Proteins

Post-transcriptional regulation is carried out by RNA-binding proteins (RBPs) that bind to specific RNA molecules and control their processing, localization, stability and degradation. Experimental studies have successfully identified RNA targets associated with specific RBPs. However, because the locations of the binding sites within the targets are unknown and because RBPs recognize both sequence and structure elements in their binding sites, identification of RBP binding preferences from these data remains challenging.

The unifying theme of this thesis is the identification of RBP binding preferences from experimental data. First, we propose a protocol for designing a complex RNA pool that represents diverse sets of sequence and structure elements, to be used in an in vitro assay that efficiently measures RBP binding preferences. This design has been implemented in the RNAcompete method and applied genome-wide to human and Drosophila RBPs. We show that RNAcompete-derived motifs are consistent with established binding preferences.

We developed two computational models to learn binding preferences of RBPs from large-scale data. Our first model, RNAcontext, uses a novel representation of secondary structure to infer both sequence and structure preferences of RBPs, and is optimized for use with in vitro binding data on short RNA sequences. We show that including structure information significantly improves prediction accuracy. Our second model, MaLaRKey, extends RNAcontext to fit motif models to sequences of arbitrary length and to incorporate a richer set of structure features that better model in vivo RNA secondary structure. We demonstrate that MaLaRKey infers detailed binding models that accurately predict binding of full-length transcripts.

Identifier: oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/34077
Date: 17 December 2012
Creators: Hilal, Kazan
Contributors: Quaid, Morris
Source Sets: University of Toronto
Language: en_ca
Detected Language: English
Type: Thesis
