The demarcation of transcription factor binding sites through the analysis of DNase-seq data

The expression of eukaryotic genes is controlled by non-coding regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins (transcription factors). In multicellular organisms, these elements must be characterised in order to understand how a single genome is used to generate a multitude of cell types, and how aberrant regulation of transcription contributes to disease processes. This involves identifying the transcription factor binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase I followed by high-throughput sequencing, together with the subsequent analysis of regions protected from digestion (DNase-seq footprinting), allows transcription factor binding to be quantified genome-wide. However, the handful of methods for analysing DNase-seq data has not been extensively validated or benchmarked. This thesis describes a novel footprinting algorithm, Wellington, which is presented in the context of a comprehensive comparison of several other DNase-seq footprinting algorithms on a multitude of datasets. Wellington outperforms other methods in almost all situations. An open-source software package, pyDNase, which facilitates interaction with DNase-seq data and provides many tools for DNase-seq analysis, is also presented. Wellington is used to perform footprinting on clinical samples to validate cell lines as a model system, and to identify the binding partners of the RUNX1/ETO fusion protein in t(8;21) AML. By extending the Wellington method, differential footprinting is shown to link differences in transcription factor binding at promoters to changes in gene expression. Applying this methodology to a range of haematopoietic cell types illustrates the ability of differential footprinting to identify key regulators in the haematopoietic lineage.
These results represent advances in the methods available to analyse DNase-seq data (all of which have been released as free, open-source software) and demonstrate the power of integrating DNase-seq footprinting with other functional genomic assays to study transcriptional regulation.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:665324
Date: January 2014
Creators: Piper, Jason
Publisher: University of Warwick
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://wrap.warwick.ac.uk/71314/
