Transcription factor binding dynamics and spatial co-localization in human genome

Transcription factor (TF) binding has been studied extensively in relation to binding site affinity and chromosome modifications; however, the relationship between genome spatial organisation and TF binding is not well studied. Using the recently available high-resolution Hi-C contact map of human GM12878 lymphoblastoid cells, we computationally investigated the genome-wide spatial co-localization of TF binding sites, both among sites of the same TF and between sites of different TFs. First, we observed a strong positive correlation between site occupancy and homotypic TF co-localization based on Hi-C contacts, consistent with our predictions from biophysical simulations of TF target search. This trend is more prominent in binding sites with weak binding sequences and within enhancers, suggesting that genome spatial organisation plays an essential role in determining binding site occupancy, especially for weak regulatory elements. Furthermore, when investigating spatial co-localization between different TFs, we discovered two distinct co-localization networks of TFs in lymphoblastoid cells, one of which is enriched in lymphocyte-specific pathways and distal enhancer binding. These two TF networks are strongly biased towards either the A1 or the A2 chromosome subcompartment, but are nonetheless preserved within each subcompartment, indicating a potential causal link between cell-type-specific TF binding and chromosome subcompartment segregation. We called 40 pairs of significantly co-localized TFs according to the genome-wide Hi-C contact map; these pairs are enriched in previously reported physical interactions, thus linking the TF spatial network to co-functioning.
In addition to the above main project, I also worked on a side project to find computationally efficient ways of scaling binding site strength across different TFs based on position weight matrices (PWMs). While common bioinformatics tools produce scores that reflect the binding strength between a specific TF and DNA, these scores are not directly comparable between different TFs. We provided two approaches for estimating a scaling parameter $\lambda$ for the PWM score of different TFs. The first approach takes a PWM and a background genomic sequence as input to estimate $\lambda$ for a specific TF; we applied it to show that the $\lambda$ distributions of different TF families correspond to their DNA binding properties. The second approach reliably converts $\lambda$ between different PWMs of the same TF, which allows direct comparison of PWMs that were generated by different methods.
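To make the role of a PWM scaling parameter concrete, the following is a minimal sketch and not the estimation procedure of the thesis. It assumes, as is common in statistical-thermodynamics models of TF binding, that $\lambda$ enters as a divisor of the PWM score inside an exponential weight, $\exp(\text{score}/\lambda)$; the abstract itself does not state this form. The function names, the toy count matrix, the uniform background and the example sequence are all hypothetical.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, background, pseudocount=1.0):
    """Convert per-position base counts into a log-odds PWM (natural log)."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log(((column[b] + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm

def score_sites(pwm, sequence):
    """Score every window of the sequence (forward strand only)."""
    width = len(pwm)
    return [
        sum(pwm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]

def relative_weights(scores, lam):
    """Assumed model: weight of a site is exp(score / lam), normalised to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    raw = [math.exp((s - top) / lam) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical toy data: a 3-bp motif resembling "ACT", uniform background.
counts = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]
background = {b: 0.25 for b in BASES}
sequence = "GGACTTACGT"

pwm = log_odds_pwm(counts, background)
scores = score_sites(pwm, sequence)
sharp = relative_weights(scores, lam=0.5)  # small lambda: strong sites dominate
flat = relative_weights(scores, lam=5.0)   # large lambda: weights even out

for start, (s, w1, w2) in enumerate(zip(scores, sharp, flat)):
    print(f"{start}\t{sequence[start:start + 3]}\t{s:.2f}\t{w1:.3f}\t{w2:.3f}")
```

Under this assumed form, the same raw PWM scores yield very different relative site strengths depending on $\lambda$, which illustrates why raw scores from two TFs, or from two PWMs of the same TF, cannot be compared until each has been put on a common scale.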

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:744314
Date: January 2017
Creators: Ma, Xiaoyan
Contributors: Adryan, Boris
Publisher: University of Cambridge
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: https://www.repository.cam.ac.uk/handle/1810/269532