1 |
Investigation of transcription factor binding at distal regulatory elements
Mitchelmore, Joanna (January 2018)
Cellular development and function necessitate precise patterns of gene expression. Control of gene expression is in part orchestrated by a class of remote regulatory elements, termed enhancers, which are brought into contact with promoters via DNA looping. Enhancers typically contain clusters of transcription factor binding sites, and TF recruitment to them is thought to play a key role in transcriptional control. In this thesis I have addressed two issues regarding gene regulation by enhancers. First, with recent genome-wide enhancer mapping, it is becoming increasingly apparent that genes are commonly regulated by multiple enhancers in the same cell type. How a gene’s regulatory information is encoded across multiple enhancers, however, is still not fully understood. Second, numerous recent studies have found that enhancers are enriched for expression-modulating and disease-associated genetic variants. However, understanding and predicting the effects of enhancer variants remains a major challenge. I focussed on a human lymphoblastoid cell line (LCL), GM12878, for which ChIP-Seq data are available for 52 different TFs from the ENCODE project. Significantly, Promoter Capture Hi-C data for the same LCL are available, making it possible to link enhancers to target genes globally. In the first part of the thesis, I investigated how gene regulatory information is encoded across enhancers. Specifically, I asked whether a gene tends to use multiple enhancers to bring the same or distinct regulatory information. I found that there was a general trend towards a “shadow” enhancer architecture, whereby similar combinations of TFs were recruited to multiple enhancers. However, numerous examples of “integrating” enhancers were observed, where the same gene showed large variation in TF binding across enhancers. Distinct groups of TFs were associated with these contrasting models of TF enhancer binding. 
To investigate the functional effects of variation at enhancers, I additionally took advantage of a panel of LCLs derived from 359 individuals, which have been genotyped by the 1000 Genomes Project, and for which RNA-Seq data are publicly available. I used TF binding models to computationally predict variants impacting TF binding, and tested the association of these variants with the expression of the target genes they contact based on Promoter Capture Hi-C. Compared to the standard eQTL calling approach, this offers increased sensitivity as only variants physically contacting the promoter and predicted to impact TF binding are tested. Using this approach, I discovered a set of predicted TF-binding affinity variants at distal regions that associate with gene expression. Interestingly, a large proportion of these binding variants fall at the promoters of other genes. This finding suggests that some promoters may be able to act in an enhancer-like manner via long-range interactions, consistent with very recent findings from alternative approaches.
|
2 |
Regulatory Elements and Gene Expression in Primates and Diverse Human Cell-types
Sheffield, Nathan (January 2013)
After finishing a human genome reference sequence in 2002, the genomics community has turned to the task of interpreting it. A primary focus is to identify and characterize not only protein-coding genes, but all functional elements in the genome. The effort has identified millions of regulatory elements across species and in hundreds of human cell-types. Nearly all identified regulatory elements are found in non-coding DNA, suggesting a function for previously unannotated sequence. The ability to identify regulatory DNA genome-wide provides a new opportunity to understand gene regulation and to ask fundamental questions in diverse areas of biology.

One such area is the aim to understand the molecular basis for phenotypic differences between humans and other primates. These phenotypic differences are partially driven by mutations in non-coding regulatory DNA that alter gene expression. This hypothesis has been supported by differential gene expression analyses in general, but we have not yet identified specific regulatory variants responsible for differences in transcription and phenotype. I have worked to identify regulatory differences in the same cell-type isolated from human, chimpanzee, and macaque. Most regulatory elements were conserved among all three species, as expected based on their central role in regulating transcription. However, several hundred regulatory elements were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific regulatory elements are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within this regulatory DNA are linked with species-specific changes in chromatin accessibility. Together, these indicate that species-specific regulatory elements contribute to transcriptional and phenotypic differences among primate species.

Another fundamental function of regulatory elements is to define different cell-types in multicellular organisms. Regulatory elements recruit transcription factors that modulate gene expression distinctly across cell-types. In a study of 112 human cell-types, I classified regulatory elements into clusters based on regulatory signal tissue specificity. I then used these to uncover distinct associations between regulatory elements and promoters, CpG-islands, conserved elements, and transcription factor motif enrichment. Motif analysis identified known and novel transcription factor binding motifs in cell-type-specific and ubiquitous regulatory elements. I also developed a classifier that accurately predicts cell-type lineage based on only 43 regulatory elements and evaluated the tissue of origin for cancer cell-types. By correlating regulatory signal and gene expression, I predicted target genes for more than 500k regulatory elements. Finally, I introduced a web resource to enable researchers to explore these regulatory patterns and better understand how expression is modulated within and across human cell-types.

Regulation of gene expression is fundamental to life. This dissertation uses identified regulatory DNA to better understand regulatory systems. In the context of either evolutionary or developmental biology, understanding how differences in regulatory DNA contribute to phenotype will be central to completely understanding human biology. / Dissertation
|
3 |
Evolutionary patterns of group B Sox binding and function in Drosophila
Carl, Sarah Hamilton (January 2015)
Genome-wide binding and expression studies in Drosophila melanogaster have revealed widespread roles for Dichaete and SoxNeuro, two group B Sox proteins, during fly development. Although they have distinct target genes, these two transcription factors bind in very similar patterns across the genome and can partially compensate for each other's loss, both phenotypically and at the level of DNA binding. However, the inherent noise in genome-wide binding studies, the high affinity of transcription factors for DNA and the potential for non-specific binding make it difficult to identify true functional binding events. Additionally, external factors such as chromatin accessibility are known to play a role in determining binding patterns in Drosophila. A comparative approach to transcription factor binding facilitates the use of evolutionary conservation to identify functional features of binding patterns. In order to discover highly conserved features of group B Sox binding, I performed DamID-seq for SoxNeuro and Dichaete in four species of Drosophila: D. melanogaster, D. simulans, D. yakuba and D. pseudoobscura. I also performed FAIRE-seq in D. pseudoobscura embryos to compare the chromatin accessibility landscape between two fly species and to examine the relationship between open chromatin and group B Sox binding. I found that, although the sequences, expression patterns and overall transcriptional regulatory targets of Dichaete and SoxNeuro are highly conserved across the drosophilids, both binding site turnover and rates of quantitative binding divergence between species increase with phylogenetic distance. Elevated rates of binding conservation can be found at bound genomic intervals overlapping functional sites, including known enhancers, direct targets of Dichaete and SoxNeuro, and core binding intervals identified in previous genome-wide studies. 
Sox motifs identified in intervals that show binding conservation are also more highly conserved than those in intervals that are only bound in one species. Notably, regions that are bound in common by SoxNeuro and Dichaete are more likely to be conserved between species than those bound by one protein alone. However, by examining binding intervals that are uniquely bound by one protein and conserved, I was able to identify distinctive features of the targets of each transcription factor that point to unique aspects of their functions. My comparative analysis of group B Sox binding suggests that sites that are commonly bound by Dichaete and SoxNeuro, primarily at targets in the developing nervous system, are highly constrained by natural selection. Uniquely bound targets have different tissue expression profiles, leading me to propose a model whereby the unique functions of Dichaete and SoxNeuro may arise from a combination of differences in their own expression patterns and the broader nuclear environment, including tissue-specific cofactors and patterns of accessible chromatin. These results shed light on the evolutionary forces that have maintained conservation of the complex functional relationships between group B Sox proteins from insects to mammals.
|
4 |
Elucidating transcription factor regulation by TCDD within the hs1,2 enhancer
Ochs, Sharon D. (13 April 2012)
No description available.
|
5 |
Development of bioinformatics methods for the analysis of large collections of transcription factor binding motifs : positional motif enrichment and motif clustering / Développement de méthodes bioinformatiques pour l'analyse de collections massives de motifs de liaison pour des facteurs transcriptionnels : enrichissement local et clustering de motifs
Castro-Mondragon, Jaime (13 July 2017)
Transcription Factors (TFs) are DNA-binding proteins that control gene expression. TF binding motifs (TFBMs, simply called “motifs”) are usually represented as Position Specific Scoring Matrices (PSSMs), which can be visualized as sequence logos. Motif analysis is routinely used to discover candidate factors regulating a set of sequences of interest. The advent of high-throughput methods has allowed the detection of thousands of motifs, which are usually stored in databases. In this work I developed two novel methods and implemented software tools to handle large collections of motifs in order to extract interpretable information from high-throughput data: (i) matrix-clustering regroups motifs by similarity and offers a dynamic interface; (ii) position-scan detects TFBMs with positional preferences relative to a given reference location (e.g. ChIP-seq peaks, transcription start sites). The methods I developed have been evaluated on control cases and applied to extract meaningful information from different datasets from Drosophila melanogaster and Homo sapiens. The results show that these methods make it possible to analyse motifs in high-throughput datasets and can be integrated into motif analysis workflows.
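To make the idea of a PSSM and of a positional preference concrete, here is a minimal Python sketch. It is not the implementation of matrix-clustering or position-scan; the function names, the pseudocount of 1, the uniform background of 0.25 and the toy binding sites are all assumptions chosen for illustration. It builds a log-odds matrix from aligned sites, scores every window of a sequence, and reports where the best hit falls relative to a reference coordinate.

```python
import math

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PSSM from aligned, equal-length sites (uppercase A/C/G/T assumed)."""
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm

def scan(sequence, pssm):
    """Return (start, score) for every window of the sequence."""
    width = len(pssm)
    return [
        (start, sum(pssm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]

def best_hit_offset(sequence, pssm, reference):
    """Start of the best-scoring window, relative to a reference coordinate."""
    start, _score = max(scan(sequence, pssm), key=lambda hit: hit[1])
    return start - reference

# Hypothetical aligned sites and sequence, for illustration only.
toy_sites = ["TGACA", "TGACA", "TGACT", "TGATA"]
toy_pssm = build_pssm(toy_sites)
toy_offset = best_hit_offset("AAAATGACAAAAA", toy_pssm, reference=6)
```

Collecting such offsets over many sequences (for example, around peak summits) and testing whether they concentrate in particular bins is one plausible way to look for positional preferences; the actual statistics used by position-scan are not described in this abstract.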
|
6 |
A Combined Motif Discovery Method
Lu, Daming (06 August 2009)
A central problem in bioinformatics is finding the binding sites of regulatory motifs. It is a challenging problem and a natural platform for applying a variety of data mining methods. In the work described here, a combined motif discovery method that uses mutual information and Gibbs sampling was developed. A new scoring scheme was introduced that incorporates mutual information and joint information content. Simulated tempering was embedded into classic Gibbs sampling to avoid local optima. The method was applied to the 18 DNA sequences containing CRP binding sites validated by Stormo, and the results were compared with BioProspector. Based on the results, the new scoring scheme can overcome the limitation that the basic PWM model contains only single-position information. Simulated tempering proved to be an adaptive adjustment of the search strategy and showed much greater resistance to local optima.
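The contrast between single-position information and between-position dependence can be illustrated with a short Python sketch. This is not the thesis's scoring scheme, whose exact form is not given in the abstract; the function names and the uniform background of 0.25 are assumptions. The first function measures what a PWM column captures on its own, and the second measures the dependence between two columns, which a plain PWM ignores.

```python
import math
from collections import Counter

def column_information(column, background=0.25):
    """Information content (bits) of one alignment column against a uniform background."""
    n = len(column)
    return sum(
        (count / n) * math.log2((count / n) / background)
        for count in Counter(column).values()
    )

def mutual_information(col_a, col_b):
    """Mutual information (bits) between two alignment columns of equal length."""
    n = len(col_a)
    freq_a, freq_b = Counter(col_a), Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    return sum(
        (count / n) * math.log2((count / n) / ((freq_a[a] / n) * (freq_b[b] / n)))
        for (a, b), count in freq_ab.items()
    )

# Hypothetical columns: the first pair is perfectly coupled, the second independent.
coupled = mutual_information("AACC", "GGTT")
independent = mutual_information("AACC", "GTGT")
```

In the coupled pair, knowing the base in one column fully determines the other, so the mutual information is high even though each column alone looks only moderately informative; that is the kind of signal a score built on single-position frequencies cannot see.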
|
7 |
Eukaryotic transcriptional regulation : from data mining to transcriptional profiling
Morgan, Xochitl Chamorro (25 January 2011)
Survival of cells and organisms requires that each of thousands of genes is expressed at the correct time in development, in the correct tissue, and under the correct conditions. Transcription is the primary point of gene regulation. Genes are activated and repressed by transcription factors, which are proteins that become active through signaling, bind, sometimes cooperatively, to regulatory regions of DNA, and interact with other proteins such as chromatin remodelers. Yeast has nearly six thousand genes, several hundred of which are transcription factors; transcription factors comprise around 2000 of the 22,000 genes in the human genome. When and how these transcription factors are activated, as well as which subsets of genes they regulate, is a current, active area of research essential to understanding the transcriptional regulatory programs of organisms. We approached this problem in two divergent ways: first, an in silico study of human transcription factor combinations, and second, an experimental study of the transcriptional response of yeast mutants deficient in DNA repair. First, in order to better understand the combinatorial nature of transcription factor binding, we developed a data mining approach to assess whether transcription factors whose binding motifs were frequently proximal in the human genome were more likely to interact. We found many instances in the literature in which over-represented transcription factor pairs co-regulated the same gene, so we used co-citation to assess the utility of this method on a larger scale. We determined that over-represented pairs were more likely to be co-cited than would be expected by chance. 
Because proper repair of DNA is an essential and highly-conserved process in all eukaryotes, we next used cDNA microarrays to measure differentially expressed genes in eighteen yeast deletion strains with sensitivity to the DNA cross-linking agent methyl methane sulfonate (MMS); many of these mutants were transcription factors or DNA-binding proteins. Combining this data with tools such as chromatin immunoprecipitation, gene ontology analysis, expression profile similarity, and motif analysis allowed us to propose a model for the roles of Iki3 and of YML081W, a poorly-characterized gene, in DNA repair. / text
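One common way to ask whether two motifs are found together more often than chance would predict is a hypergeometric upper-tail test. The Python sketch below shows that calculation under stated assumptions: the abstract does not say which statistic was used, and the function name, parameter names and example counts are hypothetical.

```python
from math import comb

def cooccurrence_p_value(both, regions, with_a, with_b):
    """P(X >= both) under a hypergeometric null.

    regions: total regions examined
    with_a:  regions containing motif A
    with_b:  regions containing motif B
    both:    regions containing A and B together
    """
    upper = min(with_a, with_b)
    favourable = sum(
        comb(with_a, i) * comb(regions - with_a, with_b - i)
        for i in range(both, upper + 1)
    )
    return favourable / comb(regions, with_b)

# Hypothetical counts: 10 regions, 5 with motif A, 5 with motif B, all 5 shared.
p_all_shared = cooccurrence_p_value(5, 10, 5, 5)
```

A small p-value flags the pair as over-represented; with many motif pairs tested, a multiple-testing correction would also be needed before drawing conclusions.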
|
8 |
Temporal control of muscle gene expression in an ascidian embryo / ホヤ胚における筋肉で発現する遺伝子の時間的な調節
Yu, Deli (23 May 2019)
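Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Science / Kou No. 21946 / Science Doctorate No. 4524 / 新制||理||1650 (University Library) / Division of Biological Sciences, Graduate School of Science, Kyoto University / (Chief examiner) Associate Professor Yutaka Satou, Professor Yoshiko Takahashi, Professor Masato Nakatsukasa / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Science / Kyoto University / DGAM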
|
9 |
Enhancer Binding Site Architecture Regulates Cell-specific Notch Signal Strength and Transcription
Kuang, Yi (15 October 2020)
No description available.
|
10 |
Comparative promoter region analysis powered by CORG
Dieterich, Christoph, Grossmann, Steffen, Tanzer, Andrea, Röpcke, Stefan, Arndt, Peter F., Stadler, Peter F., Vingron, Martin (11 December 2018)
Background
Promoters are key players in gene regulation. They receive signals from various sources (e.g. cell surface receptors) and control the level of transcription initiation, which largely determines gene expression. In vertebrates, transcription start sites and surrounding regulatory elements are often poorly defined. To support promoter analysis, we present CORG http://corg.molgen.mpg.de, a framework for studying upstream regions including untranslated exons (5' UTR).
Description
The automated annotation of promoter regions integrates information of two kinds. First, statistically significant cross-species conservation within upstream regions of orthologous genes is detected. Pairwise as well as multiple sequence comparisons are computed. Second, binding site descriptions (position-weight matrices) are employed to predict conserved regulatory elements with a novel approach. Assembled EST sequences and verified transcription start sites are incorporated to distinguish exonic from other sequences.
As of now, we have included 5 species in our analysis pipeline (human, mouse, rat, fugu and zebrafish). We characterized promoter regions of 16,127 groups of orthologous genes. All data are presented in an intuitive way via our web site. Users are free to export data for single genes or access larger data sets via our DAS server http://tomcat.molgen.mpg.de:8080/das. The benefits of our framework are illustrated in the context of phylogenetic profiling of transcription factor binding sites and detection of microRNAs close to transcription start sites of our gene set.
Conclusion
The CORG platform is a versatile tool to support analyses of gene regulation in vertebrate promoter regions. Applications for CORG cover a broad range from studying evolution of DNA binding sites and promoter constitution to the discovery of new regulatory sequence elements (e.g. microRNAs and binding sites).
|