1 |
Genome-wide analysis of transcription initiation and promoter architecture in eukaryotes
Raborn, R. Taylor 01 January 2012
The transcriptome represents the entirety of RNA molecules within a cell or tissue at a given time. Recent advances have facilitated the production of large-scale, global interrogations of transcriptomes, finding that genomes are extensively transcribed and contain diverse classes of RNAs (Dinger et al., 2009). Information generated by high-throughput analyses of mRNA transcription start sites (TSSs) such as CAGE (Cap Analysis of Gene Expression) indicates that eukaryotic genomes have complex landscapes of transcription initiation. The TSS is important for the annotation of cis-regulatory sequences, because it provides a link between the mRNA transcript and the promoter. The patterns of TSS distributions observed within mRNA 5' end profiling studies prevent straightforward annotation of putative promoters.
To address this challenge, we developed a method to identify, on a genome-wide basis, the putative promoter, which we define by TSS distributions and designate the transcription start region (TSR). We applied a clustering method to identify and annotate TSRs within the budding yeast Saccharomyces cerevisiae using a full-length cDNA dataset (Miura et al., 2006). To validate these TSR annotations, we performed an integrative genomic analysis using multiple datasets. Our method identified TSRs at positions consistent with bona fide promoters in S. cerevisiae. In addition, using 5' RACE, we find overall agreement between computationally defined TSRs and TSSs identified experimentally. From this analysis, we find that a significant proportion of genes exhibiting alternative promoter usage within sporulation are associated with respiration, suggesting that this is regulated on a condition-specific basis in budding yeast.
We further developed our TSS clustering method into a bioinformatics tool called TSRchitect, which identifies and annotates TSRs from large-scale TSS profiling information. TSRchitect is capable of handling both tag and sequence-based TSS information and efficiently computes TSRs from global TSS datasets on a desktop computer. We find support for TSRchitect's annotations in human from a CAGE experiment from the ENCODE (Encyclopedia of DNA Elements) project.
Finally, we use TSRchitect to identify TSRs from the transcriptomes of diverse eukaryotes. We investigated the conservation of TSRs among orthologous genes. We frequently identify multiple TSRs for a given gene, suggesting that alternative promoter usage is widespread. Overall, using TSS profiling data derived from separate tissues within mouse and human, we find that the positions of TSRs are relatively stable across tissues surveyed; however, a small fraction of genes exhibit tissue-specific differences in TSR use.
As transcriptome profiling information continues to be generated at a rapid pace, computational approaches are increasingly important. It is anticipated that the method and approach we describe within this dissertation will contribute to an improved understanding of gene regulation and promoter architecture in eukaryotes.
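The abstract does not state how TSSs are grouped into TSRs, so the following is only a minimal sketch under an assumed rule: TSSs on the same chromosome and strand are merged into one region whenever neighbouring positions lie within a fixed distance of each other. The function name `cluster_tss` and the `max_gap` threshold are hypothetical and are not taken from TSRchitect.

```python
from collections import defaultdict


def cluster_tss(tss, max_gap=20):
    """Group TSS observations into transcription start regions (TSRs).

    tss     : iterable of (chrom, strand, position) tuples, one per observed 5' end
    max_gap : assumed threshold; neighbouring TSSs at most this many bp apart
              are merged into the same TSR
    Returns a list of (chrom, strand, start, end, n_tags) tuples.
    """
    # Cluster each chromosome/strand combination independently.
    by_strand = defaultdict(list)
    for chrom, strand, pos in tss:
        by_strand[(chrom, strand)].append(pos)

    tsrs = []
    for (chrom, strand), positions in sorted(by_strand.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev <= max_gap:
                # Close enough to the previous TSS: extend the current TSR.
                prev = pos
                count += 1
            else:
                # Gap too large: close the current TSR and open a new one.
                tsrs.append((chrom, strand, start, prev, count))
                start = prev = pos
                count = 1
        tsrs.append((chrom, strand, start, prev, count))
    return tsrs
```

A gene with two or more TSRs in such an output would be a candidate for alternative promoter usage; the choice of `max_gap` strongly affects how many TSRs are reported and would need to be tuned per dataset.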
|
2 |
Comparative promoter region analysis powered by CORG
Dieterich, Christoph, Grossmann, Steffen, Tanzer, Andrea, Röpcke, Stefan, Arndt, Peter F., Stadler, Peter F., Vingron, Martin 11 December 2018
Background
Promoters are key players in gene regulation. They receive signals from various sources (e.g. cell surface receptors) and control the level of transcription initiation, which largely determines gene expression. In vertebrates, transcription start sites and surrounding regulatory elements are often poorly defined. To support promoter analysis, we present CORG http://corg.molgen.mpg.de, a framework for studying upstream regions including untranslated exons (5' UTR).
Description
The automated annotation of promoter regions integrates information of two kinds. First, statistically significant cross-species conservation within upstream regions of orthologous genes is detected. Pairwise as well as multiple sequence comparisons are computed. Second, binding site descriptions (position-weight matrices) are employed to predict conserved regulatory elements with a novel approach. Assembled EST sequences and verified transcription start sites are incorporated to distinguish exonic from other sequences.
As of now, we have included 5 species in our analysis pipeline (man, mouse, rat, fugu and zebrafish). We characterized promoter regions of 16,127 groups of orthologous genes. All data are presented in an intuitive way via our web site. Users are free to export data for single genes or access larger data sets via our DAS server http://tomcat.molgen.mpg.de:8080/das. The benefits of our framework are exemplarily shown in the context of phylogenetic profiling of transcription factor binding sites and detection of microRNAs close to transcription start sites of our gene set.
Conclusion
The CORG platform is a versatile tool to support analyses of gene regulation in vertebrate promoter regions. Applications for CORG cover a broad range from studying evolution of DNA binding sites and promoter constitution to the discovery of new regulatory sequence elements (e.g. microRNAs and binding sites).
|
3 |
Bioinformatic prediction of conserved promoters across multiple whole genomes of Chlamydia
Grech, Brian James January 2007
The genome sequencing projects have generated a wealth of genomic data and the analysis of this data has provided many interesting findings. However, genome wide analysis of bacteria for promoters has lagged behind, because it has been difficult to accurately predict promoters against the high level of background noise found in bacterial genomes. One approach to overcome this problem is to predict phylogenetically conserved promoters across multiple genomes of different bacteria, thus filtering out many of the false positives that are predicted by the current methods. However, there are no programmes capable of doing this. Therefore, the work presented in this thesis has developed a position weight matrix (PWM) based programme called Multiscan that predicts conserved promoters across multiple bacterial genomes. Since Chlamydia is one of the most sequenced bacterial genera and has a high level of conservation of genes and large-scale conservation of gene order between species, Multiscan was developed and tested on Chlamydia. When Multiscan analysed a genome wide dataset of equivalent non-coding regions (NCRs) upstream of genes, from Chlamydia trachomatis, Chlamydia pneumoniae and Chlamydia caviae for σ66 promoters that are phylogenetically conserved, Multiscan predicted 42 promoters. Since only one of the 42 promoters predicted by Multiscan had previously available biological data to confirm its prediction, an additional subset of 10 of the remaining 41 σ66 promoters were analysed in C. trachomatis by mapping the 5' end of the transcripts. The primer extension assay synthesised cDNA products of the correct length for seven of the 10 genes chosen. When the performance of Multiscan was compared to one of the accepted methods for genome wide prediction of promoters in bacteria, the "standard PWM method", Multiscan predicted 32 more promoters than the "standard PWM method" in Chlamydia.
Furthermore, the promoters predicted by Multiscan were up to three more mismatches from the Escherichia coli σ70 consensus sequence than the promoters predicted by the standard PWM method. Although Multiscan predicted 42 promoters that were well conserved across the three chlamydial species, the analysis was unable to identify the 14 known σ66 promoters in C. trachomatis. These promoters were missed (1) because they were dissimilar to the E. coli σ70 consensus sequence and/or (2) because the promoters were poorly conserved across the three chlamydial species. To address the second possibility, the 14 false negatives were analysed by another phylogenetic footprinting method. Fourteen sets of equivalent NCRs located upstream of the homologous genes from the three chlamydiae were aligned with the computer programme Clustal W and the alignment analysed "by eye" for evidence of phylogenetic footprints containing the 14 false negatives. The analysis identified that seven of the 14 false negatives were poorly conserved across the chlamydial species. Analysis of two of the seven promoters that could not be footprinted, the promoters of ltuA and ltuB, by mapping the transcriptional start sites in C. caviae, confirmed their poor conservation across C. trachomatis and C. caviae. This analysis showed that substantial differences exist in chlamydial σ66 promoters from equivalent NCRs upstream of genes. This study has developed a new computer programme for genome wide prediction of promoters that are phylogenetically conserved and has shown the value of this programme by identifying seven new well conserved promoters and seven candidate poorly conserved promoters in Chlamydia.
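The abstract does not describe Multiscan's scoring or filtering rules. As a rough sketch of the general idea of combining a PWM scan with a cross-species conservation filter, the code below scores every window of each equivalent NCR with a log-odds matrix and keeps a gene only if all of its NCRs contain a hit at or above a threshold. Everything here is an assumption made for illustration: the function names, the single-motif model (real σ66 promoters have separate -35 and -10 elements with a spacer), the uniform background, and the threshold value.

```python
import math


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log-odds position weight matrix."""
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def best_hit(pwm, seq):
    """Return (score, offset) of the best-scoring window, or None if none can be scored."""
    width = len(pwm)
    best = None
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows containing ambiguous bases
        score = sum(col[b] for col, b in zip(pwm, window))
        if best is None or score > best[0]:
            best = (score, i)
    return best


def conserved_hits(pwm, ncr_sets, threshold):
    """Keep genes whose equivalent NCRs all contain a hit scoring >= threshold.

    ncr_sets : {gene: {species: upstream non-coding sequence}}
    Returns {gene: {species: (score, offset)}} for the genes that pass.
    """
    kept = {}
    for gene, ncrs in ncr_sets.items():
        hits = {species: best_hit(pwm, seq.upper()) for species, seq in ncrs.items()}
        if all(h is not None and h[0] >= threshold for h in hits.values()):
            kept[gene] = hits
    return kept
```

Requiring a hit in every species is the strictest possible conservation filter; it reduces false positives at the cost of missing promoters that have diverged between species, which is the trade-off the abstract reports for the 14 known promoters.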
|
4 |
In silico investigation of Glossina morsitans promoters
Mwangi, Sarah Wambui January 2013
Philosophiae Doctor - PhD / Tsetse flies (Glossina spp) are the biological vectors for Trypanosomes, the causative agents of Human African Trypanosomiasis (HAT). HAT is a debilitating disease that continues to present a major public health problem and is a key factor limiting rural development in vast regions of tropical Africa. To augment vector control efforts, the International Glossina Genome Initiative (IGGI) was established in 2004 with the ultimate goal of generating a fully annotated whole genome sequence for Glossina morsitans. A working draft genome of Glossina morsitans was made available in 2011. In this thesis, transcriptional regulatory features in Glossina morsitans were analysed using the draft genome. A method for TSS identification in the newly sequenced Glossina morsitans genome was developed using TSS-seq tags sampled from two developmental stages of Glossina morsitans. High throughput next generation sequencing reads obtained from Glossina morsitans larvae and pupae were used to locate transcription start sites (TSS) in the Glossina morsitans genome. TSS-seq tag clusters, defined as a minimum number of reads at the 5’ predicted UTR or first coding exon, were used to define transcription start sites. A total of 3134 tag clusters were identified on the Glossina genome. Approximately 45.4% (1424) of the tag clusters mapped to the first coding exons or their proximal predicted 5’UTR regions and include 31 tag clusters that mapped to transposons. A total of 1101 (35.1%) tag clusters mapped outside the genic region and/or scaffolds without gene predictions and may correspond to previously un-annotated transcripts or noncoding RNA TSS. The core promoter regions were classified as narrow or broad based on the number of TSS positions within a TSS-seq cluster. The majority (95%) of the core promoters analysed in this study were of the broad type while only 5% were of the narrow type. Comparison of canonical core promoter motif occurrences between random and bona fide core promoters showed that, generally, the number of motifs in biologically functional genomic windows in the true dataset exceeded those in the random dataset (p <= 0.00164, 0.00135, 0.00185 for the narrow, broad with peak and broad without peak categories respectively). The frequency of motif co-occurrence in core promoters was found to be fundamentally different across various initiation patterns. Narrow core promoters recorded a higher frequency of the TATA-box and INR motifs and two-way motif co-occurrence showed that the TATA-box-INR pair is over-represented in the narrow category. Broad core promoters showed a higher frequency of the BREd and MTE motifs and two-way motif co-occurrence showed that the MTE-DPE pair is over-represented in broad core promoters. TATA-less promoters account for 77% of the core promoters in this analysis. TATA-less core promoters showed a higher frequency of the MTE and INR motifs in contrast to observations in Drosophila where the DPE motif has been reported to occur frequently in TATA-less promoters. These motif combinations suggest their equal importance to transcription in their corresponding promoter classes in Glossina morsitans.
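The abstract says core promoters were classified as narrow or broad from the number of TSS positions in a tag cluster, and it names three categories (narrow, broad with peak, broad without peak), but it gives no cut-offs. The sketch below shows one way such a classification could be written; the function name `classify_cluster`, the `narrow_max_positions` cut-off, and the `peak_fraction` rule for deciding whether a broad cluster has a dominant peak are all assumptions made for illustration.

```python
from collections import Counter


def classify_cluster(tag_positions, narrow_max_positions=4, peak_fraction=0.5):
    """Classify one TSS-seq tag cluster by its initiation pattern.

    tag_positions        : genomic 5' end position of every tag in the cluster
    narrow_max_positions : assumed cut-off; clusters with at most this many
                           distinct TSS positions are called narrow
    peak_fraction        : assumed cut-off; a broad cluster has a peak if one
                           position carries at least this fraction of its tags
    """
    counts = Counter(tag_positions)
    if not counts:
        raise ValueError("cluster contains no tags")

    # Narrow vs broad depends only on the number of distinct TSS positions.
    if len(counts) <= narrow_max_positions:
        return "narrow"

    # Among broad clusters, test whether a single position dominates.
    top = max(counts.values())
    if top / sum(counts.values()) >= peak_fraction:
        return "broad with peak"
    return "broad without peak"
```

With these example cut-offs, a cluster whose tags fall on two positions is narrow, while a cluster spread evenly over ten positions is broad without peak.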
|
5 |
RNA-Seq and proteomics based analysis of regulatory RNA features and gene expression in Bacillus licheniformis
Wiegand, Sandra 25 September 2013
No description available.
|